NCBI Sebastes umbrosus Annotation Release 100

The RefSeq genome records for Sebastes umbrosus were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Sebastes umbrosus Annotation Release 100.

Annotation release ID: 100
Date of Entrez queries for transcripts and proteins: Nov 10 2020
Date of submission of annotation to the public databases: Nov 19 2020
Software version: 8.5

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
fSebUmb1.pri	GCF_015220745.1	Vertebrate Genomes Project	11-03-2020	Reference	24 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	fSebUmb1.pri
Genes and pseudogenes	30,334
protein-coding	23,881
non-coding	5,812
transcribed pseudogenes	0
non-transcribed pseudogenes	345
genes with variants	10,721
immunoglobulin/T-cell receptor gene segments	296
other	0
mRNAs	50,695
fully-supported	49,602
with > 5% ab initio	444
partial	128
with filled gap(s)	0
known RefSeq (NM_)	0
model RefSeq (XM_)	50,695
non-coding RNAs	8,206
fully-supported	6,561
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	7,057
pseudo transcripts	0
fully-supported	0
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	0
CDSs	50,991
fully-supported	49,602
with > 5% ab initio	513
partial	134
with major correction(s)	665
known RefSeq (NP_)	0
model RefSeq (XP_)	50,695
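
The report does not say how the top-level totals are composed. The short Python sketch below checks one reading that is consistent with the figures in the table. Both groupings (which gene categories sum to the total, and which CDSs lack an XP_ accession) are assumptions made for illustration, not statements about how the pipeline counts features.

```python
# Counts copied from the Feature counts table for fSebUmb1.pri.
counts = {
    "genes_and_pseudogenes": 30_334,
    "protein_coding": 23_881,
    "non_coding": 5_812,
    "transcribed_pseudogenes": 0,
    "non_transcribed_pseudogenes": 345,
    "ig_tcr_gene_segments": 296,
    "other": 0,
    "cds_total": 50_991,
    "cds_model_refseq_xp": 50_695,
}

# Assumption: the gene total is the sum of these categories.
# "genes with variants" is left out because it is assumed to overlap them.
gene_categories = (
    "protein_coding",
    "non_coding",
    "transcribed_pseudogenes",
    "non_transcribed_pseudogenes",
    "ig_tcr_gene_segments",
    "other",
)
gene_total = sum(counts[name] for name in gene_categories)
print(gene_total, gene_total == counts["genes_and_pseudogenes"])

# Assumption: CDSs without an XP_ accession correspond to the
# immunoglobulin/T-cell receptor gene segments.
cds_without_xp = counts["cds_total"] - counts["cds_model_refseq_xp"]
print(cds_without_xp, cds_without_xp == counts["ig_tcr_gene_segments"])
```

Both comparisons hold for this release, which supports this reading but does not prove it.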

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	29,693	20,195	8,790	57	1,187,645
All transcripts	58,901	3,374	2,689	46	96,429
mRNA	50,695	3,745	2,985	249	96,429
misc_RNA	1,457	3,006	2,527	182	17,057
tRNA	1,149	74	73	71	97
lncRNA	5,104	850	594	46	9,308
snoRNA	240	116	101	57	335
snRNA	177	141	141	57	191
guide_RNA	8	235	281	130	359
rRNA	71	223	119	119	3,928
Single-exon transcripts	930	1,893	1,722	249	10,706
coding transcripts (NM_/XM_)	930	1,893	1,722	249	10,706
CDSs	50,695	2,349	1,653	96	95,136
Exons	306,423	290	140	1	17,313
in coding transcripts (NM_/XM_)	290,028	290	141	1	17,313
in non-coding transcripts (NR_/XR_)	27,396	242	132	2	8,077
Introns	272,256	2,366	588	30	1,183,407
in coding transcripts (NM_/XM_)	261,212	2,221	569	30	1,183,407
in non-coding transcripts (NR_/XR_)	21,942	4,088	901	30	1,010,743

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	2.02	1	1	50
Number of exons per transcript	13.54	10	1	252

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 23,881 coding genes, 21,967 had a protein with an alignment covering 50% or more of the query, and 10,173 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
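
As a minimal illustration of these definitions, the Python sketch below computes query and target coverage for one made-up alignment, then expresses the gene counts quoted above as percentages of all coding genes. The function name `coverage_pct` and the example lengths (450, 500, 600 residues) are hypothetical. Only the three gene counts come from this report.

```python
def coverage_pct(aligned_length: int, sequence_length: int) -> float:
    """Return the percentage of a sequence that is included in an alignment."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * aligned_length / sequence_length


# Hypothetical alignment: 450 residues of a 500-residue annotated protein
# (query) aligned to 450 residues of a 600-residue Swiss-Prot protein (target).
query_cov = coverage_pct(450, 500)
target_cov = coverage_pct(450, 600)
print(f"query coverage: {query_cov:.1f}%, target coverage: {target_cov:.1f}%")

# Gene counts quoted in this report.
coding_genes = 23_881
genes_query_cov_50 = 21_967  # alignment covers >= 50% of the query
genes_query_cov_95 = 10_173  # alignment covers >= 95% of the query

share_cov_50 = 100.0 * genes_query_cov_50 / coding_genes
share_cov_95 = 100.0 * genes_query_cov_95 / coding_genes
print(f">=50% query coverage: {share_cov_50:.2f}% of coding genes")
print(f">=95% query coverage: {share_cov_95:.2f}% of coding genes")
```

The same alignment can score high on query coverage and lower on target coverage when the annotated protein is shorter than its Swiss-Prot match, which is why the two measures are reported separately.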

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
fSebUmb1.pri	GCF_015220745.1	3.78%	35.27%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

No transcript evidence was used in this annotation.

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	4,609,267,543	86%	34%	366,418
SAMN02712569	NA	Ovary (Sebastes saxicola, female, SAMN02712569)	311,289	63%	48%	46,214
SAMN03757544	NA	white muscle (Sebastes caurinus, pooled male and female, SAMN03757544)	642,971,003	91%	36%	227,864
SAMN05292520	30602025	Juvenile/Adult, brain (Sebastes mystinus, SAMN05292520)	68,979,274	70%	14%	137,390
SAMN05292521	30602025	Juvenile/Adult, brain (Sebastes serranoides, SAMN05292521)	70,630,730	68%	13%	153,234
SAMN05292522	30602025	Juvenile/Adult, brain (Sebastes nebulosus, SAMN05292522)	76,641,289	75%	12%	179,204
SAMN05292523	30602025	Juvenile, brain (Sebastes carnatus, SAMN05292523)	87,552,852	71%	11%	173,387
SAMN05292524	30602025	Juvenile/Adult, brain (Sebastes maliger, SAMN05292524)	87,824,518	75%	14%	181,759
SAMN05893354	NA	liver (Sebastes schlegelii, male, SAMN05893354)	150,150,804	90%	37%	183,886
SAMN05893355	NA	liver (Sebastes schlegelii, male, SAMN05893355)	138,796,222	88%	34%	173,569
SAMN05893357	NA	liver (Sebastes schlegelii, male, SAMN05893357)	115,833,528	88%	36%	171,706
SAMN10591520	NA	liver (Sebastes schlegelii, 2, female, SAMN10591520)	41,767,780	89%	50%	172,900
SAMN10591521	NA	muscle (Sebastes schlegelii, 2, female, SAMN10591521)	44,174,760	93%	53%	164,461
SAMN10591522	NA	ovary (Sebastes schlegelii, 2, female, SAMN10591522)	45,743,788	78%	40%	191,967
SAMN10591523	NA	eye (Sebastes schlegelii, 2, female, SAMN10591523)	42,502,334	90%	38%	229,805
SAMN10591524	NA	skin,gill,heart,intestines,stomach (Sebastes schlegelii, 2, male, SAMN10591524)	57,885,110	91%	45%	222,633
SAMN10591525	NA	testis (Sebastes schlegelii, 2, male, SAMN10591525)	45,071,260	86%	35%	242,993
SAMN10591526	NA	brain (Sebastes schlegelii, 2, male, SAMN10591526)	41,580,230	88%	33%	238,691
SAMN10591527	NA	spleen (Sebastes schlegelii, 2, male, SAMN10591527)	43,790,754	88%	38%	208,436
SAMN10591528	NA	kidney (Sebastes schlegelii, 2, male, SAMN10591528)	47,384,780	89%	41%	225,252
SAMN11104267	NA	juvenile, brain (Sebastes mystinus, <1 year, SAMN11104267)	111,241,164	72%	32%	233,700
SAMN11104268	NA	juvenile, liver (Sebastes mystinus, <1 year, SAMN11104268)	132,514,310	76%	46%	172,213
SAMN11104269	NA	juvenile, white muscle (Sebastes mystinus, <1 year, SAMN11104269)	74,361,240	72%	57%	147,180
SAMN11104270	NA	juvenile, gill (Sebastes mystinus, <1 year, SAMN11104270)	116,987,704	72%	40%	226,079
SAMN11104271	NA	juvenile, gill (Sebastes mystinus, <1 year, SAMN11104271)	78,922,930	77%	47%	213,364
SAMN11104272	NA	juvenile, white muscle (Sebastes mystinus, <1 year, SAMN11104272)	94,151,766	87%	65%	160,516
SAMN11104273	NA	juvenile, white muscle (Sebastes mystinus, <1 year, SAMN11104273)	89,558,976	75%	57%	154,290
SAMN11104274	NA	juvenile, gill (Sebastes mystinus, <1 year, SAMN11104274)	99,083,794	68%	36%	210,898
SAMN11280741	31533071	T2_Mys7077_2wk_cross (Sebastes mystinus, SAMN11280741)	11,306,869	93%	17%	67,876
SAMN11280742	31533071	T2_Mys7076_2wk_cross (Sebastes mystinus, SAMN11280742)	17,348,220	92%	16%	80,421
SAMN11280743	31533071	T2_Mys7071_24h_cross (Sebastes mystinus, SAMN11280743)	13,435,403	93%	15%	71,395
SAMN11280744	31533071	T2_Mys7070_24h_cross (Sebastes mystinus, SAMN11280744)	14,951,744	93%	16%	66,524
SAMN11280745	31533071	T2_Mys7069_24h_cross (Sebastes mystinus, SAMN11280745)	15,196,808	93%	16%	75,835
SAMN11280746	31533071	T2_Mys7068_24h_cross (Sebastes mystinus, SAMN11280746)	15,144,745	92%	15%	98,712
SAMN11280747	31533071	T2_Mys7067_12h_cross (Sebastes mystinus, SAMN11280747)	11,388,832	94%	16%	54,058
SAMN11280748	31533071	T2_Mys7066_12h_cross (Sebastes mystinus, SAMN11280748)	19,693,491	93%	16%	84,308
SAMN11280749	31533071	T2_Mys7065_12h_cross (Sebastes mystinus, SAMN11280749)	35,981,427	92%	15%	99,681
SAMN11280750	31533071	T2_Mys7064_12h_cross (Sebastes mystinus, SAMN11280750)	32,267,458	91%	14%	97,670
SAMN11280751	31533071	T2_Mys60_2wk_DO (Sebastes mystinus, SAMN11280751)	15,694,820	93%	16%	72,263
SAMN11280752	31533071	T2_Mys59_2wk_DO (Sebastes mystinus, SAMN11280752)	15,865,426	93%	16%	70,628
SAMN11280753	31533071	T2_Mys58_2wk_DO (Sebastes mystinus, SAMN11280753)	17,692,314	91%	16%	74,440
SAMN11280754	31533071	T2_Mys57_2wk_DO (Sebastes mystinus, SAMN11280754)	15,519,207	93%	14%	63,499
SAMN11280755	31533071	T2_Mys54_2wk_pH (Sebastes mystinus, SAMN11280755)	13,809,824	93%	16%	65,071
SAMN11280756	31533071	T2_Mys53_2wk_pH (Sebastes mystinus, SAMN11280756)	12,931,448	92%	17%	44,468
SAMN11280757	31533071	T2_Mys52_2wk_control (Sebastes mystinus, SAMN11280757)	18,614,315	93%	16%	85,748
SAMN11280758	31533071	T2_Mys50_2wk_control (Sebastes mystinus, SAMN11280758)	16,357,180	92%	14%	77,309
SAMN11280759	31533071	T2_Mys49_2wk_control (Sebastes mystinus, SAMN11280759)	13,440,498	93%	15%	68,229
SAMN11280760	31533071	T2_Mys36_24h_DO (Sebastes mystinus, SAMN11280760)	14,832,363	93%	16%	65,450
SAMN11280761	31533071	T2_Mys51_2wk_control (Sebastes mystinus, SAMN11280761)	16,229,708	93%	16%	80,816
SAMN11280762	31533071	T2_Mys35_24h_DO (Sebastes mystinus, SAMN11280762)	20,367,295	91%	14%	106,363
SAMN11280763	31533071	T2_Mys34_24h_DO (Sebastes mystinus, SAMN11280763)	18,175,850	90%	16%	64,066
SAMN11280764	31533071	T2_Mys32_24h_pH (Sebastes mystinus, SAMN11280764)	18,587,147	93%	17%	72,938
SAMN11280765	31533071	T2_Mys31_24h_pH (Sebastes mystinus, SAMN11280765)	17,416,433	92%	15%	79,593
SAMN11280766	31533071	T2_Mys30_24h_pH (Sebastes mystinus, SAMN11280766)	15,747,852	93%	16%	62,644
SAMN11280767	31533071	T2_Mys28_24h_control (Sebastes mystinus, SAMN11280767)	10,297,933	92%	15%	67,712
SAMN11280768	31533071	T2_Mys27_24h_control (Sebastes mystinus, SAMN11280768)	14,670,899	92%	15%	79,594
SAMN11280769	31533071	T2_Mys26_24h_control (Sebastes mystinus, SAMN11280769)	15,798,612	92%	15%	76,293
SAMN11280770	31533071	T2_Mys25_24h_control (Sebastes mystinus, SAMN11280770)	24,802,340	92%	14%	69,796
SAMN11280771	31533071	T2_Mys24_12h_DO (Sebastes mystinus, SAMN11280771)	14,579,059	92%	16%	62,936
SAMN11280772	31533071	T2_Mys23_12h_DO (Sebastes mystinus, SAMN11280772)	14,378,646	93%	16%	64,154
SAMN11280773	31533071	T2_Mys22_12h_DO (Sebastes mystinus, SAMN11280773)	26,234,935	93%	16%	77,404
SAMN11280774	31533071	T2_Mys21_12h_DO (Sebastes mystinus, SAMN11280774)	17,634,252	93%	16%	69,583
SAMN11280775	31533071	T2_Mys20_12h_pH (Sebastes mystinus, SAMN11280775)	18,113,019	93%	16%	73,244
SAMN11280776	31533071	T2_Mys19_12h_pH (Sebastes mystinus, SAMN11280776)	38,169,202	93%	15%	98,012
SAMN11280777	31533071	T2_Mys18_12h_pH (Sebastes mystinus, SAMN11280777)	19,349,035	90%	15%	79,878
SAMN11280778	31533071	T2_Mys17_12h_pH (Sebastes mystinus, SAMN11280778)	19,340,677	91%	16%	73,614
SAMN11280779	31533071	T2_Mys16_12h_control (Sebastes mystinus, SAMN11280779)	13,449,926	93%	16%	65,950
SAMN11280780	31533071	T2_Mys15_12h_control (Sebastes mystinus, SAMN11280780)	13,783,923	88%	16%	58,461
SAMN11280781	31533071	T2_Mys14_12h_control (Sebastes mystinus, SAMN11280781)	17,630,260	92%	15%	78,889
SAMN11280782	31533071	T2_Mys13_12h_control (Sebastes mystinus, SAMN11280782)	12,727,920	93%	16%	66,860
SAMN11280786	31533071	T2_Mys7079_2wk_cross (Sebastes mystinus, SAMN11280786)	18,282,209	93%	16%	76,536
SAMN11280787	31533071	T2_Mys7078_2wk_cross (Sebastes mystinus, SAMN11280787)	18,820,794	93%	15%	77,477
SAMN12404993	NA	stage III, gonad (Sebastes schlegelii, 3 years, female, SAMN12404993)	53,232,180	93%	46%	218,335
SAMN12404994	NA	stage III, gonad (Sebastes schlegelii, 3 years, female, SAMN12404994)	59,699,926	92%	45%	220,499
SAMN12404995	NA	stage III, gonad (Sebastes schlegelii, 3 years, female, SAMN12404995)	61,291,190	93%	43%	219,911
SAMN12404996	NA	stage IV, gonad (Sebastes schlegelii, 3 years, female, SAMN12404996)	54,293,852	93%	44%	216,103
SAMN12404997	NA	stage IV, gonad (Sebastes schlegelii, 3 years, female, SAMN12404997)	54,847,752	92%	42%	227,448
SAMN12404998	NA	stage IV, gonad (Sebastes schlegelii, 3 years, female, SAMN12404998)	63,343,660	92%	44%	232,175
SAMN12404999	NA	stage V, gonad (Sebastes schlegelii, 3 years, female, SAMN12404999)	54,664,132	90%	43%	236,734
SAMN12405000	NA	stage V, gonad (Sebastes schlegelii, 3 years, female, SAMN12405000)	53,942,574	91%	44%	228,165
SAMN12405001	NA	stage V, gonad (Sebastes schlegelii, 3 years, female, SAMN12405001)	55,317,942	91%	44%	231,345
SAMN12405002	NA	stage III, gonad (Sebastes schlegelii, 3 years, male, SAMN12405002)	63,606,858	88%	40%	254,450
SAMN12405003	NA	stage III, gonad (Sebastes schlegelii, 3 years, male, SAMN12405003)	60,904,392	87%	38%	232,662
SAMN12405004	NA	stage III, gonad (Sebastes schlegelii, 3 years, male, SAMN12405004)	59,534,938	86%	37%	226,992
SAMN12405005	NA	stage IV, gonad (Sebastes schlegelii, 3 years, male, SAMN12405005)	65,114,924	89%	43%	255,628
SAMN12405006	NA	stage IV, gonad (Sebastes schlegelii, 3 years, male, SAMN12405006)	64,537,178	88%	44%	254,032
SAMN12405007	NA	stage IV, gonad (Sebastes schlegelii, 3 years, male, SAMN12405007)	58,296,284	87%	43%	252,385
SAMN12405008	NA	stage V, gonad (Sebastes schlegelii, 3 years, male, SAMN12405008)	49,179,320	87%	40%	233,237
SAMN12405009	NA	stage V, gonad (Sebastes schlegelii, 3 years, male, SAMN12405009)	55,299,334	86%	38%	243,794
SAMN12405010	NA	stage V, gonad (Sebastes schlegelii, 3 years, male, SAMN12405010)	42,513,384	87%	38%	239,685
SAMN14563485	NA	adult, skin (Sebastes pachycephalus, one year old, SAMN14563485)	78,454,856	90%	26%	138,902
SAMN14566496	NA	adult, skin (Sebastes pachycephalus, one year old, SAMN14566496)	78,718,360	89%	24%	135,373

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR1212396	SRX506629	SRP040777	SAMN02712569	311,289	63%	48%
SRR2048500	SRX1046597	SRP058987	SAMN03757544	56,101,358	91%	37%
SRR2048501	SRX1046599	SRP058987	SAMN03757544	30,559,613	91%	35%
SRR2048504	SRX1046600	SRP058987	SAMN03757544	35,610,876	90%	38%
SRR2048507	SRX1046605	SRP058987	SAMN03757544	27,398,181	90%	35%
SRR2048508	SRX1046606	SRP058987	SAMN03757544	25,722,882	91%	35%
SRR2048509	SRX1046607	SRP058987	SAMN03757544	61,957,506	90%	38%
SRR2048510	SRX1046608	SRP058987	SAMN03757544	52,316,128	90%	37%
SRR2048511	SRX1046609	SRP058987	SAMN03757544	65,305,406	89%	37%
SRR2048512	SRX1046610	SRP058987	SAMN03757544	30,117,729	91%	35%
SRR2048513	SRX1046611	SRP058987	SAMN03757544	26,023,889	91%	36%
SRR2048514	SRX1046612	SRP058987	SAMN03757544	26,548,801	90%	35%
SRR2048515	SRX1046613	SRP058987	SAMN03757544	27,090,280	92%	33%
SRR2048516	SRX1046614	SRP058987	SAMN03757544	58,227,442	91%	37%
SRR2048517	SRX1046615	SRP058987	SAMN03757544	61,893,046	91%	35%
SRR2048518	SRX1046616	SRP058987	SAMN03757544	58,097,866	90%	36%
SRR3996876	SRX1923021	SRP077939	SAMN05292520	68,979,274	70%	14%
SRR3996877	SRX1997747	SRP077939	SAMN05292521	70,630,730	68%	13%
SRR3996878	SRX1997748	SRP077939	SAMN05292522	76,641,289	75%	12%
SRR3986877	SRX1988984	SRP077939	SAMN05292523	87,552,852	71%	11%
SRR3931334	SRX1960799	SRP077939	SAMN05292524	87,824,518	75%	14%
SRR4409390	SRX2235692	SRP091355	SAMN05893354	150,150,804	90%	37%
SRR4409389	SRX2235704	SRP091355	SAMN05893355	138,796,222	88%	34%
SRR4409372	SRX2235660	SRP091355	SAMN05893357	115,833,528	88%	36%
SRR8316962	SRX5129674	SRP173183	SAMN10591520	41,767,780	89%	50%
SRR8316963	SRX5129673	SRP173183	SAMN10591521	44,174,760	93%	53%
SRR8316964	SRX5129672	SRP173183	SAMN10591522	45,743,788	78%	40%
SRR8316965	SRX5129671	SRP173183	SAMN10591523	42,502,334	90%	38%
SRR8316967	SRX5129669	SRP173183	SAMN10591524	57,885,110	91%	45%
SRR8316968	SRX5129668	SRP173183	SAMN10591525	45,071,260	86%	35%
SRR8316969	SRX5129667	SRP173183	SAMN10591526	41,580,230	88%	33%
SRR8316970	SRX5129666	SRP173183	SAMN10591527	43,790,754	88%	38%
SRR8316966	SRX5129670	SRP173183	SAMN10591528	47,384,780	89%	41%
SRR8717233	SRX5511130	SRP188256	SAMN11104267	111,241,164	72%	32%
SRR8717232	SRX5511131	SRP188256	SAMN11104268	132,514,310	76%	46%
SRR8717235	SRX5511128	SRP188256	SAMN11104269	74,361,240	72%	57%
SRR8717234	SRX5511129	SRP188256	SAMN11104270	116,987,704	72%	40%
SRR8717237	SRX5511126	SRP188256	SAMN11104271	78,922,930	77%	47%
SRR8717236	SRX5511127	SRP188256	SAMN11104272	94,151,766	87%	65%
SRR8717239	SRX5511124	SRP188256	SAMN11104273	89,558,976	75%	57%
SRR8717238	SRX5511125	SRP188256	SAMN11104274	99,083,794	68%	36%
SRR8799861	SRX5588669	SRP189774	SAMN11280741	11,306,869	93%	17%
SRR8799860	SRX5588668	SRP189774	SAMN11280742	17,348,220	92%	16%
SRR8799859	SRX5588667	SRP189774	SAMN11280743	13,435,403	93%	15%
SRR8799858	SRX5588666	SRP189774	SAMN11280744	14,951,744	93%	16%
SRR8799857	SRX5588665	SRP189774	SAMN11280745	15,196,808	93%	16%
SRR8799856	SRX5588664	SRP189774	SAMN11280746	15,144,745	92%	15%
SRR8799855	SRX5588663	SRP189774	SAMN11280747	11,388,832	94%	16%
SRR8799854	SRX5588662	SRP189774	SAMN11280748	19,693,491	93%	16%
SRR8799853	SRX5588661	SRP189774	SAMN11280749	35,981,427	92%	15%
SRR8799852	SRX5588660	SRP189774	SAMN11280750	32,267,458	91%	14%
SRR8799851	SRX5588659	SRP189774	SAMN11280751	15,694,820	93%	16%
SRR8799850	SRX5588657	SRP189774	SAMN11280752	15,865,426	93%	16%
SRR8799849	SRX5588656	SRP189774	SAMN11280753	17,692,314	91%	16%
SRR8799848	SRX5588655	SRP189774	SAMN11280754	15,519,207	93%	14%
SRR8799846	SRX5588654	SRP189774	SAMN11280755	13,809,824	93%	16%
SRR8799845	SRX5588653	SRP189774	SAMN11280756	12,931,448	92%	17%
SRR8799844	SRX5588652	SRP189774	SAMN11280757	18,614,315	93%	16%
SRR8799842	SRX5588650	SRP189774	SAMN11280758	16,357,180	92%	14%
SRR8799841	SRX5588649	SRP189774	SAMN11280759	13,440,498	93%	15%
SRR8799840	SRX5588648	SRP189774	SAMN11280760	14,832,363	93%	16%
SRR8799843	SRX5588651	SRP189774	SAMN11280761	16,229,708	93%	16%
SRR8799839	SRX5588647	SRP189774	SAMN11280762	20,367,295	91%	14%
SRR8799838	SRX5588646	SRP189774	SAMN11280763	18,175,850	90%	16%
SRR8799837	SRX5588645	SRP189774	SAMN11280764	18,587,147	93%	17%
SRR8799836	SRX5588644	SRP189774	SAMN11280765	17,416,433	92%	15%
SRR8799835	SRX5588643	SRP189774	SAMN11280766	15,747,852	93%	16%
SRR8799834	SRX5588641	SRP189774	SAMN11280767	10,297,933	92%	15%
SRR8799833	SRX5588640	SRP189774	SAMN11280768	14,670,899	92%	15%
SRR8799832	SRX5588639	SRP189774	SAMN11280769	15,798,612	92%	15%
SRR8799831	SRX5588638	SRP189774	SAMN11280770	24,802,340	92%	14%
SRR8799830	SRX5588637	SRP189774	SAMN11280771	14,579,059	92%	16%
SRR8799829	SRX5588636	SRP189774	SAMN11280772	14,378,646	93%	16%
SRR8799828	SRX5588635	SRP189774	SAMN11280773	26,234,935	93%	16%
SRR8799827	SRX5588634	SRP189774	SAMN11280774	17,634,252	93%	16%
SRR8799826	SRX5588633	SRP189774	SAMN11280775	18,113,019	93%	16%
SRR8799824	SRX5588632	SRP189774	SAMN11280776	38,169,202	93%	15%
SRR8799823	SRX5588631	SRP189774	SAMN11280777	19,349,035	90%	15%
SRR8799822	SRX5588630	SRP189774	SAMN11280778	19,340,677	91%	16%
SRR8799821	SRX5588629	SRP189774	SAMN11280779	13,449,926	93%	16%
SRR8799820	SRX5588628	SRP189774	SAMN11280780	13,783,923	88%	16%
SRR8799819	SRX5588626	SRP189774	SAMN11280781	17,630,260	92%	15%
SRR8799818	SRX5588625	SRP189774	SAMN11280782	12,727,920	93%	16%
SRR8799863	SRX5588671	SRP189774	SAMN11280786	18,282,209	93%	16%
SRR8799862	SRX5588670	SRP189774	SAMN11280787	18,820,794	93%	15%
SRR10176998	SRX6898384	SRP223139	SAMN12404993	53,232,180	93%	46%
SRR10176997	SRX6898385	SRP223139	SAMN12404994	59,699,926	92%	45%
SRR10176988	SRX6898394	SRP223139	SAMN12404995	61,291,190	93%	43%
SRR10176987	SRX6898395	SRP223139	SAMN12404996	54,293,852	93%	44%
SRR10176986	SRX6898396	SRP223139	SAMN12404997	54,847,752	92%	42%
SRR10176985	SRX6898397	SRP223139	SAMN12404998	63,343,660	92%	44%
SRR10176984	SRX6898398	SRP223139	SAMN12404999	54,664,132	90%	43%
SRR10176983	SRX6898399	SRP223139	SAMN12405000	53,942,574	91%	44%
SRR10176982	SRX6898400	SRP223139	SAMN12405001	55,317,942	91%	44%
SRR10176981	SRX6898401	SRP223139	SAMN12405002	63,606,858	88%	40%
SRR10176996	SRX6898386	SRP223139	SAMN12405003	60,904,392	87%	38%
SRR10176995	SRX6898387	SRP223139	SAMN12405004	59,534,938	86%	37%
SRR10176994	SRX6898388	SRP223139	SAMN12405005	65,114,924	89%	43%
SRR10176993	SRX6898389	SRP223139	SAMN12405006	64,537,178	88%	44%
SRR10176992	SRX6898390	SRP223139	SAMN12405007	58,296,284	87%	43%
SRR10176991	SRX6898391	SRP223139	SAMN12405008	49,179,320	87%	40%
SRR10176990	SRX6898392	SRP223139	SAMN12405009	55,299,334	86%	38%
SRR10176989	SRX6898393	SRP223139	SAMN12405010	42,513,384	87%	38%
SRR11514877	SRX8086488	SRP255789	SAMN14563485	78,454,856	90%	26%
SRR11514881	SRX8086492	SRP255789	SAMN14566496	78,718,360	89%	24%
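
The by-sample and by-run tables describe the same reads at two levels of grouping. The Python sketch below sums read counts per sample for a subset of rows copied from the by-run table. It assumes that a sample's read count is the plain sum of its runs, which holds for the rows shown here. The variable names are illustrative only.

```python
from collections import defaultdict

# (run, sample, number of reads) copied from the by-run table.
runs = [
    ("SRR1212396", "SAMN02712569", "311,289"),
    ("SRR2048500", "SAMN03757544", "56,101,358"),
    ("SRR2048501", "SAMN03757544", "30,559,613"),
    ("SRR2048504", "SAMN03757544", "35,610,876"),
    ("SRR2048507", "SAMN03757544", "27,398,181"),
    ("SRR2048508", "SAMN03757544", "25,722,882"),
    ("SRR2048509", "SAMN03757544", "61,957,506"),
    ("SRR2048510", "SAMN03757544", "52,316,128"),
    ("SRR2048511", "SAMN03757544", "65,305,406"),
    ("SRR2048512", "SAMN03757544", "30,117,729"),
    ("SRR2048513", "SAMN03757544", "26,023,889"),
    ("SRR2048514", "SAMN03757544", "26,548,801"),
    ("SRR2048515", "SAMN03757544", "27,090,280"),
    ("SRR2048516", "SAMN03757544", "58,227,442"),
    ("SRR2048517", "SAMN03757544", "61,893,046"),
    ("SRR2048518", "SAMN03757544", "58,097,866"),
]

# Sum the reads of every run that belongs to the same BioSample.
reads_per_sample = defaultdict(int)
for run, sample, reads in runs:
    reads_per_sample[sample] += int(reads.replace(",", ""))

for sample, total in sorted(reads_per_sample.items()):
    print(f"{sample}\t{total:,}")
```

For SAMN03757544 the fifteen runs add up to 642,971,003 reads, the value listed for that sample in the by-sample table.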

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Larimichthys crocea high-quality model RefSeq (XP_)	18,161	17,744 (97.70%)	17,744 (97.70%)	72.63%	82.49%
Actinopterygii GenBank	86,840	53,375 (61.46%)	53,375 (61.46%)	68.90%	81.62%
Actinopterygii known RefSeq (NP_)	25,473	23,802 (93.44%)	23,802 (93.44%)	68.32%	79.50%
Danio rerio high-quality model RefSeq (XP_)	7,718	7,182 (93.06%)	7,182 (93.06%)	65.23%	75.01%
Perca flavescens high-quality model RefSeq (XP_)	16,027	15,561 (97.09%)	15,561 (97.09%)	73.92%	83.99%
Homo sapiens known RefSeq (NP_)	60,899	38,857 (63.81%)	38,857 (63.81%)	67.00%	70.61%
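
The percentages in the table above can be reproduced from the raw counts. The Python sketch below recomputes the share of retrieved sequences aligned by ProSplign for each source. It assumes the percentage is simply aligned divided by retrieved, rounded to two decimals, which matches every row of this table.

```python
# (source, sequences retrieved from Entrez, sequences aligned by ProSplign)
rows = [
    ("Larimichthys crocea high-quality model RefSeq (XP_)", 18_161, 17_744),
    ("Actinopterygii GenBank", 86_840, 53_375),
    ("Actinopterygii known RefSeq (NP_)", 25_473, 23_802),
    ("Danio rerio high-quality model RefSeq (XP_)", 7_718, 7_182),
    ("Perca flavescens high-quality model RefSeq (XP_)", 16_027, 15_561),
    ("Homo sapiens known RefSeq (NP_)", 60_899, 38_857),
]

# Percentage of retrieved sequences that were aligned, formatted as in the table.
percentages = [f"{100 * aligned / retrieved:.2f}" for _, retrieved, aligned in rows]

for (source, _, _), pct in zip(rows, percentages):
    print(f"{source}: {pct}%")
```

In this release the aligned and passed-to-Gnomon counts are identical for every source, so the same calculation reproduces both percentage columns.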

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
