NCBI Strongylocentrotus purpuratus Annotation Release 102

The RefSeq genome records for Strongylocentrotus purpuratus were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: The similarity of the current and previous assembly
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Strongylocentrotus purpuratus Annotation Release 102.

Annotation release ID: 102
Date of Entrez queries for transcripts and proteins: Sep 18 2019
Date of submission of annotation to the public databases: Sep 27 2019
Software version: 8.2

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Spur_5.0	GCF_000002235.5	Baylor College of Medicine	09-06-2019	Reference	1 assembled chromosome; unplaced scaffolds

Gene and feature statistics

Counts and length of annotated features are provided below for each assembly.

Feature counts

Feature	Spur_5.0
Genes and pseudogenes	33,503
protein-coding	27,447
non-coding	5,500
transcribed pseudogenes	10
non-transcribed pseudogenes	546
genes with variants	6,115
immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	38,426
fully-supported	31,214
with > 5% ab initio	4,947
partial	479
with filled gap(s)	103
known RefSeq (NM_)	390
model RefSeq (XM_)	38,036
non-coding RNAs	6,742
fully-supported	4,870
with > 5% ab initio	0
partial	1
with filled gap(s)	1
known RefSeq (NR_)	62
model RefSeq (XR_)	5,565
pseudo transcripts	10
fully-supported	10
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	10
CDSs	38,439
fully-supported	31,214
with > 5% ab initio	5,277
partial	422
with major correction(s)	445
known RefSeq (NP_)	403
model RefSeq (XP_)	38,036
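
The counts in this table can be cross-checked against each other. The sketch below is a minimal illustration in Python, not part of the annotation pipeline: the dictionary keys and function names are hypothetical, while the numbers are copied from the table above. It checks that the gene categories add up to the reported total and that the known and model RefSeq counts add up to the mRNA and CDS totals.

```python
# Counts copied from the "Feature counts" table for Spur_5.0.
# The key names are illustrative and are not NCBI field names.
COUNTS = {
    "genes_and_pseudogenes": 33_503,
    "protein_coding": 27_447,
    "non_coding": 5_500,
    "transcribed_pseudogenes": 10,
    "non_transcribed_pseudogenes": 546,
    "mrnas": 38_426,
    "mrna_known": 390,
    "mrna_model": 38_036,
    "cdss": 38_439,
    "cds_known": 403,
    "cds_model": 38_036,
}


def pseudogene_total(counts):
    """Sum of transcribed and non-transcribed pseudogenes."""
    return (counts["transcribed_pseudogenes"]
            + counts["non_transcribed_pseudogenes"])


def gene_total(counts):
    """Sum of all gene categories, pseudogenes included."""
    return (counts["protein_coding"]
            + counts["non_coding"]
            + pseudogene_total(counts))


def genes_excluding_pseudogenes(counts):
    """Gene total with pseudogenes removed."""
    return counts["genes_and_pseudogenes"] - pseudogene_total(counts)


if __name__ == "__main__":
    assert gene_total(COUNTS) == COUNTS["genes_and_pseudogenes"]
    assert COUNTS["mrna_known"] + COUNTS["mrna_model"] == COUNTS["mrnas"]
    assert COUNTS["cds_known"] + COUNTS["cds_model"] == COUNTS["cdss"]
    print(genes_excluding_pseudogenes(COUNTS))
```

Removing the 556 pseudogenes from the total leaves 32,947 genes, which is the gene count reported in the Feature lengths table.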

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	32,947	13,691	8,161	57	300,967
All transcripts	45,168	3,048	2,299	20	74,641
mRNA	38,426	3,386	2,588	99	74,641
misc_RNA	688	3,636	3,373	122	17,116
miRNA	69	22	22	20	27
tRNA	1,104	74	73	65	84
lncRNA	4,132	1,174	827	50	12,733
snoRNA	121	116	97	69	245
snRNA	296	149	162	63	198
rRNA	332	217	119	117	3,764
Single-exon transcripts	1,855	1,896	1,365	284	13,874
coding transcripts (NM_/XM_ )	1,855	1,896	1,365	284	13,874
CDSs	38,439	2,059	1,440	99	72,141
Exons	258,355	338	152	2	18,487
in coding transcripts (NM_/XM_ )	245,575	335	152	2	18,487
in non-coding transcripts (NR_/XR_ )	15,558	371	151	2	10,479
Introns	226,731	1,818	838	26	131,280
in coding transcripts (NM_/XM_ )	218,005	1,808	836	26	131,280
in non-coding transcripts (NR_/XR_ )	11,418	2,025	885	30	30,152

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.38	1	1	50
Number of exons per transcript	9.98	6	1	246

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 27,434 coding genes, 18,541 genes had a protein with an alignment covering 50% or more of the query and 3,898 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
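
The two definitions can be written as one formula applied to either sequence. The following Python sketch is a simplified illustration: it assumes a single contiguous alignment described by 1-based inclusive coordinates, and the function name and example lengths are hypothetical. An actual BLASTP hit may consist of several aligned segments, which this sketch does not handle.

```python
def coverage_percent(aligned_start, aligned_end, sequence_length):
    """Percent of a sequence included in one contiguous alignment.

    Coordinates are 1-based and inclusive. Use the query coordinates and
    query length for query coverage, and the target coordinates and
    target length for target coverage.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if not 1 <= aligned_start <= aligned_end <= sequence_length:
        raise ValueError("alignment coordinates fall outside the sequence")
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


if __name__ == "__main__":
    # Hypothetical alignment: residues 51-350 of a 400-residue annotated
    # protein (query) aligned to residues 1-300 of a 600-residue
    # curated protein (target).
    print(coverage_percent(51, 350, 400))  # query coverage
    print(coverage_percent(1, 300, 600))   # target coverage
```

In this hypothetical case the query coverage is 75% and the target coverage is 50%, so the gene would count toward the 50% query coverage threshold but not the 95% one.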

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Spur_5.0	GCF_000002235.5	19.29%	39.20%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	491	488 (99.39%)	437 (89.00%)	99.03%	97.86%
Same-species Genbank	1,396	1,385 (99.21%)	1,217 (87.18%)	98.76%	98.57%
Same-species EST	141,809	96,319 (67.92%)	65,918 (46.48%)	97.88%	97.74%
Echinoidea Genbank	1,275	797 (62.51%)	266 (20.86%)	91.64%	92.96%
Echinoidea EST	144,525	23,808 (16.47%)	12,616 (8.73%)	88.82%	95.50%

RefSeq transcript alignment quality report

The known RefSeq transcripts (NM_ and NR_ accessions) are a set of high-quality transcripts maintained by the RefSeq group at NCBI. Alignment statistics for this group of transcripts, such as percent and number of sequences not aligning at all, percent best alignments split between multiple scaffolds, and percent alignments not covering the full CDS are indicative of the genome quality and are provided below.

	Spur_5.0 Primary Assembly
Number of sequences retrieved from Entrez	491
Number (%) of sequences not aligning	3 (0.61%)
Number (%) of sequences with multiple best alignments (split genes)	6 (1.23%)
Number (%) of sequences with CDS coverage < 95%	39 (9.15%)

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	6,252,308,113	62%	13%	260,175
SAMN01096960	22709795,24291147	sea urchin, young juvenile (Strongylocentrotus purpuratus, SAMN01096960)	76,613,634	62%	11%	190,756
SAMN01103178	22709795,24291147	sea urchin embryos 24hpf (Strongylocentrotus purpuratus, SAMN01103178)	80,818,328	71%	19%	177,094
SAMN01103179	22709795,24291147	sea urchin embryos 18hpf (Strongylocentrotus purpuratus, SAMN01103179)	93,646,010	70%	15%	173,218
SAMN01103180	22709795,24291147	sea urchin embryos 10hpf (Strongylocentrotus purpuratus, SAMN01103180)	89,348,844	67%	15%	165,826
SAMN01103181	22709795,24291147	sea urchin embryos 72hpf (Strongylocentrotus purpuratus, SAMN01103181)	57,250,686	74%	19%	190,663
SAMN01103182	22709795,24291147	sea urchin, adult tissue, axial gland (Strongylocentrotus purpuratus, SAMN01103182)	79,791,290	56%	10%	158,339
SAMN01103183	22709795,24291147	sea urchin embryos 0hpf (Strongylocentrotus purpuratus, SAMN01103183)	47,355,442	70%	21%	142,262
SAMN01103184	22709795,24291147	sea urchin, adult tissue, coelomocyte (Strongylocentrotus purpuratus, SAMN01103184)	79,038,724	61%	11%	184,151
SAMN01103185	22709795,24291147	sea urchin larva, tube-foot protrusion stage (Strongylocentrotus purpuratus, SAMN01103185)	80,286,226	57%	12%	199,050
SAMN01103186	22709795,24291147	sea urchin embryos 56hpf (Strongylocentrotus purpuratus, SAMN01103186)	47,021,316	76%	23%	175,143
SAMN01103187	22709795,24291147	sea urchin, adult tissue, gut (Strongylocentrotus purpuratus, SAMN01103187)	81,418,680	60%	11%	191,900
SAMN01103188	22709795,24291147	sea urchin embryos 40hpf (Strongylocentrotus purpuratus, SAMN01103188)	56,244,358	76%	21%	181,024
SAMN01103189	22709795,24291147	sea urchin, post-metamorphosis (Strongylocentrotus purpuratus, SAMN01103189)	76,017,750	64%	10%	175,353
SAMN01103190	22709795,24291147	sea urchin, adult tissue, ovary (Strongylocentrotus purpuratus, SAMN01103190)	83,125,690	74%	17%	174,415
SAMN01103191	22709795,24291147	sea urchin embryos 48hpf (Strongylocentrotus purpuratus, SAMN01103191)	47,145,708	74%	21%	180,631
SAMN01103192	22709795,24291147	sea urchin embryos 64hpf (Strongylocentrotus purpuratus, SAMN01103192)	66,574,062	69%	17%	186,648
SAMN01103193	22709795,24291147	sea urchin, adult tissue, radial nerve (Strongylocentrotus purpuratus, SAMN01103193)	72,678,758	64%	8%	174,789
SAMN01103194	22709795,24291147	sea urchin larva, vestibular invagination stage (Strongylocentrotus purpuratus, SAMN01103194)	92,917,150	73%	19%	196,171
SAMN01103195	22709795,24291147	sea urchin embryos 30hpf (Strongylocentrotus purpuratus, SAMN01103195)	30,751,700	71%	17%	162,949
SAMN01103196	22709795,24291147	sea urchin, adult tissue, testes (Strongylocentrotus purpuratus, SAMN01103196)	78,345,470	64%	10%	179,725
SAMN01103197	22709795,24291147	sea urchin larva, pentagonal disc stage (Strongylocentrotus purpuratus, SAMN01103197)	81,975,694	57%	12%	190,444
SAMN01103198	22709795,24291147	sea urchin larva, four-arm stage (Strongylocentrotus purpuratus, SAMN01103198)	69,001,808	70%	23%	180,881
SAMN01907389	NA	S purpuratus transcriptome reads (Strongylocentrotus purpuratus, SAMN01907389)	57,554,898	76%	25%	203,351
SAMN02370307	NA	Alx1GFP negative cells, r1 (Strongylocentrotus purpuratus, SAMN02370307)	59,164,594	56%	8%	102,667
SAMN02370308	NA	Alx1GFP positive cells, r1 (Strongylocentrotus purpuratus, SAMN02370308)	52,288,818	47%	9%	96,439
SAMN02370309	NA	TbrGFP negative cells, r1 (Strongylocentrotus purpuratus, SAMN02370309)	78,493,790	57%	7%	122,431
SAMN02370310	NA	TbrGFP positive cells, r1 (Strongylocentrotus purpuratus, SAMN02370310)	82,005,692	50%	7%	64,997
SAMN02370311	NA	TbrGFP negative cells, r2 (Strongylocentrotus purpuratus, SAMN02370311)	40,419,374	53%	10%	127,880
SAMN02370312	NA	TbrGFP positive cells, r2 (Strongylocentrotus purpuratus, SAMN02370312)	38,748,870	45%	8%	130,572
SAMN02371530	NA	Primary Mesenchyme Cells, (Strongylocentrotus purpuratus, 24 hours post fertilization, SAMN02371530)	60,659,290	59%	6%	163,529
SAMN02371531	NA	Primary Mesenchyme Cells, (Strongylocentrotus purpuratus, 24 hours post fertilization, SAMN02371531)	65,417,938	66%	6%	155,064
SAMN02371532	NA	Primary Mesenchyme Cells, (Strongylocentrotus purpuratus, 24 hours post fertilization, SAMN02371532)	65,667,490	70%	7%	165,813
SAMN02371533	NA	Primary Mesenchyme Cells, (Strongylocentrotus purpuratus, 24 hours post fertilization, SAMN02371533)	70,356,744	70%	6%	158,326
SAMN03287632	NA	embryonic, Eve positive cells, 25hpf, r1, (Strongylocentrotus purpuratus, SAMN03287632)	75,895,632	44%	3%	81,951
SAMN03287633	NA	embryonic, Eve negative cells, 25hpf, r1, (Strongylocentrotus purpuratus, SAMN03287633)	73,343,302	46%	6%	108,004
SAMN03287636	NA	embryonic, Gcm positive cells, 45hpf, r1, (Strongylocentrotus purpuratus, SAMN03287636)	184,350,556	55%	9%	71,342
SAMN03287637	NA	embryonic, Gcm negative cells, 45hpf, r1, (Strongylocentrotus purpuratus, SAMN03287637)	91,667,376	48%	11%	146,485
SAMN03287638	NA	embryonic, Gsc positive cells, 35hpf, r1, (Strongylocentrotus purpuratus, SAMN03287638)	61,394,200	55%	12%	130,644
SAMN03287639	NA	embryonic, Gsc negative cells, 35hpf, r1, (Strongylocentrotus purpuratus, SAMN03287639)	56,417,690	55%	11%	153,325
SAMN03287640	NA	embryonic, Onecut positive cells, 35hpf, r1, (Strongylocentrotus purpuratus, SAMN03287640)	53,171,912	60%	11%	158,201
SAMN03287641	NA	embryonic, Onecut negative cells, 35hpf, r1, (Strongylocentrotus purpuratus, SAMN03287641)	35,025,870	58%	12%	152,641
SAMN03287642	NA	embryonic, Onecut positive cells, 35hpf, r3, (Strongylocentrotus purpuratus, SAMN03287642)	55,344,488	51%	10%	136,321
SAMN03287643	NA	embryonic, Onecut negative cells, 35hpf, r3, (Strongylocentrotus purpuratus, SAMN03287643)	83,759,108	60%	10%	168,299
SAMN03287644	NA	embryonic, Lhx2 positive cells, 35hpf, r2, (Strongylocentrotus purpuratus, SAMN03287644)	125,767,076	50%	10%	35,463
SAMN03287645	NA	embryonic, Lhx2 negative cells, 35hpf, r2, (Strongylocentrotus purpuratus, SAMN03287645)	133,577,828	49%	12%	169,457
SAMN04217001	26800861	FOOT (Strongylocentrotus purpuratus, SAMN04217001)	38,738,340	42%	9%	33,605
SAMN04346832	26657764	whole embryo (Strongylocentrotus purpuratus, SAMN04346832)	55,311,890	66%	6%	147,445
SAMN04346833	26657764	whole embryo (Strongylocentrotus purpuratus, SAMN04346833)	55,082,130	67%	5%	143,505
SAMN06562506	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562506)	33,397,244	63%	6%	137,913
SAMN06562507	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562507)	32,519,052	63%	5%	132,895
SAMN06562508	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562508)	30,125,974	63%	6%	129,436
SAMN06562509	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562509)	32,948,190	62%	5%	126,649
SAMN06562510	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562510)	32,575,504	59%	5%	128,713
SAMN06562511	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562511)	29,105,044	58%	5%	119,569
SAMN06562512	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562512)	35,329,878	69%	11%	165,936
SAMN06562513	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562513)	30,127,648	65%	7%	141,001
SAMN06562514	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562514)	30,551,356	67%	8%	152,254
SAMN06562515	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562515)	36,276,484	62%	6%	138,177
SAMN06562516	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562516)	34,985,508	62%	6%	139,625
SAMN06562517	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562517)	31,353,704	60%	5%	130,216
SAMN06562518	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562518)	28,691,396	61%	5%	116,896
SAMN06562519	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562519)	28,857,988	61%	5%	118,367
SAMN06562520	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562520)	29,282,806	58%	5%	108,970
SAMN06562521	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562521)	31,601,162	57%	5%	110,226
SAMN06562522	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562522)	33,596,294	55%	4%	111,297
SAMN06562523	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562523)	27,646,060	57%	5%	113,851
SAMN06562524	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562524)	32,958,712	70%	11%	162,309
SAMN06562525	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562525)	29,483,018	68%	11%	157,011
SAMN06562526	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562526)	26,096,824	70%	10%	152,813
SAMN06562527	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562527)	34,885,400	70%	10%	162,784
SAMN06562528	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562528)	33,168,278	70%	12%	161,503
SAMN06562529	NA	whole animal (Strongylocentrotus purpuratus, 30 hpf, SAMN06562529)	27,310,964	69%	11%	155,990
SAMN06671004	29180615	Esrp SplMO knockdown 24hpf sea urchin embryos replicate 2 run 1 (Strongylocentrotus purpuratus, SAMN06671004)	174,722,976	67%	22%	198,038
SAMN06671005	29180615	Esrp SplMO knockdown 24hpf sea urchin embryos replicate 1 run 1 (Strongylocentrotus purpuratus, SAMN06671005)	165,246,936	64%	24%	198,802
SAMN06671006	29180615	Control 24hpf sea urchin embryos replicate 2 run 1 (Strongylocentrotus purpuratus, SAMN06671006)	145,846,880	70%	24%	198,193
SAMN06671007	29180615	Control 24hpf sea urchin embryos replicate 1 run 1 (Strongylocentrotus purpuratus, SAMN06671007)	154,788,662	69%	24%	195,311
SAMN08812648	NA	Whole organisms (Strongylocentrotus purpuratus, SAMN08812648)	58,542,722	26%	32%	33,990
SAMN08812649	NA	Whole organisms (Strongylocentrotus purpuratus, SAMN08812649)	53,445,786	44%	13%	25,794
SAMN08812650	NA	Whole organisms (Strongylocentrotus purpuratus, SAMN08812650)	49,371,770	29%	44%	39,046
SAMN08812651	NA	Whole organisms (Strongylocentrotus purpuratus, SAMN08812651)	49,921,728	29%	43%	31,041
SAMN11355803	NA	eggs (Strongylocentrotus purpuratus, SAMN11355803)	9,805,080	72%	18%	100,323
SAMN11355804	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355804)	10,390,794	73%	18%	107,031
SAMN11355805	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355805)	11,352,628	73%	18%	116,670
SAMN11355806	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355806)	14,251,238	74%	17%	122,305
SAMN11355807	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355807)	16,511,700	74%	17%	124,493
SAMN11355808	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355808)	15,226,846	74%	17%	122,772
SAMN11355809	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355809)	16,375,318	74%	16%	123,255
SAMN11355810	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355810)	15,236,192	73%	16%	122,398
SAMN11355811	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355811)	10,023,708	71%	17%	114,020
SAMN11355812	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355812)	33,548,552	67%	12%	130,690
SAMN11355813	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355813)	14,353,426	71%	16%	130,819
SAMN11355814	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355814)	14,270,382	73%	18%	147,555
SAMN11355815	NA	pooled embryos (Strongylocentrotus purpuratus, SAMN11355815)	10,513,994	72%	16%	133,981
SAMN12137531	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137531)	28,977,893	65%	12%	155,853
SAMN12137532	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137532)	30,324,052	64%	12%	155,483
SAMN12137533	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137533)	30,101,403	66%	13%	157,230
SAMN12137534	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137534)	32,276,394	62%	10%	153,959
SAMN12137535	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137535)	27,118,175	62%	10%	149,754
SAMN12137536	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137536)	30,091,439	62%	10%	153,759
SAMN12137537	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137537)	29,309,423	60%	10%	148,956
SAMN12137538	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137538)	28,503,151	62%	11%	151,253
SAMN12137539	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137539)	29,236,051	61%	10%	146,511
SAMN12137540	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137540)	32,846,560	62%	10%	154,650
SAMN12137541	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137541)	29,505,476	60%	10%	147,546
SAMN12137542	NA	larval pool (Strongylocentrotus purpuratus, SAMN12137542)	28,379,978	58%	8%	136,554
SAMN12284739	NA	W1 (Strongylocentrotus purpuratus, SAMN12284739)	227,887,876	66%	11%	117,408
SAMN12284740	NA	DAPT1 (Strongylocentrotus purpuratus, SAMN12284740)	229,121,050	63%	9%	88,097
SAMN12284741	NA	DMSO1 (Strongylocentrotus purpuratus, SAMN12284741)	225,015,162	65%	7%	98,995
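
For downstream analysis, rows of the per-sample table need to be converted from display strings (thousands separators, percent signs) into numbers. The Python sketch below shows one way to do that, assuming the table is available as tab-separated text with the seven columns listed in the header; the class and function names are hypothetical. The derived aligned-read count is only an estimate, because the published percentages are rounded to whole numbers.

```python
from dataclasses import dataclass


@dataclass
class SampleRow:
    sample_id: str
    publication: str
    track_name: str
    reads: int
    percent_aligned: int
    percent_with_introns: int
    introns: int


def parse_sample_row(line):
    """Parse one tab-separated row of the per-sample statistics table."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 7:
        raise ValueError(f"expected 7 tab-separated fields, got {len(fields)}")
    sample_id, publication, track, reads, aligned, with_introns, introns = fields
    return SampleRow(
        sample_id=sample_id,
        publication=publication,
        track_name=track,
        reads=int(reads.replace(",", "")),
        percent_aligned=int(aligned.rstrip("%")),
        percent_with_introns=int(with_introns.rstrip("%")),
        introns=int(introns.replace(",", "")),
    )


def approx_aligned_reads(row):
    """Estimate aligned reads from the rounded percentage (approximate)."""
    return round(row.reads * row.percent_aligned / 100)


if __name__ == "__main__":
    # First sample row of the table, with tabs written as \t.
    line = ("SAMN01096960\t22709795,24291147\t"
            "sea urchin, young juvenile (Strongylocentrotus purpuratus, "
            "SAMN01096960)\t76,613,634\t62%\t11%\t190,756")
    row = parse_sample_row(line)
    print(row.sample_id, row.reads, approx_aligned_reads(row))
```

Splitting on tabs rather than commas matters here, since the publication and track name columns both contain commas.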

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR531843	SRX173220	SRP014690	SAMN01096960	76,613,634	62%	11%
SRR531853	SRX173252	SRP014690	SAMN01103178	39,658,162	72%	19%
SRR531948	SRX173252	SRP014690	SAMN01103178	41,160,166	71%	19%
SRR531860	SRX173253	SRP014690	SAMN01103179	93,646,010	70%	15%
SRR531949	SRX173266	SRP014690	SAMN01103180	89,348,844	67%	15%
SRR531950	SRX173267	SRP014690	SAMN01103181	57,250,686	74%	19%
SRR531951	SRX173268	SRP014690	SAMN01103182	79,791,290	56%	10%
SRR531952	SRX173269	SRP014690	SAMN01103183	47,355,442	70%	21%
SRR531953	SRX173270	SRP014690	SAMN01103184	79,038,724	61%	11%
SRR533746	SRX173272	SRP014690	SAMN01103185	80,286,226	57%	12%
SRR531954	SRX173273	SRP014690	SAMN01103186	47,021,316	76%	23%
SRR531955	SRX173274	SRP014690	SAMN01103187	81,418,680	60%	11%
SRR531956	SRX173275	SRP014690	SAMN01103188	56,244,358	76%	21%
SRR531957	SRX173276	SRP014690	SAMN01103189	76,017,750	64%	10%
SRR531958	SRX173277	SRP014690	SAMN01103190	83,125,690	74%	17%
SRR531964	SRX173278	SRP014690	SAMN01103191	47,145,708	74%	21%
SRR531996	SRX173279	SRP014690	SAMN01103192	66,574,062	69%	17%
SRR532046	SRX173280	SRP014690	SAMN01103193	72,678,758	64%	8%
SRR532055	SRX173281	SRP014690	SAMN01103194	92,917,150	73%	19%
SRR532074	SRX173282	SRP014690	SAMN01103195	30,751,700	71%	17%
SRR532121	SRX173283	SRP014690	SAMN01103196	78,345,470	64%	10%
SRR532143	SRX173284	SRP014690	SAMN01103197	81,975,694	57%	12%
SRR532151	SRX173285	SRP014690	SAMN01103198	69,001,808	70%	23%
SRR1012313	SRX364633	SRP031458	SAMN02370307	59,164,594	56%	8%
SRR1012339	SRX364659	SRP031458	SAMN02370308	52,288,818	47%	9%
SRR1012340	SRX364662	SRP031458	SAMN02370309	78,493,790	57%	7%
SRR1012342	SRX364663	SRP031458	SAMN02370310	82,005,692	50%	7%
SRR1012401	SRX364664	SRP031458	SAMN02370311	40,419,374	53%	10%
SRR1012403	SRX364720	SRP031458	SAMN02370312	38,748,870	45%	8%
SRR1042899	SRX385990	SRP033427	SAMN02371530	60,659,290	59%	6%
SRR1042838	SRX385992	SRP033427	SAMN02371531	65,417,938	66%	6%
SRR1042830	SRX385986	SRP033427	SAMN02371532	65,667,490	70%	7%
SRR1042834	SRX385988	SRP033427	SAMN02371533	70,356,744	70%	6%
SRR1139792	SRX446358	SRP034874	SAMN01907389	57,554,898	76%	25%
SRR1765910	SRX847426	SRP052830	SAMN03287632	75,895,632	44%	3%
SRR1765938	SRX847465	SRP052830	SAMN03287633	73,343,302	46%	6%
SRR1765978	SRX847507	SRP052830	SAMN03287636	184,350,556	55%	9%
SRR1765979	SRX847509	SRP052830	SAMN03287637	91,667,376	48%	11%
SRR1765980	SRX847510	SRP052830	SAMN03287638	61,394,200	55%	12%
SRR1765981	SRX847511	SRP052830	SAMN03287639	56,417,690	55%	11%
SRR1765982	SRX847512	SRP052830	SAMN03287640	53,171,912	60%	11%
SRR1765983	SRX847513	SRP052830	SAMN03287641	35,025,870	58%	12%
SRR1765984	SRX847514	SRP052830	SAMN03287642	55,344,488	51%	10%
SRR1765986	SRX847515	SRP052830	SAMN03287643	83,759,108	60%	10%
SRR1765988	SRX847517	SRP052830	SAMN03287644	125,767,076	50%	10%
SRR1765991	SRX847519	SRP052830	SAMN03287645	133,577,828	49%	12%
SRR2846101	SRX1392822	SRP065431	SAMN04217001	38,738,340	42%	9%
SRR3017856	SRX1485315	SRP067439	SAMN04346832	55,311,890	66%	6%
SRR3017857	SRX1485316	SRP067439	SAMN04346833	55,082,130	67%	5%
SRR5398460	SRX2693206	SRP102812	SAMN06671004	174,722,976	67%	22%
SRR5398459	SRX2693205	SRP102812	SAMN06671005	165,246,936	64%	24%
SRR5398458	SRX2693204	SRP102812	SAMN06671006	145,846,880	70%	24%
SRR5398457	SRX2693203	SRP102812	SAMN06671007	154,788,662	69%	24%
SRR6466760	SRX3556726	SRP128972	SAMN06562506	33,397,244	63%	6%
SRR6466759	SRX3556727	SRP128972	SAMN06562507	32,519,052	63%	5%
SRR6466762	SRX3556724	SRP128972	SAMN06562508	30,125,974	63%	6%
SRR6466761	SRX3556725	SRP128972	SAMN06562509	32,948,190	62%	5%
SRR6466755	SRX3556730	SRP128972	SAMN06562510	32,575,504	59%	5%
SRR6466758	SRX3556731	SRP128972	SAMN06562511	29,105,044	58%	5%
SRR6466757	SRX3556728	SRP128972	SAMN06562512	35,329,878	69%	11%
SRR6466756	SRX3556729	SRP128972	SAMN06562513	30,127,648	65%	7%
SRR6466763	SRX3556722	SRP128972	SAMN06562514	30,551,356	67%	8%
SRR6466766	SRX3556723	SRP128972	SAMN06562515	36,276,484	62%	6%
SRR6466776	SRX3556710	SRP128972	SAMN06562516	34,985,508	62%	6%
SRR6466775	SRX3556711	SRP128972	SAMN06562517	31,353,704	60%	5%
SRR6466773	SRX3556712	SRP128972	SAMN06562518	28,691,396	61%	5%
SRR6466772	SRX3556713	SRP128972	SAMN06562519	28,857,988	61%	5%
SRR6466771	SRX3556714	SRP128972	SAMN06562520	29,282,806	58%	5%
SRR6466774	SRX3556715	SRP128972	SAMN06562521	31,601,162	57%	5%
SRR6466770	SRX3556716	SRP128972	SAMN06562522	33,596,294	55%	4%
SRR6466769	SRX3556717	SRP128972	SAMN06562523	27,646,060	57%	5%
SRR6466778	SRX3556708	SRP128972	SAMN06562524	32,958,712	70%	11%
SRR6466777	SRX3556709	SRP128972	SAMN06562525	29,483,018	68%	11%
SRR6466767	SRX3556719	SRP128972	SAMN06562526	26,096,824	70%	10%
SRR6466768	SRX3556718	SRP128972	SAMN06562527	34,885,400	70%	10%
SRR6466764	SRX3556721	SRP128972	SAMN06562528	33,168,278	70%	12%
SRR6466765	SRX3556720	SRP128972	SAMN06562529	27,310,964	69%	11%
SRR6912802	SRX3860695	SRP136671	SAMN08812648	58,542,722	26%	32%
SRR6912801	SRX3860696	SRP136671	SAMN08812649	53,445,786	44%	13%
SRR6912800	SRX3860697	SRP136671	SAMN08812650	49,371,770	29%	44%
SRR6912799	SRX3860698	SRP136671	SAMN08812651	49,921,728	29%	43%
SRR8863032	SRX5650447	SRP191285	SAMN11355803	9,805,080	72%	18%
SRR8863033	SRX5650446	SRP191285	SAMN11355804	10,390,794	73%	18%
SRR8863030	SRX5650449	SRP191285	SAMN11355805	11,352,628	73%	18%
SRR8863031	SRX5650448	SRP191285	SAMN11355806	14,251,238	74%	17%
SRR8863036	SRX5650443	SRP191285	SAMN11355807	16,511,700	74%	17%
SRR8863037	SRX5650442	SRP191285	SAMN11355808	15,226,846	74%	17%
SRR8863034	SRX5650445	SRP191285	SAMN11355809	16,375,318	74%	16%
SRR8863035	SRX5650444	SRP191285	SAMN11355810	15,236,192	73%	16%
SRR8863038	SRX5650441	SRP191285	SAMN11355811	10,023,708	71%	17%
SRR8863039	SRX5650440	SRP191285	SAMN11355812	33,548,552	67%	12%
SRR8863028	SRX5650451	SRP191285	SAMN11355813	14,353,426	71%	16%
SRR8863029	SRX5650450	SRP191285	SAMN11355814	14,270,382	73%	18%
SRR8863027	SRX5650452	SRP191285	SAMN11355815	10,513,994	72%	16%
SRR9595519	SRX6361208	SRP211938	SAMN12137531	28,977,893	65%	12%
SRR9595518	SRX6361209	SRP211938	SAMN12137532	30,324,052	64%	12%
SRR9595517	SRX6361210	SRP211938	SAMN12137533	30,101,403	66%	13%
SRR9595516	SRX6361211	SRP211938	SAMN12137534	32,276,394	62%	10%
SRR9595514	SRX6361212	SRP211938	SAMN12137535	27,118,175	62%	10%
SRR9595513	SRX6361213	SRP211938	SAMN12137536	30,091,439	62%	10%
SRR9595512	SRX6361214	SRP211938	SAMN12137537	29,309,423	60%	10%
SRR9595515	SRX6361215	SRP211938	SAMN12137538	28,503,151	62%	11%
SRR9595511	SRX6361216	SRP211938	SAMN12137539	29,236,051	61%	10%
SRR9595510	SRX6361217	SRP211938	SAMN12137540	32,846,560	62%	10%
SRR9595501	SRX6361226	SRP211938	SAMN12137541	29,505,476	60%	10%
SRR9595500	SRX6361227	SRP211938	SAMN12137542	28,379,978	58%	8%
SRR9693266	SRX6451540	SRP214801	SAMN12284739	227,887,876	66%	11%
SRR9693265	SRX6451539	SRP214801	SAMN12284740	229,121,050	63%	9%
SRR9693264	SRX6451538	SRP214801	SAMN12284741	225,015,162	65%	7%

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Saccoglossus kowalevskii GenBank	271	218 (80.44%)	218 (80.44%)	65.20%	50.07%
Saccoglossus kowalevskii high-quality model RefSeq (XP_)	6,124	4,445 (72.58%)	4,445 (72.58%)	70.05%	67.67%
Saccoglossus kowalevskii known RefSeq (NP_)	474	427 (90.08%)	427 (90.08%)	70.69%	61.51%
Trichoplax adhaerens GenBank	90	80 (88.89%)	80 (88.89%)	68.82%	83.68%
Crassostrea gigas GenBank	750	390 (52.00%)	390 (52.00%)	69.56%	78.49%
Crassostrea gigas high-quality model RefSeq (XP_)	22,081	11,300 (51.18%)	11,300 (51.18%)	57.43%	39.30%
Crassostrea gigas known RefSeq (NP_)	147	108 (73.47%)	108 (73.47%)	68.26%	67.27%
Nematostella vectensis GenBank	447	360 (80.54%)	360 (80.54%)	65.26%	48.27%
Saccharomyces cerevisiae S288C known RefSeq (NP_)	5,983	1,618 (27.04%)	1,618 (27.04%)	59.99%	51.80%
Hydra vulgaris GenBank	577	302 (52.34%)	302 (52.34%)	64.61%	54.80%
Hydra vulgaris known RefSeq (NP_)	198	119 (60.10%)	119 (60.10%)	58.28%	43.05%
Schistosoma mansoni GenBank	1,385	518 (37.40%)	518 (37.40%)	61.01%	59.77%
Caenorhabditis elegans GenBank	2,395	1,327 (55.41%)	1,327 (55.41%)	60.15%	46.02%
Caenorhabditis elegans known RefSeq (NP_)	28,563	8,799 (30.81%)	8,799 (30.81%)	61.13%	44.84%
Drosophila melanogaster GenBank	28,017	12,955 (46.24%)	12,955 (46.24%)	60.50%	46.41%
Drosophila melanogaster known RefSeq (NP_)	30,546	14,785 (48.40%)	14,785 (48.40%)	62.46%	48.63%
Echinoidea GenBank	1,193	1,163 (97.49%)	1,163 (97.49%)	75.42%	86.73%
Same-species GenBank	1,315	1,291 (98.17%)	1,291 (98.17%)	87.50%	94.21%
Same-species known RefSeq (NP_)	426	411 (96.48%)	411 (96.48%)	86.18%	87.98%
Ciona intestinalis GenBank	1,265	753 (59.53%)	753 (59.53%)	61.46%	41.41%
Ciona intestinalis high-quality model RefSeq (XP_)	11,388	6,530 (57.34%)	6,530 (57.34%)	58.35%	43.20%
Ciona intestinalis known RefSeq (NP_)	942	611 (64.86%)	611 (64.86%)	61.12%	42.77%
Branchiostoma floridae GenBank	457	334 (73.09%)	334 (73.09%)	66.05%	49.35%
Homo sapiens GenBank	130,619	73,762 (56.47%)	73,762 (56.47%)	62.07%	50.96%
Homo sapiens known RefSeq (NP_)	54,125	34,379 (63.52%)	34,379 (63.52%)	60.61%	46.93%

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

	First Pass	Total
Spur_5.0 (Current) coverage	70.84%	83.42%
Spur_4.2 (Previous) coverage	72.24%	79.55%
Percent identity	97.90%	97.20%
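
The reciprocal-best-hit criterion used for the 'First pass' alignments can be illustrated with a small Python sketch. This is a generic illustration of the concept and not the NCBI implementation: the region names, the scores, and the rule of keeping the first-seen alignment on ties are all assumptions made for the example.

```python
def reciprocal_best_hits(scores):
    """Return (current, previous) pairs that are each other's best hit.

    scores maps (current_region, previous_region) to an alignment score.
    On ties, the pair encountered first is kept (an arbitrary choice).
    """
    best_for_current = {}
    best_for_previous = {}
    for (cur, prev), score in scores.items():
        if (cur not in best_for_current
                or score > scores[(cur, best_for_current[cur])]):
            best_for_current[cur] = prev
        if (prev not in best_for_previous
                or score > scores[(best_for_previous[prev], prev)]):
            best_for_previous[prev] = cur
    return sorted(
        (cur, prev)
        for cur, prev in best_for_current.items()
        if best_for_previous[prev] == cur
    )


if __name__ == "__main__":
    # Hypothetical scores between regions of the current (c*) and
    # previous (p*) assemblies.
    scores = {
        ("c1", "p1"): 90,
        ("c1", "p2"): 60,
        ("c2", "p1"): 80,
        ("c2", "p2"): 70,
        ("c3", "p2"): 75,
    }
    print(reciprocal_best_hits(scores))
```

In this example c2 aligns best to p1, but p1 aligns best to c1, so c2 has no reciprocal best hit. An alignment of that kind is what the 'Second pass' adds to the 'Total' figures.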

Comparison of the current and previous annotations

The annotation produced for this release (102) was compared to the annotation in the previous release (101) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Spur_5.0 (Current) to Spur_4.2 (Previous)
Identical	2%
Minor changes	33%
Major changes	26%
New	31%
Deprecated	32%
Other	9%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
