NCBI Citrus sinensis Annotation Release 102

The RefSeq genome records for Citrus sinensis were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Citrus sinensis Annotation Release 102.

Annotation release ID: 102
Date of Entrez queries for transcripts and proteins: May 7 2018
Date of submission of annotation to the public databases: May 16 2018
Software version: 8.0

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Csi_valencia_1.0	GCF_000317415.1	China sweet orange genome project	12-12-2012	Reference	10 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	Csi_valencia_1.0
Genes and pseudogenes	30,113
protein-coding	24,543
non-coding	4,113
transcribed pseudogenes	49
non-transcribed pseudogenes	1,408
genes with variants	8,031
immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	38,969
fully-supported	34,680
with > 5% ab initio	3,507
partial	324
with filled gap(s)	2
known RefSeq (NM_)	126
model RefSeq (XM_)	38,843
non-coding RNAs	8,171
fully-supported	6,636
with > 5% ab initio	0
partial	2
with filled gap(s)	0
known RefSeq (NR_)	23
model RefSeq (XR_)	7,707
pseudo transcripts	49
fully-supported	41
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	49
CDSs	39,056
fully-supported	34,680
with > 5% ab initio	3,582
partial	315
with major correction(s)	229
known RefSeq (NP_)	126
model RefSeq (XP_)	38,930
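
The totals in the table above can be checked against their parts. The sketch below is a minimal Python check, with all values copied from the table; the variable and function names are chosen here for illustration and are not part of the pipeline. It assumes that the four gene categories partition the "Genes and pseudogenes" total, and that known plus model RefSeq records account for all mRNAs and all CDSs.

```python
# Counts copied from the Feature counts table for Csi_valencia_1.0.
gene_categories = {
    "protein-coding": 24_543,
    "non-coding": 4_113,
    "transcribed pseudogenes": 49,
    "non-transcribed pseudogenes": 1_408,
}
genes_and_pseudogenes = 30_113

mrna_total, mrna_known, mrna_model = 38_969, 126, 38_843
cds_total, cds_known, cds_model = 39_056, 126, 38_930


def parts_match_total(parts, total):
    """Return True when the parts sum exactly to the reported total."""
    return sum(parts) == total


gene_check = parts_match_total(gene_categories.values(), genes_and_pseudogenes)
mrna_check = parts_match_total([mrna_known, mrna_model], mrna_total)
cds_check = parts_match_total([cds_known, cds_model], cds_total)
print(gene_check, mrna_check, cds_check)
```

The same check does not apply to the non-coding RNAs row, where known (NR_) and model (XR_) records add up to fewer than the 8,171 reported.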

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	28,656	3,714	2,572	23	577,272
All transcripts	47,140	1,882	1,651	20	16,759
mRNA	38,969	1,989	1,733	177	16,759
misc_RNA	2,922	2,312	1,980	105	9,684
miRNA	23	21	21	20	24
tRNA	433	73	73	23	93
lncRNA	3,692	1,162	832	81	8,413
snoRNA	998	107	107	64	216
snRNA	69	142	151	100	196
rRNA	34	708	156	103	3,475
Single-exon transcripts	4,036	1,351	1,155	240	7,900
coding transcripts (NM_/XM_ )	4,035	1,351	1,155	240	7,900
non-coding transcripts (NR_/XR_ )	1	1,536	1,536	1,536	1,536
CDSs	39,056	1,454	1,197	90	16,299
Exons	167,900	336	177	1	10,435
in coding transcripts (NM_/XM_ )	154,850	333	173	1	10,435
in non-coding transcripts (NR_/XR_ )	22,298	303	164	2	6,304
Introns	131,810	548	190	30	575,627
in coding transcripts (NM_/XM_ )	123,673	534	187	30	575,627
in non-coding transcripts (NR_/XR_ )	17,186	597	222	30	302,930

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.66	1	1	40
Number of exons per transcript	6.48	5	1	79

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Arabidopsis thaliana known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 24,456 coding genes, 22,468 had a protein with an alignment covering 50% or more of the query and 10,430 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
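
The two coverage definitions above can be written as one small function. This is a minimal sketch, not the pipeline's own code: it assumes coverage is measured over the aligned span in 1-based inclusive coordinates, and the report does not say how gaps inside the alignment are counted. The function name and the example coordinates are hypothetical.

```python
def coverage_percent(aligned_start, aligned_end, sequence_length):
    """Percent of a sequence spanned by an alignment.

    Coordinates are 1-based and inclusive (an assumption of this sketch).
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


# Hypothetical alignment: residues 11-410 of a 500-residue annotated protein
# (query) against residues 1-400 of a 420-residue reference protein (target).
query_cov = coverage_percent(11, 410, 500)
target_cov = coverage_percent(1, 400, 420)
print(f"query coverage {query_cov:.1f}%, target coverage {target_cov:.1f}%")
```

In this made-up case the alignment passes the 50% query-coverage threshold but not the 95% one, while the target coverage is just above 95%.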

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Arabidopsis thaliana known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Csi_valencia_1.0	GCF_000317415.1	2.70%	28.73%
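
As an illustration of what a "% masked" figure measures, the sketch below computes the masked percentage of a sequence string. It assumes soft masking, where masked bases are written in lowercase. That is a common convention, but the report does not state how these percentages were derived, and the function name and toy sequence are hypothetical.

```python
def percent_masked(sequence):
    """Percent of bases that are soft-masked (lowercase) in a sequence string."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base.islower())
    return 100.0 * masked / len(sequence)


# Toy sequence: 6 lowercase (masked) bases out of 16.
toy_sequence = "ACGTacgtNNNNacGT"
print(f"{percent_masked(toy_sequence):.2f}% masked")
```

Uppercase N characters count towards the sequence length here but not towards the masked total; a real calculation would need to decide how to treat assembly gaps.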

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	152	152 (100.00%)	148 (97.37%)	98.97%	99.55%
Same-species GenBank	591	577 (97.63%)	476 (80.54%)	98.95%	97.48%
Same-species EST	214,579	150,717 (70.24%)	128,885 (60.06%)	98.61%	98.00%
Citrus GenBank	1,019	974 (95.58%)	702 (68.89%)	98.54%	98.17%
Citrus EST	354,120	218,747 (61.77%)	204,724 (57.81%)	97.48%	97.36%
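
The percentages in the table above can be reproduced from the counts. In the minimal sketch below the counts are copied from the table and the names are illustrative. It shows that both the "aligned by Splign" and the "passed to Gnomon" percentages are relative to the number of sequences retrieved from Entrez.

```python
# (retrieved from Entrez, aligned by Splign, passed to Gnomon), from the table.
transcript_sources = {
    "Same-species known RefSeq (NM_/NR_)": (152, 152, 148),
    "Same-species GenBank": (591, 577, 476),
    "Same-species EST": (214_579, 150_717, 128_885),
    "Citrus GenBank": (1_019, 974, 702),
    "Citrus EST": (354_120, 218_747, 204_724),
}


def percent_of_retrieved(count, retrieved):
    """Express a count as a percentage of the sequences retrieved, to two decimals."""
    return round(100.0 * count / retrieved, 2)


recomputed = {
    source: (
        percent_of_retrieved(aligned, retrieved),
        percent_of_retrieved(passed, retrieved),
    )
    for source, (retrieved, aligned, passed) in transcript_sources.items()
}

for source, (aligned_pct, passed_pct) in recomputed.items():
    print(f"{source}: aligned {aligned_pct:.2f}%, passed to Gnomon {passed_pct:.2f}%")
```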

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	5,174,077,673	84%	28%	164,200
SAMEA2612304	NA	Navelina open flower (Citrus sinensis, SAMEA2612304)	116,252,938	87%	19%	138,510
SAMN02152588	NA	fruit (Citrus sinensis, SAMN02152588)	59,302,836	81%	21%	123,835
SAMN02152589	NA	fruit (Citrus sinensis, SAMN02152589)	57,847,438	57%	22%	116,365
SAMN02152590	NA	fruit (Citrus sinensis, SAMN02152590)	58,400,764	81%	23%	124,366
SAMN02152591	NA	fruit (Citrus sinensis, SAMN02152591)	61,870,162	83%	22%	127,085
SAMN02152592	NA	fruit (Citrus sinensis, SAMN02152592)	42,448,392	76%	20%	108,613
SAMN02152593	NA	fruit (Citrus sinensis, SAMN02152593)	24,821,354	76%	21%	103,902
SAMN02152594	NA	fruit (Citrus sinensis, SAMN02152594)	31,741,484	78%	20%	106,068
SAMN02152595	NA	fruit (Citrus sinensis, SAMN02152595)	31,250,608	72%	20%	105,260
SAMN02152596	NA	leaf (Citrus sinensis, SAMN02152596)	72,427,648	77%	21%	120,522
SAMN02152597	NA	leaf (Citrus sinensis, SAMN02152597)	64,574,052	86%	22%	127,007
SAMN02152598	NA	leaf (Citrus sinensis, SAMN02152598)	72,069,278	84%	22%	128,362
SAMN02152599	NA	leaf (Citrus sinensis, SAMN02152599)	46,379,294	82%	20%	122,256
SAMN02152600	NA	leaf (Citrus sinensis, SAMN02152600)	63,975,320	85%	21%	123,380
SAMN02152601	NA	leaf (Citrus sinensis, SAMN02152601)	68,517,388	83%	21%	123,516
SAMN02152602	NA	leaf (Citrus sinensis, SAMN02152602)	67,160,276	63%	19%	113,769
SAMN02152603	NA	leaf (Citrus sinensis, SAMN02152603)	65,826,060	72%	19%	115,681
SAMN03072809	NA	leaf (Citrus sinensis, SAMN03072809)	175,092,160	86%	23%	140,811
SAMN03754511	NA	pulp, mutant, 170 daf (Citrus sinensis, SAMN03754511)	7,308,482	87%	9%	74,797
SAMN03754512	NA	pulp, mutant, 190 daf (Citrus sinensis, SAMN03754512)	7,176,196	87%	9%	72,523
SAMN03754513	NA	pulp, mutant, 210 daf (Citrus sinensis, SAMN03754513)	7,057,488	87%	9%	70,600
SAMN03754514	NA	pulp, wild type, 170 daf (Citrus sinensis, SAMN03754514)	7,613,995	87%	9%	75,802
SAMN03754515	NA	pulp, wild type, 190 daf (Citrus sinensis, SAMN03754515)	7,406,448	87%	9%	74,761
SAMN03754516	NA	pulp, wild type, 210 daf (Citrus sinensis, SAMN03754516)	7,239,886	87%	9%	71,113
SAMN04687553	NA	endocarps (Citrus sinensis, SAMN04687553)	43,641,132	94%	27%	121,868
SAMN04687556	NA	endocarps (Citrus sinensis, SAMN04687556)	44,042,498	94%	26%	118,990
SAMN04687560	NA	endocarps (Citrus sinensis, SAMN04687560)	43,777,664	94%	26%	121,840
SAMN04687564	NA	endocarps (Citrus sinensis, SAMN04687564)	38,802,440	93%	25%	121,465
SAMN05607900	28337215	seedling; root tips (Citrus sinensis, SAMN05607900)	26,507,746	90%	30%	120,895
SAMN05607901	28337215	seedling; root tips (Citrus sinensis, SAMN05607901)	28,487,340	90%	30%	121,844
SAMN05607902	28337215	seedling; root tips (Citrus sinensis, SAMN05607902)	29,532,684	90%	30%	121,631
SAMN05607907	28337215	seedling; root tips (Citrus sinensis, SAMN05607907)	24,282,724	90%	31%	119,040
SAMN05712341	28394353	embryo (Citrus sinensis, SAMN05712341)	44,233,548	89%	32%	122,906
SAMN05712358	28394353	embryo (Citrus sinensis, SAMN05712358)	50,978,408	90%	33%	127,586
SAMN05712408	28394353	leaf (Citrus sinensis, SAMN05712408)	64,838,894	87%	32%	130,158
SAMN05712606	28394353	leaf (Citrus sinensis, Mature, SAMN05712606)	65,263,236	86%	31%	131,382
SAMN05712968	28394353	seed (Citrus sinensis, SAMN05712968)	86,800,154	88%	31%	132,606
SAMN05715218	28394353	fruit (Citrus sinensis, SAMN05715218)	42,900,300	91%	32%	118,489
SAMN05715222	28394353	fruit (Citrus sinensis, SAMN05715222)	41,732,330	91%	34%	120,276
SAMN05938709	NA	leaf (Citrus sinensis, not collected, SAMN05938709)	23,824,558	70%	35%	110,118
SAMN05938710	NA	leaf (Citrus sinensis, not collected, SAMN05938710)	25,997,762	76%	33%	112,154
SAMN05938711	NA	leaf (Citrus sinensis, not collected, SAMN05938711)	24,544,818	78%	34%	113,039
SAMN05938712	NA	leaf (Citrus sinensis, not collected, SAMN05938712)	27,725,778	77%	34%	114,415
SAMN05938713	NA	leaf (Citrus sinensis, not collected, SAMN05938713)	28,350,704	74%	33%	118,025
SAMN05938714	NA	leaf (Citrus sinensis, not collected, SAMN05938714)	24,249,298	54%	31%	106,135
SAMN05938715	NA	leaf (Citrus sinensis, not collected, SAMN05938715)	26,982,872	75%	35%	116,016
SAMN05938716	NA	leaf (Citrus sinensis, not collected, SAMN05938716)	26,802,898	81%	36%	120,304
SAMN07142569	NA	fruit flesh (Citrus sinensis, SAMN07142569)	74,325,552	88%	36%	119,199
SAMN07142570	NA	fruit flesh (Citrus sinensis, SAMN07142570)	56,289,404	84%	37%	118,398
SAMN07142595	NA	fruit flesh (Citrus sinensis, SAMN07142595)	53,872,726	90%	37%	118,198
SAMN07142597	NA	fruit flesh (Citrus sinensis, SAMN07142597)	55,984,038	88%	36%	116,600
SAMN07142600	NA	fruit flesh (Citrus sinensis, SAMN07142600)	59,262,010	84%	37%	113,441
SAMN07142639	NA	fruit flesh (Citrus sinensis, SAMN07142639)	56,903,316	84%	35%	117,883
SAMN07142641	NA	fruit flesh (Citrus sinensis, SAMN07142641)	46,979,364	81%	37%	113,860
SAMN07142643	NA	fruit flesh (Citrus sinensis, SAMN07142643)	46,002,510	83%	36%	112,220
SAMN07142644	NA	fruit flesh (Citrus sinensis, SAMN07142644)	43,606,340	83%	36%	111,122
SAMN07142645	NA	fruit flesh (Citrus sinensis, SAMN07142645)	48,054,370	84%	37%	113,990
SAMN07142646	NA	fruit flesh (Citrus sinensis, SAMN07142646)	42,167,140	84%	37%	110,541
SAMN07142647	NA	fruit flesh (Citrus sinensis, SAMN07142647)	43,057,656	84%	37%	112,219
SAMN07142655	NA	fruit flesh (Citrus sinensis, SAMN07142655)	43,284,196	83%	36%	116,021
SAMN07142660	NA	fruit flesh (Citrus sinensis, SAMN07142660)	41,922,836	75%	36%	114,039
SAMN07142661	NA	fruit flesh (Citrus sinensis, SAMN07142661)	53,853,824	84%	36%	115,935
SAMN07142713	NA	fruit flesh (Citrus sinensis, SAMN07142713)	41,055,766	83%	37%	112,287
SAMN07142732	NA	fruit flesh (Citrus sinensis, SAMN07142732)	43,002,236	83%	37%	113,907
SAMN07142733	NA	fruit flesh (Citrus sinensis, SAMN07142733)	42,758,698	83%	37%	112,030
SAMN07348246	NA	calyx abscission zone (Citrus sinensis, SAMN07348246)	144,753,212	78%	26%	133,724
SAMN07348247	NA	calyx abscission zone (Citrus sinensis, SAMN07348247)	150,572,022	93%	26%	136,622
SAMN07348248	NA	calyx abscission zone (Citrus sinensis, SAMN07348248)	151,347,232	78%	25%	137,513
SAMN07348249	NA	calyx abscission zone (Citrus sinensis, SAMN07348249)	155,615,172	92%	26%	139,483
SAMN07348250	NA	calyx abscission zone (Citrus sinensis, SAMN07348250)	153,140,808	78%	25%	137,706
SAMN07348251	NA	calyx abscission zone (Citrus sinensis, SAMN07348251)	144,912,308	78%	25%	137,704
SAMN07348252	NA	calyx abscission zone (Citrus sinensis, SAMN07348252)	143,598,034	78%	23%	134,190
SAMN07348253	NA	calyx abscission zone (Citrus sinensis, SAMN07348253)	157,845,172	92%	24%	136,728
SAMN08323509	NA	leaves (Citrus sinensis, SAMN08323509)	52,418,986	88%	35%	124,455
SAMN08323510	NA	leaves (Citrus sinensis, SAMN08323510)	46,369,802	86%	34%	126,332
SAMN08323511	NA	leaves (Citrus sinensis, SAMN08323511)	52,498,158	88%	36%	125,637
SAMN08323512	NA	leaves (Citrus sinensis, SAMN08323512)	53,077,984	88%	35%	125,634
SAMN08326431	NA	Flesh (Citrus sinensis, SAMN08326431)	27,467,716	85%	26%	109,421
SAMN08326432	NA	Flesh (Citrus sinensis, SAMN08326432)	33,796,748	90%	24%	115,573
SAMN08326433	NA	Flesh (Citrus sinensis, SAMN08326433)	19,522,664	87%	24%	107,015
SAMN08326434	NA	Flesh (Citrus sinensis, SAMN08326434)	45,003,960	88%	26%	116,565
SAMN08326435	NA	Flesh (Citrus sinensis, SAMN08326435)	41,452,864	88%	26%	116,549
SAMN08326436	NA	Flesh (Citrus sinensis, SAMN08326436)	41,909,534	88%	25%	115,364
SAMN08326437	NA	Flesh (Citrus sinensis, SAMN08326437)	31,034,604	87%	27%	112,695
SAMN08326438	NA	Flesh (Citrus sinensis, SAMN08326438)	36,911,084	89%	27%	116,150
SAMN08326439	NA	Flesh (Citrus sinensis, SAMN08326439)	40,478,274	89%	27%	115,816
SAMN08326440	NA	Flesh (Citrus sinensis, SAMN08326440)	33,019,090	89%	25%	115,425
SAMN08326441	NA	Flesh (Citrus sinensis, SAMN08326441)	21,638,930	90%	25%	107,228
SAMN08326442	NA	Flesh (Citrus sinensis, SAMN08326442)	25,585,664	87%	25%	107,745
SAMN08326443	NA	Flesh (Citrus sinensis, SAMN08326443)	22,055,818	88%	25%	108,269
SAMN08326444	NA	Flesh (Citrus sinensis, SAMN08326444)	28,456,646	89%	25%	112,887
SAMN08326445	NA	Flesh (Citrus sinensis, SAMN08326445)	17,439,672	89%	26%	104,614
SAMN08326446	NA	Flesh (Citrus sinensis, SAMN08326446)	31,100,776	90%	27%	113,174
SAMN08326447	NA	Flesh (Citrus sinensis, SAMN08326447)	49,726,286	90%	28%	119,448
SAMN08326448	NA	Flesh (Citrus sinensis, SAMN08326448)	22,574,834	91%	28%	107,868
SAMN08326449	NA	Flesh (Citrus sinensis, SAMN08326449)	19,029,164	80%	24%	93,779
SAMN08326450	NA	Flesh (Citrus sinensis, SAMN08326450)	16,902,512	81%	24%	95,785
SAMN08326451	NA	Flesh (Citrus sinensis, SAMN08326451)	25,248,434	87%	24%	101,511
SAMN08326452	NA	Flesh (Citrus sinensis, SAMN08326452)	23,192,208	84%	25%	102,779
SAMN08326453	NA	Flesh (Citrus sinensis, SAMN08326453)	28,799,390	84%	25%	105,118
SAMN08326454	NA	Flesh (Citrus sinensis, SAMN08326454)	21,300,848	84%	26%	102,998
SAMN08326455	NA	Flesh (Citrus sinensis, SAMN08326455)	26,352,062	80%	26%	101,861
SAMN08326456	NA	Flesh (Citrus sinensis, SAMN08326456)	19,112,222	79%	27%	93,130
SAMN08326457	NA	Flesh (Citrus sinensis, SAMN08326457)	18,375,064	84%	26%	97,938

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
ERR760716	ERX705477	ERP005867	SAMEA2612304	116,252,938	87%	19%
SRR867166	SRX286067	SRP022979	SAMN02152588	59,302,836	81%	21%
SRR867394	SRX286482	SRP022979	SAMN02152589	57,847,438	57%	22%
SRR867395	SRX286483	SRP022979	SAMN02152590	58,400,764	81%	23%
SRR867396	SRX286484	SRP022979	SAMN02152591	61,870,162	83%	22%
SRR867397	SRX286485	SRP022979	SAMN02152592	42,448,392	76%	20%
SRR867398	SRX286486	SRP022979	SAMN02152593	24,821,354	76%	21%
SRR867421	SRX286487	SRP022979	SAMN02152594	31,741,484	78%	20%
SRR867423	SRX286499	SRP022979	SAMN02152595	31,250,608	72%	20%
SRR867425	SRX286500	SRP022979	SAMN02152596	72,427,648	77%	21%
SRR867426	SRX286501	SRP022979	SAMN02152597	64,574,052	86%	22%
SRR867427	SRX286502	SRP022979	SAMN02152598	72,069,278	84%	22%
SRR867431	SRX286506	SRP022979	SAMN02152599	46,379,294	82%	20%
SRR867435	SRX286507	SRP022979	SAMN02152600	63,975,320	85%	21%
SRR867442	SRX286508	SRP022979	SAMN02152601	34,207,362	82%	21%
SRR867443	SRX286508	SRP022979	SAMN02152601	34,310,026	84%	21%
SRR867446	SRX286509	SRP022979	SAMN02152602	67,160,276	63%	19%
SRR867449	SRX286510	SRP022979	SAMN02152603	65,826,060	72%	19%
SRR1581242	SRX706564	SRP047324	SAMN03072809	41,675,618	85%	23%
SRR1581243	SRX706564	SRP047324	SAMN03072809	44,790,592	85%	23%
SRR1581244	SRX706564	SRP047324	SAMN03072809	39,411,752	84%	23%
SRR1581245	SRX706564	SRP047324	SAMN03072809	49,214,198	88%	24%
SRR2046811	SRX1045026	SRP058943	SAMN03754511	7,308,482	87%	9%
SRR2046812	SRX1045027	SRP058943	SAMN03754512	7,176,196	87%	9%
SRR2046813	SRX1045028	SRP058943	SAMN03754513	7,057,488	87%	9%
SRR2046814	SRX1045029	SRP058943	SAMN03754514	7,613,995	87%	9%
SRR2046815	SRX1045030	SRP058943	SAMN03754515	7,406,448	87%	9%
SRR2046816	SRX1045031	SRP058943	SAMN03754516	7,239,886	87%	9%
SRR3370958	SRX1700580	SRP073205	SAMN04687553	21,820,566	94%	27%
SRR3370959	SRX1700580	SRP073205	SAMN04687553	21,820,566	94%	27%
SRR3370960	SRX1700581	SRP073205	SAMN04687556	22,021,249	94%	26%
SRR3370961	SRX1700581	SRP073205	SAMN04687556	22,021,249	94%	26%
SRR3370962	SRX1700582	SRP073205	SAMN04687560	21,888,832	94%	26%
SRR3370963	SRX1700582	SRP073205	SAMN04687560	21,888,832	94%	26%
SRR3370964	SRX1700583	SRP073205	SAMN04687564	19,401,220	93%	25%
SRR3370965	SRX1700583	SRP073205	SAMN04687564	19,401,220	93%	25%
SRR4050294	SRX2039875	SRP082577	SAMN05607900	26,507,746	90%	30%
SRR4050293	SRX2039874	SRP082577	SAMN05607901	28,487,340	90%	30%
SRR4050292	SRX2039873	SRP082577	SAMN05607902	29,532,684	90%	30%
SRR4050295	SRX2039876	SRP082577	SAMN05607907	24,282,724	90%	31%
SRR4072700	SRX2058587	SRP083096	SAMN05712341	44,233,548	89%	32%
SRR4076115	SRX2059197	SRP083096	SAMN05712358	50,978,408	90%	33%
SRR4076705	SRX2059236	SRP083096	SAMN05712408	64,838,894	87%	32%
SRR4084239	SRX2059966	SRP083096	SAMN05712606	65,263,236	86%	31%
SRR4089854	SRX2060764	SRP083096	SAMN05712968	44,029,170	88%	31%
SRR4294693	SRX2190140	SRP083096	SAMN05712968	42,770,984	88%	31%
SRR4096760	SRX2066141	SRP083096	SAMN05715218	42,900,300	91%	32%
SRR4096959	SRX2066235	SRP083096	SAMN05715222	41,732,330	91%	34%
SRR4897538	SRX2326796	SRP092572	SAMN05938709	23,824,558	70%	35%
SRR4897532	SRX2326790	SRP092572	SAMN05938710	25,997,762	76%	33%
SRR4897539	SRX2326797	SRP092572	SAMN05938711	24,544,818	78%	34%
SRR4897535	SRX2326793	SRP092572	SAMN05938712	27,725,778	77%	34%
SRR4897537	SRX2326795	SRP092572	SAMN05938713	28,350,704	74%	33%
SRR4897536	SRX2326794	SRP092572	SAMN05938714	24,249,298	54%	31%
SRR4897533	SRX2326791	SRP092572	SAMN05938715	26,982,872	75%	35%
SRR4897534	SRX2326792	SRP092572	SAMN05938716	26,802,898	81%	36%
SRR5581503	SRX2839612	SRP107708	SAMN07142569	74,325,552	88%	36%
SRR5581502	SRX2839613	SRP107708	SAMN07142570	56,289,404	84%	37%
SRR5581501	SRX2839614	SRP107708	SAMN07142595	53,872,726	90%	37%
SRR5581500	SRX2839615	SRP107708	SAMN07142597	55,984,038	88%	36%
SRR5581499	SRX2839616	SRP107708	SAMN07142600	59,262,010	84%	37%
SRR5581498	SRX2839617	SRP107708	SAMN07142639	56,903,316	84%	35%
SRR5581497	SRX2839618	SRP107708	SAMN07142641	46,979,364	81%	37%
SRR5581496	SRX2839619	SRP107708	SAMN07142643	46,002,510	83%	36%
SRR5581505	SRX2839610	SRP107708	SAMN07142644	43,606,340	83%	36%
SRR5581504	SRX2839611	SRP107708	SAMN07142645	48,054,370	84%	37%
SRR5581511	SRX2839604	SRP107708	SAMN07142646	42,167,140	84%	37%
SRR5581510	SRX2839605	SRP107708	SAMN07142647	43,057,656	84%	37%
SRR5581513	SRX2839602	SRP107708	SAMN07142655	43,284,196	83%	36%
SRR5581512	SRX2839603	SRP107708	SAMN07142660	41,922,836	75%	36%
SRR5581507	SRX2839608	SRP107708	SAMN07142661	53,853,824	84%	36%
SRR5581506	SRX2839609	SRP107708	SAMN07142713	41,055,766	83%	37%
SRR5581509	SRX2839606	SRP107708	SAMN07142732	43,002,236	83%	37%
SRR5581508	SRX2839607	SRP107708	SAMN07142733	42,758,698	83%	37%
SRR5821820	SRX2999682	SRP111785	SAMN07348246	72,694,166	78%	26%
SRR5821821	SRX2999682	SRP111785	SAMN07348246	72,059,046	78%	26%
SRR5821818	SRX2999681	SRP111785	SAMN07348247	74,799,512	93%	26%
SRR5821819	SRX2999681	SRP111785	SAMN07348247	75,772,510	93%	26%
SRR5821816	SRX2999680	SRP111785	SAMN07348248	75,999,914	78%	25%
SRR5821817	SRX2999680	SRP111785	SAMN07348248	75,347,318	78%	25%
SRR5821814	SRX2999679	SRP111785	SAMN07348249	77,098,848	92%	26%
SRR5821815	SRX2999679	SRP111785	SAMN07348249	78,516,324	92%	26%
SRR5821812	SRX2999678	SRP111785	SAMN07348250	76,908,042	78%	25%
SRR5821813	SRX2999678	SRP111785	SAMN07348250	76,232,766	78%	25%
SRR5821810	SRX2999677	SRP111785	SAMN07348251	72,768,744	78%	25%
SRR5821811	SRX2999677	SRP111785	SAMN07348251	72,143,564	78%	25%
SRR5821808	SRX2999676	SRP111785	SAMN07348252	71,987,334	78%	24%
SRR5821809	SRX2999676	SRP111785	SAMN07348252	71,610,700	78%	23%
SRR5821806	SRX2999675	SRP111785	SAMN07348253	78,150,522	92%	24%
SRR5821807	SRX2999675	SRP111785	SAMN07348253	79,694,650	92%	24%
SRR6448482	SRX3539449	SRP128320	SAMN08323509	52,418,986	88%	35%
SRR6448481	SRX3539448	SRP128320	SAMN08323510	46,369,802	86%	34%
SRR6448480	SRX3539447	SRP128320	SAMN08323511	52,498,158	88%	36%
SRR6448479	SRX3539446	SRP128320	SAMN08323512	53,077,984	88%	35%
SRR6451222	SRX3542123	SRP128621	SAMN08326431	27,467,716	85%	26%
SRR6451225	SRX3542124	SRP128621	SAMN08326432	33,796,748	90%	24%
SRR6451221	SRX3542125	SRP128621	SAMN08326433	19,522,664	87%	24%
SRR6451220	SRX3542126	SRP128621	SAMN08326434	45,003,960	88%	26%
SRR6451227	SRX3542119	SRP128621	SAMN08326435	41,452,864	88%	26%
SRR6451226	SRX3542120	SRP128621	SAMN08326436	41,909,534	88%	25%
SRR6451224	SRX3542121	SRP128621	SAMN08326437	31,034,604	87%	27%
SRR6451223	SRX3542122	SRP128621	SAMN08326438	36,911,084	89%	27%
SRR6451219	SRX3542127	SRP128621	SAMN08326439	40,478,274	89%	27%
SRR6451218	SRX3542128	SRP128621	SAMN08326440	33,019,090	89%	25%
SRR6451230	SRX3542115	SRP128621	SAMN08326441	21,638,930	90%	25%
SRR6451233	SRX3542116	SRP128621	SAMN08326442	25,585,664	87%	25%
SRR6451232	SRX3542113	SRP128621	SAMN08326443	22,055,818	88%	25%
SRR6451231	SRX3542114	SRP128621	SAMN08326444	28,456,646	89%	25%
SRR6451235	SRX3542111	SRP128621	SAMN08326445	17,439,672	89%	26%
SRR6451234	SRX3542112	SRP128621	SAMN08326446	31,100,776	90%	27%
SRR6451237	SRX3542109	SRP128621	SAMN08326447	49,726,286	90%	28%
SRR6451236	SRX3542110	SRP128621	SAMN08326448	22,574,834	91%	28%
SRR6451229	SRX3542117	SRP128621	SAMN08326449	19,029,164	80%	24%
SRR6451228	SRX3542118	SRP128621	SAMN08326450	16,902,512	81%	24%
SRR6451217	SRX3542132	SRP128621	SAMN08326451	25,248,434	87%	24%
SRR6451214	SRX3542131	SRP128621	SAMN08326452	23,192,208	84%	25%
SRR6451215	SRX3542130	SRP128621	SAMN08326453	28,799,390	84%	25%
SRR6451216	SRX3542129	SRP128621	SAMN08326454	21,300,848	84%	26%
SRR6451211	SRX3542135	SRP128621	SAMN08326455	26,352,062	80%	26%
SRR6451212	SRX3542134	SRP128621	SAMN08326456	19,112,222	79%	27%
SRR6451213	SRX3542133	SRP128621	SAMN08326457	18,375,064	84%	26%
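
The by-sample and by-run tables are related: where a sample has several runs, its read count in the by-sample table is the sum of its runs. The sketch below checks this for three multi-run samples, with read counts copied from the two tables; the names are illustrative only.

```python
from collections import defaultdict

# (run, sample, number of reads), copied from the by-run table.
runs = [
    ("SRR867442", "SAMN02152601", 34_207_362),
    ("SRR867443", "SAMN02152601", 34_310_026),
    ("SRR1581242", "SAMN03072809", 41_675_618),
    ("SRR1581243", "SAMN03072809", 44_790_592),
    ("SRR1581244", "SAMN03072809", 39_411_752),
    ("SRR1581245", "SAMN03072809", 49_214_198),
    ("SRR4089854", "SAMN05712968", 44_029_170),
    ("SRR4294693", "SAMN05712968", 42_770_984),
]

# Read counts for the same samples, copied from the by-sample table.
reported_by_sample = {
    "SAMN02152601": 68_517_388,
    "SAMN03072809": 175_092_160,
    "SAMN05712968": 86_800_154,
}


def reads_per_sample(run_rows):
    """Sum the read counts of all runs that belong to the same sample."""
    totals = defaultdict(int)
    for _run, sample, reads in run_rows:
        totals[sample] += reads
    return dict(totals)


totals = reads_per_sample(runs)
for sample, reads in totals.items():
    print(sample, reads, reads == reported_by_sample[sample])
```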

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Citrus GenBank	745	737 (98.93%)	737 (98.93%)	78.83%	88.43%
Same-species GenBank	369	358 (97.02%)	358 (97.02%)	78.33%	87.93%
Same-species known RefSeq (NP_)	129	128 (99.22%)	128 (99.22%)	75.50%	83.33%
Arabidopsis thaliana known RefSeq (NP_)	48,148	42,633 (88.55%)	42,633 (88.55%)	66.51%	71.05%

Comparison of the current and previous annotations

The annotation produced for this release (102) was compared to the annotation in the previous release (101) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.
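
To make the two ingredients of the comparison concrete, the sketch below computes the exon-sequence overlap and the number of matching exon boundaries for a pair of features. It illustrates the idea only: the report does not give the scoring formula, so the functions, the coordinate convention (1-based, inclusive) and the example exons are all assumptions made here.

```python
def exon_overlap_bases(exons_a, exons_b):
    """Count bases shared by two features, each given as a list of (start, end) exons.

    Assumes the exons within one feature do not overlap each other.
    """
    shared = 0
    for a_start, a_end in exons_a:
        for b_start, b_end in exons_b:
            overlap = min(a_end, b_end) - max(a_start, b_start) + 1
            if overlap > 0:
                shared += overlap
    return shared


def shared_boundaries(exons_a, exons_b):
    """Count exon start and end coordinates that occur in both features."""
    def boundaries(exons):
        return {("start", s) for s, _ in exons} | {("end", e) for _, e in exons}
    return len(boundaries(exons_a) & boundaries(exons_b))


# Hypothetical transcript in which the second exon was extended by 20 bases.
previous = [(100, 200), (300, 400), (500, 600)]
current = [(100, 200), (300, 420), (500, 600)]

current_length = sum(end - start + 1 for start, end in current)
overlap = exon_overlap_bases(current, previous)
print(f"{overlap} of {current_length} current exon bases overlap the previous model")
print(f"{shared_boundaries(current, previous)} of {2 * len(current)} boundaries match")
```

A pair like this one, with most of the exon sequence and boundaries shared, is the kind of case that a comparison might class as a small change rather than as a new or deprecated feature.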

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Csi_valencia_1.0 (Current) to Csi_valencia_1.0 (Previous)
Identical	9%
Minor changes	66%
Major changes	11%
New	12%
Deprecated	7%
Other	2%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
