NCBI Nicotiana tomentosiformis Annotation Release 102

The RefSeq genome records for Nicotiana tomentosiformis were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: The similarity of the current and previous assembly
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Nicotiana tomentosiformis Annotation Release 102.

Annotation release ID: 102
Date of Entrez queries for transcripts and proteins: Apr 15 2020
Date of submission of annotation to the public databases: Apr 21 2020
Software version: 8.4

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Ntom_v01	GCF_000390325.2	Philip Morris International R&D	05-16-2013	Reference	1 assembled chromosome; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	Ntom_v01
Genes and pseudogenes	45,485
protein-coding	31,842
non-coding	11,549
transcribed pseudogenes	1
non-transcribed pseudogenes	2,093
genes with variants	10,545
immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	50,010
fully-supported	43,595
with > 5% ab initio	5,796
partial	1,064
with filled gap(s)	3
known RefSeq (NM_)	18
model RefSeq (XM_)	49,992
non-coding RNAs	17,677
fully-supported	14,932
with > 5% ab initio	0
partial	1
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	16,867
pseudo transcripts	1
fully-supported	1
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	1
CDSs	50,112
fully-supported	43,595
with > 5% ab initio	5,885
partial	1,063
with major correction(s)	148
known RefSeq (NP_)	18
model RefSeq (XP_)	50,094

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	43,391	4,856	2,967	58	96,634
All transcripts	67,687	1,657	1,414	58	16,014
mRNA	50,010	1,864	1,604	141	16,014
misc_RNA	3,736	2,128	1,865	105	8,831
tRNA	801	74	73	68	92
lncRNA	11,196	952	581	69	11,670
snoRNA	1,460	107	107	58	227
snRNA	264	136	117	98	202
rRNA	219	267	119	103	3,519
Single-exon transcripts	5,012	1,104	873	141	10,725
coding transcripts (NM_/XM_ )	5,012	1,104	873	141	10,725
CDSs	50,112	1,315	1,071	90	15,312
Exons	221,079	324	170	2	10,725
in coding transcripts (NM_/XM_ )	187,634	326	174	2	10,725
in non-coding transcripts (NR_/XR_ )	42,203	297	136	2	10,652
Introns	169,634	1,058	321	30	89,643
in coding transcripts (NM_/XM_ )	148,013	822	289	30	89,643
in non-coding transcripts (NR_/XR_ )	29,955	2,158	559	30	87,128

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.57	1	1	50
Number of exons per transcript	5.62	4	1	79

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Arabidopsis thaliana known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 31,740 coding genes, 25,881 genes had a protein with an alignment covering 50% or more of the query, and 11,730 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
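As an illustration of the two definitions above, the following minimal sketch computes query and target coverage for a single alignment. The function name, the 1-based inclusive coordinates and the example lengths are assumptions made for this illustration; they are not taken from the pipeline.

```python
def coverage_percent(aligned_start: int, aligned_end: int, sequence_length: int) -> float:
    """Percent of a sequence included in one alignment.

    Coordinates are assumed to be 1-based and inclusive (an assumption of
    this sketch, not a statement about the pipeline's internal format).
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


# Hypothetical alignment: residues 11-410 of a 500-aa annotated protein (query)
# aligned to residues 1-400 of a 420-aa reference protein (target).
query_cov = coverage_percent(11, 410, 500)
target_cov = coverage_percent(1, 400, 420)

print(f"query coverage:  {query_cov:.1f}%")
print(f"target coverage: {target_cov:.1f}%")
```

In this hypothetical case the alignment would count toward the "50% or more of the query" tally but not toward the "95% or more of the query" tally.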

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Arabidopsis thaliana known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Ntom_v01	GCF_000390325.2	1.49%	53.07%
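To make the masked percentage concrete, here is a minimal sketch that computes it from a soft-masked sequence, where masked bases are written in lowercase. The soft-masking convention, the function name and the toy sequence are assumptions of this example; the report does not state how the pipeline stores or counts masked bases.

```python
def masked_percent(sequence: str) -> float:
    """Percent of bases that are soft-masked (lowercase) in a sequence.

    Non-letter characters such as whitespace are ignored. Whether ambiguous
    bases (N) belong in the denominator is left open here; this sketch
    counts every letter.
    """
    letters = [c for c in sequence if c.isalpha()]
    if not letters:
        return 0.0
    masked = sum(1 for c in letters if c.islower())
    return 100.0 * masked / len(letters)


# Toy sequence: 10 bases, 4 of them lowercase (masked).
toy = "ACGTacgtAC"
print(f"{masked_percent(toy):.2f}% masked")
```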

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	18	18 (100.00%)	18 (100.00%)	99.87%	94.63%
Same-species Genbank	43	43 (100.00%)	41 (95.35%)	99.83%	92.35%
Same-species EST	10,437	10,103 (96.80%)	9,760 (93.51%)	99.71%	99.61%
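The percentages in the table above appear to be expressed relative to the number of sequences retrieved from Entrez. The short sketch below reproduces the Same-species EST and Same-species Genbank rows under that assumption; the helper name is hypothetical.

```python
def percent_of_retrieved(count: int, retrieved: int) -> str:
    """Format a count as a percentage of the sequences retrieved from Entrez."""
    return f"{100.0 * count / retrieved:.2f}%"


# Same-species EST row: retrieved, aligned by Splign, passed to Gnomon.
est_retrieved, est_aligned, est_passed = 10_437, 10_103, 9_760
print(percent_of_retrieved(est_aligned, est_retrieved))
print(percent_of_retrieved(est_passed, est_retrieved))

# Same-species Genbank row: 43 retrieved, 41 passed to Gnomon.
print(percent_of_retrieved(41, 43))
```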

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	5,725,442,558	95%	24%	360,344
SAMEA1904896	NA	leaf (Nicotiana tomentosiformis, SAMEA1904896)	142,479,598	95%	25%	158,751
SAMEA1904903	NA	root (Nicotiana tomentosiformis, SAMEA1904903)	212,731,912	93%	25%	168,989
SAMEA1904916	NA	root (Nicotiana tomentosiformis, SAMEA1904916)	210,025,788	94%	26%	168,503
SAMEA1904929	NA	flower (Nicotiana tomentosiformis, SAMEA1904929)	206,891,652	96%	25%	182,209
SAMEA1904940	NA	flower (Nicotiana tomentosiformis, SAMEA1904940)	128,117,186	96%	24%	171,744
SAMEA1904945	NA	leaf (Nicotiana tomentosiformis, SAMEA1904945)	167,391,118	95%	24%	164,364
SAMEA1904950	NA	flower (Nicotiana tomentosiformis, SAMEA1904950)	167,679,480	192%	26%	187,884
SAMEA1904961	NA	root (Nicotiana tomentosiformis, SAMEA1904961)	168,099,458	93%	25%	164,125
SAMEA1904966	NA	leaf (Nicotiana tomentosiformis, SAMEA1904966)	191,495,418	95%	24%	164,274
SAMEA1904968	NA	flower (Nicotiana tomentosiformis, SAMEA1904968)	142,802,043	192%	26%	186,598
SAMN02316609	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Leaf (Nicotiana tabacum, SAMN02316609)	109,139,490	88%	25%	197,725
SAMN02316610	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Leaf (Nicotiana tabacum, SAMN02316610)	164,957,186	90%	22%	208,552
SAMN02316611	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Leaf (Nicotiana tabacum, SAMN02316611)	119,836,094	90%	25%	200,166
SAMN02316612	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02316612)	98,714,710	89%	24%	193,436
SAMN02316613	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02316613)	91,788,176	88%	22%	199,488
SAMN02316614	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02316614)	104,367,224	88%	24%	196,904
SAMN02645674	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Immature Flower (Nicotiana tabacum, SAMN02645674)	97,903,872	91%	26%	206,725
SAMN02645675	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Flower (Nicotiana tabacum, SAMN02645675)	100,846,380	91%	24%	205,269
SAMN02645676	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Senescent Flower (Nicotiana tabacum, SAMN02645676)	57,299,122	92%	21%	171,270
SAMN02645677	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Dry Capsule (Nicotiana tabacum, SAMN02645677)	43,539,758	89%	23%	152,739
SAMN02645678	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Stem (Nicotiana tabacum, SAMN02645678)	52,035,046	90%	22%	162,048
SAMN02645679	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02645679)	77,425,142	87%	23%	179,546
SAMN02645680	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Young Leaf (Nicotiana tabacum, SAMN02645680)	54,262,284	92%	26%	167,971
SAMN02645681	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Leaf (Nicotiana tabacum, SAMN02645681)	69,984,574	91%	25%	177,069
SAMN02645682	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Senescent Leaf (Nicotiana tabacum, SAMN02645682)	80,142,054	90%	25%	186,058
SAMN02645683	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Immature Flower (Nicotiana tabacum, SAMN02645683)	92,768,572	91%	26%	202,189
SAMN02645684	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Flower (Nicotiana tabacum, SAMN02645684)	86,486,460	91%	24%	195,826
SAMN02645685	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Senescent Flower (Nicotiana tabacum, SAMN02645685)	46,017,156	91%	22%	162,326
SAMN02645686	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Dry Capsule (Nicotiana tabacum, SAMN02645686)	57,343,858	89%	23%	163,034
SAMN02645687	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Stem (Nicotiana tabacum, SAMN02645687)	54,634,530	89%	21%	170,553
SAMN02645688	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02645688)	23,162,812	86%	24%	149,837
SAMN02645689	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Young Leaf (Nicotiana tabacum, SAMN02645689)	69,627,388	90%	25%	175,107
SAMN02645690	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Leaf (Nicotiana tabacum, SAMN02645690)	57,260,944	91%	26%	172,089
SAMN02645691	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Senescent Leaf (Nicotiana tabacum, SAMN02645691)	79,186,800	91%	25%	182,063
SAMN02645692	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Immature Flower (Nicotiana tabacum, SAMN02645692)	81,592,536	92%	27%	191,271
SAMN02645693	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Flower (Nicotiana tabacum, SAMN02645693)	88,309,184	91%	24%	201,247
SAMN02645694	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Dry Capsule (Nicotiana tabacum, SAMN02645694)	43,118,862	89%	22%	149,108
SAMN02645695	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Stem (Nicotiana tabacum, SAMN02645695)	86,850,900	90%	23%	191,608
SAMN02645696	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Root (Nicotiana tabacum, SAMN02645696)	86,243,630	86%	21%	188,917
SAMN02645697	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Young Leaf (Nicotiana tabacum, SAMN02645697)	73,650,890	92%	26%	177,767
SAMN02645698	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Mature Leaf (Nicotiana tabacum, SAMN02645698)	76,538,320	91%	26%	180,706
SAMN02645699	24807620,1708498,2057370,8845963,15583938,1337369,8252646,9278490,16453699	Senescent Leaf (Nicotiana tabacum, SAMN02645699)	93,215,816	91%	24%	177,565
SAMN03879960	NA	leaf (Nicotiana tomentosiformis, 30-day-old, SAMN03879960)	134,917,976	96%	19%	146,751
SAMN08281073	NA	flowers (Nicotiana tabacum, SAMN08281073)	285,360,192	90%	29%	209,801
SAMN08285105	NA	60% flowers (Nicotiana tomentosiformis, SAMN08285105)	16,784,471	88%	20%	111,183
SAMN08285106	NA	60% flowers (Nicotiana tomentosiformis, SAMN08285106)	13,843,866	84%	20%	102,450
SAMN08285107	NA	60% flowers (Nicotiana tomentosiformis, SAMN08285107)	16,162,913	88%	20%	107,818
SAMN08285108	NA	85% flowers (Nicotiana tomentosiformis, SAMN08285108)	17,079,089	88%	21%	106,800
SAMN08285109	NA	85% flowers (Nicotiana tomentosiformis, SAMN08285109)	21,693,120	95%	21%	118,308
SAMN08285110	NA	95% flowers (Nicotiana tomentosiformis, SAMN08285110)	21,448,536	83%	18%	110,239
SAMN08285111	NA	95% flowers (Nicotiana tomentosiformis, SAMN08285111)	18,826,465	81%	18%	108,113
SAMN08285112	NA	95% flowers (Nicotiana tomentosiformis, SAMN08285112)	20,351,707	81%	19%	112,249
SAMN08285113	NA	60% flowers (Nicotiana tabacum, SAMN08285113)	18,945,747	76%	20%	116,279
SAMN08285114	NA	60% flowers (Nicotiana tabacum, SAMN08285114)	11,450,208	83%	22%	103,544
SAMN08285115	NA	60% flowers (Nicotiana tabacum, SAMN08285115)	21,439,579	83%	20%	121,624
SAMN08285116	NA	85% flowers (Nicotiana tabacum, SAMN08285116)	14,579,319	81%	21%	108,709
SAMN08285117	NA	85% flowers (Nicotiana tabacum, SAMN08285117)	16,902,887	80%	20%	113,662
SAMN08285118	NA	85% flowers (Nicotiana tabacum, SAMN08285118)	19,515,286	81%	20%	120,841
SAMN08285119	NA	95% flowers (Nicotiana tabacum, SAMN08285119)	15,165,172	84%	20%	119,493
SAMN08285120	NA	95% flowers (Nicotiana tabacum, SAMN08285120)	21,329,797	72%	19%	117,770
SAMN08285121	NA	95% flowers (Nicotiana tabacum, SAMN08285121)	19,902,583	78%	19%	115,989
SAMN08285122	NA	60% flowers (Nicotiana tabacum, SAMN08285122)	12,842,067	79%	21%	102,511
SAMN08285123	NA	60% flowers (Nicotiana tabacum, SAMN08285123)	16,956,788	78%	20%	115,501
SAMN08285124	NA	60% flowers (Nicotiana tabacum, SAMN08285124)	15,264,340	81%	20%	108,016
SAMN08285125	NA	85% flowers (Nicotiana tabacum, SAMN08285125)	15,798,016	82%	20%	108,332
SAMN08285126	NA	85% flowers (Nicotiana tabacum, SAMN08285126)	16,991,815	86%	20%	119,472
SAMN08285127	NA	85% flowers (Nicotiana tabacum, SAMN08285127)	21,906,395	87%	21%	119,574
SAMN08285128	NA	95% flowers (Nicotiana tabacum, SAMN08285128)	15,338,614	74%	20%	107,848
SAMN08285129	NA	95% flowers (Nicotiana tabacum, SAMN08285129)	25,279,488	73%	19%	120,636
SAMN08285130	NA	95% flowers (Nicotiana tabacum, SAMN08285130)	16,140,868	81%	19%	110,749
SAMN08285131	NA	60% flowers (Nicotiana tabacum, SAMN08285131)	16,231,623	76%	19%	115,804
SAMN08285132	NA	60% flowers (Nicotiana tabacum, SAMN08285132)	27,528,794	80%	19%	126,919
SAMN08285133	NA	85% flowers (Nicotiana tabacum, SAMN08285133)	30,995,771	76%	19%	127,665
SAMN08285134	NA	85% flowers (Nicotiana tabacum, SAMN08285134)	27,982,340	78%	19%	126,304
SAMN08285135	NA	85% flowers (Nicotiana tabacum, SAMN08285135)	15,726,870	57%	17%	97,064
SAMN08285136	NA	95% flowers (Nicotiana tabacum, SAMN08285136)	18,354,630	77%	19%	115,147
SAMN08285137	NA	95% flowers (Nicotiana tabacum, SAMN08285137)	21,428,016	75%	19%	119,485
SAMN08285138	NA	60% flowers (Nicotiana tabacum, SAMN08285138)	16,108,049	78%	19%	111,496
SAMN08285139	NA	60% flowers (Nicotiana tabacum, SAMN08285139)	21,387,516	76%	19%	113,538
SAMN08285140	NA	60% flowers (Nicotiana tabacum, SAMN08285140)	14,523,144	66%	18%	91,158
SAMN08285141	NA	85% flowers (Nicotiana tabacum, SAMN08285141)	15,559,239	71%	18%	105,324
SAMN08285142	NA	85% flowers (Nicotiana tabacum, SAMN08285142)	12,290,994	74%	18%	101,539
SAMN08285143	NA	85% flowers (Nicotiana tabacum, SAMN08285143)	16,017,697	63%	19%	102,347
SAMN08285144	NA	95% flowers (Nicotiana tabacum, SAMN08285144)	17,438,692	67%	18%	102,600
SAMN08285145	NA	95% flowers (Nicotiana tabacum, SAMN08285145)	12,220,662	74%	19%	105,854
SAMN08285146	NA	95% flowers (Nicotiana tabacum, SAMN08285146)	18,538,040	77%	18%	111,486
SAMN08285147	NA	flowers (Nicotiana tabacum, SAMN08285147)	284,929,754	90%	30%	209,012

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
ERR274400	ERX248676	ERP002502	SAMEA1904896	142,479,598	95%	25%
ERR274404	ERX248680	ERP002502	SAMEA1904903	212,731,912	93%	25%
ERR274405	ERX248681	ERP002502	SAMEA1904916	210,025,788	94%	26%
ERR274396	ERX248672	ERP002502	SAMEA1904929	206,891,652	96%	25%
ERR274399	ERX248675	ERP002502	SAMEA1904940	128,117,186	96%	24%
ERR274401	ERX248677	ERP002502	SAMEA1904945	167,391,118	95%	24%
ERR274398	ERX248674	ERP002502	SAMEA1904950	167,679,480	192%	26%
ERR274403	ERX248679	ERP002502	SAMEA1904961	168,099,458	93%	25%
ERR274402	ERX248678	ERP002502	SAMEA1904966	191,495,418	95%	24%
ERR274397	ERX248673	ERP002502	SAMEA1904968	142,802,043	192%	26%
SRR955761	SRX338101	SRP029183	SAMN02316609	109,139,490	88%	25%
SRR955762	SRX338102	SRP029183	SAMN02316610	164,957,186	90%	22%
SRR955763	SRX338103	SRP029183	SAMN02316611	119,836,094	90%	25%
SRR955765	SRX338104	SRP029183	SAMN02316612	98,714,710	89%	24%
SRR955766	SRX338105	SRP029183	SAMN02316613	91,788,176	88%	22%
SRR955767	SRX338106	SRP029183	SAMN02316614	104,367,224	88%	24%
SRR1199197	SRX495602	SRP029183	SAMN02645674	97,903,872	91%	26%
SRR1199069	SRX495520	SRP029183	SAMN02645675	100,846,380	91%	24%
SRR1199124	SRX495530	SRP029183	SAMN02645676	57,299,122	92%	21%
SRR1199063	SRX495517	SRP029183	SAMN02645677	43,539,758	89%	23%
SRR1199130	SRX495598	SRP029183	SAMN02645678	52,035,046	90%	22%
SRR1199121	SRX495526	SRP029183	SAMN02645679	77,425,142	87%	23%
SRR1199200	SRX495606	SRP029183	SAMN02645680	54,262,284	92%	26%
SRR1199072	SRX495523	SRP029183	SAMN02645681	69,984,574	91%	25%
SRR1199127	SRX495532	SRP029183	SAMN02645682	80,142,054	90%	25%
SRR1199198	SRX495603	SRP029183	SAMN02645683	92,768,572	91%	26%
SRR1199070	SRX495521	SRP029183	SAMN02645684	86,486,460	91%	24%
SRR1199125	SRX495531	SRP029183	SAMN02645685	46,017,156	91%	22%
SRR1199066	SRX495518	SRP029183	SAMN02645686	57,343,858	89%	23%
SRR1199132	SRX495600	SRP029183	SAMN02645687	54,634,530	89%	21%
SRR1199122	SRX495527	SRP029183	SAMN02645688	23,162,812	86%	24%
SRR1199202	SRX495607	SRP029183	SAMN02645689	69,627,388	90%	25%
SRR1199073	SRX495524	SRP029183	SAMN02645690	57,260,944	91%	26%
SRR1199128	SRX495534	SRP029183	SAMN02645691	79,186,800	91%	25%
SRR1199199	SRX495605	SRP029183	SAMN02645692	81,592,536	92%	27%
SRR1199071	SRX495522	SRP029183	SAMN02645693	88,309,184	91%	24%
SRR1199068	SRX495519	SRP029183	SAMN02645694	43,118,862	89%	22%
SRR1199135	SRX495601	SRP029183	SAMN02645695	86,850,900	90%	23%
SRR1199123	SRX495529	SRP029183	SAMN02645696	86,243,630	86%	21%
SRR1199203	SRX495608	SRP029183	SAMN02645697	73,650,890	92%	26%
SRR1199074	SRX495525	SRP029183	SAMN02645698	76,538,320	91%	26%
SRR1199129	SRX495535	SRP029183	SAMN02645699	93,215,816	91%	24%
SRR2106531	SRX1100400	SRP061277	SAMN03879960	134,917,976	96%	19%
SRR6435699	SRX3527190	SRP127804	SAMN08281073	103,017,144	91%	29%
SRR6435698	SRX3527191	SRP127804	SAMN08281073	84,795,094	90%	30%
SRR6435696	SRX3527193	SRP127804	SAMN08281073	97,547,954	90%	29%
SRR6434940	SRX3526512	SRP127804	SAMN08285105	16,784,471	88%	20%
SRR6434958	SRX3526494	SRP127804	SAMN08285106	13,843,866	84%	20%
SRR6434959	SRX3526493	SRP127804	SAMN08285107	16,162,913	88%	20%
SRR6434956	SRX3526496	SRP127804	SAMN08285108	17,079,089	88%	21%
SRR6434957	SRX3526495	SRP127804	SAMN08285109	21,693,120	95%	21%
SRR6434954	SRX3526498	SRP127804	SAMN08285110	21,448,536	83%	18%
SRR6434955	SRX3526497	SRP127804	SAMN08285111	18,826,465	81%	18%
SRR6434952	SRX3526500	SRP127804	SAMN08285112	20,351,707	81%	19%
SRR6434953	SRX3526499	SRP127804	SAMN08285113	18,945,747	76%	20%
SRR6434966	SRX3526486	SRP127804	SAMN08285114	11,450,208	83%	22%
SRR6434967	SRX3526485	SRP127804	SAMN08285115	21,439,579	83%	20%
SRR6434938	SRX3526514	SRP127804	SAMN08285116	14,579,319	81%	21%
SRR6434937	SRX3526515	SRP127804	SAMN08285117	16,902,887	80%	20%
SRR6434936	SRX3526516	SRP127804	SAMN08285118	19,515,286	81%	20%
SRR6434935	SRX3526517	SRP127804	SAMN08285119	15,165,172	84%	20%
SRR6434934	SRX3526518	SRP127804	SAMN08285120	21,329,797	72%	19%
SRR6434933	SRX3526519	SRP127804	SAMN08285121	19,902,583	78%	19%
SRR6434932	SRX3526520	SRP127804	SAMN08285122	12,842,067	79%	21%
SRR6434931	SRX3526521	SRP127804	SAMN08285123	16,956,788	78%	20%
SRR6434930	SRX3526522	SRP127804	SAMN08285124	15,264,340	81%	20%
SRR6434929	SRX3526523	SRP127804	SAMN08285125	15,798,016	82%	20%
SRR6434960	SRX3526492	SRP127804	SAMN08285126	16,991,815	86%	20%
SRR6434961	SRX3526491	SRP127804	SAMN08285127	21,906,395	87%	21%
SRR6434962	SRX3526490	SRP127804	SAMN08285128	15,338,614	74%	20%
SRR6434963	SRX3526489	SRP127804	SAMN08285129	25,279,488	73%	19%
SRR6434964	SRX3526488	SRP127804	SAMN08285130	16,140,868	81%	19%
SRR6434965	SRX3526487	SRP127804	SAMN08285131	16,231,623	76%	19%
SRR6434949	SRX3526503	SRP127804	SAMN08285132	27,528,794	80%	19%
SRR6434950	SRX3526502	SRP127804	SAMN08285133	30,995,771	76%	19%
SRR6434968	SRX3526484	SRP127804	SAMN08285134	27,982,340	78%	19%
SRR6434969	SRX3526483	SRP127804	SAMN08285135	15,726,870	57%	17%
SRR6434922	SRX3526530	SRP127804	SAMN08285136	18,354,630	77%	19%
SRR6434921	SRX3526531	SRP127804	SAMN08285137	21,428,016	75%	19%
SRR6434924	SRX3526528	SRP127804	SAMN08285138	16,108,049	78%	19%
SRR6434923	SRX3526529	SRP127804	SAMN08285139	21,387,516	76%	19%
SRR6434926	SRX3526526	SRP127804	SAMN08285140	14,523,144	66%	18%
SRR6434925	SRX3526527	SRP127804	SAMN08285141	15,559,239	71%	18%
SRR6434928	SRX3526524	SRP127804	SAMN08285142	12,290,994	74%	18%
SRR6434927	SRX3526525	SRP127804	SAMN08285143	16,017,697	63%	19%
SRR6434920	SRX3526532	SRP127804	SAMN08285144	17,438,692	67%	18%
SRR6434919	SRX3526533	SRP127804	SAMN08285145	12,220,662	74%	19%
SRR6434951	SRX3526501	SRP127804	SAMN08285146	18,538,040	77%	18%
SRR6435697	SRX3527192	SRP127804	SAMN08285147	92,618,458	90%	32%
SRR6435695	SRX3527194	SRP127804	SAMN08285147	95,454,867	91%	29%
SRR6435694	SRX3527195	SRP127804	SAMN08285147	96,856,429	90%	31%
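The by-sample and by-run tables are related: when a sample has several runs, its read count in the by-sample table is the sum of the read counts of its runs. The sketch below checks this for the two samples that have three runs each, using the values listed above; the variable names are chosen for this illustration.

```python
from collections import defaultdict

# (run, sample, number of reads) taken from the by-run table.
runs = [
    ("SRR6435699", "SAMN08281073", 103_017_144),
    ("SRR6435698", "SAMN08281073", 84_795_094),
    ("SRR6435696", "SAMN08281073", 97_547_954),
    ("SRR6435697", "SAMN08285147", 92_618_458),
    ("SRR6435695", "SAMN08285147", 95_454_867),
    ("SRR6435694", "SAMN08285147", 96_856_429),
]

# Sum the reads of all runs belonging to the same sample.
reads_per_sample = defaultdict(int)
for _run, sample, reads in runs:
    reads_per_sample[sample] += reads

for sample, total in sorted(reads_per_sample.items()):
    print(f"{sample}\t{total:,}")
```

The totals match the read counts reported for SAMN08281073 and SAMN08285147 in the by-sample table.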

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Arabidopsis thaliana known RefSeq (NP_)	48,147	34,897 (72.48%)	34,897 (72.48%)	66.43%	69.45%
Solanaceae GenBank	11,046	7,797 (70.59%)	7,797 (70.59%)	74.25%	83.84%
Solanaceae known RefSeq (NP_)	5,491	5,270 (95.98%)	5,270 (95.98%)	75.93%	85.73%
Nicotiana tabacum GenBank	2,988	1,432 (47.93%)	1,432 (47.93%)	77.46%	87.47%
Same-species GenBank	41	25 (60.98%)	25 (60.98%)	79.45%	82.19%
Same-species known RefSeq (NP_)	18	18 (100.00%)	18 (100.00%)	75.43%	89.73%

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

First Pass	Total
Ntom_v01 (Current) Coverage: 100.00%	Ntom_v01 (Current) Coverage: 100.00%
Ntom_v01 (Previous) Coverage: 99.99%	Ntom_v01 (Previous) Coverage: 99.99%
Percent Identity: 100.00%	Percent Identity: 100.00%
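The distinction between reciprocal best hits and non-reciprocal best alignments can be sketched generically as follows. This is not the NCBI assembly-assembly alignment implementation; the region names, scores and function names are hypothetical and only illustrate the concept.

```python
def best_hits(scores: dict) -> dict:
    """For each first element of a (a, b) pair, return its highest-scoring partner."""
    best = {}
    for (a, b), score in scores.items():
        if a not in best or score > best[a][1]:
            best[a] = (b, score)
    return {a: partner for a, (partner, _score) in best.items()}


def reciprocal_best_hits(scores: dict) -> list:
    """Pairs (a, b) where b is the best hit of a and a is the best hit of b."""
    forward = best_hits(scores)
    reverse = best_hits({(b, a): s for (a, b), s in scores.items()})
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


# Hypothetical alignment scores between regions of a current and a previous assembly.
scores = {
    ("cur_1", "prev_1"): 980,
    ("cur_1", "prev_2"): 400,
    ("cur_2", "prev_2"): 900,
    ("cur_3", "prev_2"): 700,
}

first_pass = reciprocal_best_hits(scores)
# Best hits that are not reciprocal, i.e. candidates for a second pass.
non_reciprocal = sorted(
    (a, b) for a, b in best_hits(scores).items() if (a, b) not in first_pass
)
print(first_pass)
print(non_reciprocal)
```

Here `cur_3` aligns best to `prev_2`, but `prev_2` aligns best to `cur_2`, so that pair would only be recovered by a non-reciprocal pass.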

Comparison of the current and previous annotations

The annotation produced for this release (102) was compared to the annotation in the previous release (101) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Ntom_v01 (Current) to Ntom_v01 (Previous)
Identical	14%
Minor changes	59%
Major changes	7%
New	20%
Deprecated	7%
Other	<1%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22:134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
