NCBI Apis dorsata Annotation Release 101

The RefSeq genome records for Apis dorsata were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Apis dorsata Annotation Release 101.

Annotation release ID: 101
Date of Entrez queries for transcripts and proteins: Oct 28 2019
Date of submission of annotation to the public databases: Nov 5 2019
Software version: 8.2

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Apis dorsata 1.3	GCF_000469605.1	Cold Spring Harbor Laboratory	09-24-2013	Reference	unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	Apis dorsata 1.3
Genes and pseudogenes	12,247
protein-coding	10,049
non-coding	2,127
transcribed pseudogenes	2
non-transcribed pseudogenes	69
genes with variants	4,318
immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	20,012
fully-supported	19,336
with > 5% ab initio	227
partial	278
with filled gap(s)	1
known RefSeq (NM_)	2
model RefSeq (XM_)	20,010
non-coding RNAs	3,272
fully-supported	3,081
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	3,129
pseudo transcripts	2
fully-supported	2
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	2
CDSs	20,012
fully-supported	19,336
with > 5% ab initio	262
partial	278
with major correction(s)	310
known RefSeq (NP_)	2
model RefSeq (XP_)	20,010
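As a quick consistency check on the gene rows above, the "Genes and pseudogenes" total appears to equal the sum of the four gene categories listed beneath it. The snippet below is only an illustrative sketch; the dictionary and variable names are chosen here and are not part of the report.

```python
# Gene counts copied from the "Feature counts" table for Apis dorsata 1.3.
gene_counts = {
    "protein-coding": 10_049,
    "non-coding": 2_127,
    "transcribed pseudogenes": 2,
    "non-transcribed pseudogenes": 69,
}

# Sum of all four categories, to compare with the "Genes and pseudogenes" row.
total_genes_and_pseudogenes = sum(gene_counts.values())

# Sum of the categories that are not pseudogenes.
genes_without_pseudogenes = sum(
    count for category, count in gene_counts.items() if "pseudogenes" not in category
)

print(total_genes_and_pseudogenes)  # 12247
print(genes_without_pseudogenes)    # 12176
```

The second value matches the gene count given in the Feature lengths table, which excludes pseudogenes.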

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	12,176	13,084	3,490	59	834,013
All transcripts	23,284	2,523	1,947	50	56,554
mRNA	20,012	2,708	2,113	189	56,554
misc_RNA	715	2,976	2,440	159	51,262
tRNA	143	74	72	61	84
lncRNA	2,366	1,014	775	50	7,440
snoRNA	13	124	126	68	217
snRNA	25	139	141	59	195
guide_RNA	1	128	128	128	128
rRNA	9	119	119	119	119
Single-exon transcripts	311	1,428	1,194	193	6,640
coding transcripts (NM_/XM_ )	309	1,428	1,194	193	6,640
non-coding transcripts (NR_/XR_ )	2	1,405	1,895	915	1,895
CDSs	20,012	2,033	1,476	93	56,340
Exons	93,273	317	208	1	11,843
in coding transcripts (NM_/XM_ )	86,475	314	207	1	11,843
in non-coding transcripts (NR_/XR_ )	9,973	327	208	2	6,967
Introns	77,776	2,391	161	30	523,322
in coding transcripts (NM_/XM_ )	73,514	2,278	152	30	523,322
in non-coding transcripts (NR_/XR_ )	7,380	3,440	324	30	523,322

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.92	1	1	50
Number of exons per transcript	8.44	7	1	191

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Drosophila melanogaster known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 10,049 coding genes, 7,715 genes had a protein with an alignment covering 50% or more of the query and 2,466 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
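The two coverage definitions can be sketched as a single calculation applied once to the query and once to the target. This is a minimal illustration, not the pipeline's code: the function name, the 1-based inclusive coordinates and the example lengths are all assumptions, and gaps inside the alignment are ignored for simplicity.

```python
def coverage_percent(aln_start: int, aln_end: int, seq_length: int) -> float:
    """Percentage of a sequence included in an alignment.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")
    if not (1 <= aln_start <= aln_end <= seq_length):
        raise ValueError("alignment coordinates fall outside the sequence")
    aligned_length = aln_end - aln_start + 1
    return 100.0 * aligned_length / seq_length


# Hypothetical alignment: a 400 aa annotated protein (query) aligned over
# residues 11..390 to residues 1..380 of a 500 aa target protein.
query_cov = coverage_percent(11, 390, 400)
target_cov = coverage_percent(1, 380, 500)

print(query_cov)   # 95.0
print(target_cov)  # 76.0
```

In this made-up case the gene would count toward both the 50% and the 95% query coverage thresholds, while its target coverage is lower because the target protein is longer.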

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Drosophila melanogaster known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Apis dorsata 1.3	GCF_000469605.1	5.71%	42.48%
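The report does not state how the masked percentages were computed. As a rough sketch, assuming a soft-masked sequence in which masked bases are written in lowercase (a common convention, but an assumption here), the masked fraction could be estimated as follows. The function name and the toy sequence are hypothetical.

```python
def percent_masked(sequence: str) -> float:
    """Percentage of bases that are lowercase, i.e. soft-masked (assumed convention)."""
    if not sequence:
        raise ValueError("sequence is empty")
    masked = sum(1 for base in sequence if base.islower())
    return 100.0 * masked / len(sequence)


# Toy sequence: 20 bases, 6 of them lowercase.
toy_sequence = "ACGTacgtacGTACGTACGT"
toy_masked = percent_masked(toy_sequence)

print(toy_masked)  # 30.0
```

Whether ambiguous bases (N) belong in the denominator is a design choice this sketch does not address.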

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	2	2 (100.00%)	2 (100.00%)	99.27%	99.79%
Same-species Genbank	39	39 (100.00%)	22 (56.41%)	98.74%	99.72%
Apis mellifera known RefSeq (NM_/NR_)	786	708 (90.08%)	533 (67.81%)	95.86%	97.45%
Apis mellifera Genbank	733	626 (85.40%)	350 (47.75%)	95.40%	96.76%
Apis mellifera EST	169,408	111,654 (65.91%)	97,403 (57.50%)	94.89%	97.19%
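The percentages in the table above appear to be expressed relative to the number of sequences retrieved from Entrez, for both the aligned and the passed-to-Gnomon columns. The sketch below re-derives them from the counts in the table; the helper name is hypothetical.

```python
def percent_of_retrieved(count: int, retrieved: int) -> str:
    """Format count as a percentage of the retrieved sequences, two decimals."""
    return f"{100 * count / retrieved:.2f}%"


# (retrieved, aligned by Splign, passed to Gnomon), copied from the table above.
transcript_rows = {
    "Same-species Genbank": (39, 39, 22),
    "Apis mellifera known RefSeq (NM_/NR_)": (786, 708, 533),
    "Apis mellifera Genbank": (733, 626, 350),
    "Apis mellifera EST": (169_408, 111_654, 97_403),
}

for source, (retrieved, aligned, passed) in transcript_rows.items():
    print(
        source,
        percent_of_retrieved(aligned, retrieved),
        percent_of_retrieved(passed, retrieved),
    )
```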

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	8,831,268,567	73%	19%	123,950
SAMN00008880	NA	queen ovary (Apis mellifera, SAMN00008880)	1,357,405	80%	59%	59,233
SAMN00113341	NA	nurse brain #1 (Apis mellifera, SAMN00113341)	9,912,179	72%	14%	59,482
SAMN00120364	NA	nurse brain #2 (Apis mellifera, SAMN00120364)	30,680,547	74%	14%	74,271
SAMN00120365	NA	nurse brain #3 (Apis mellifera, SAMN00120365)	16,911,515	74%	14%	70,236
SAMN00120366	NA	nurse brain #4 (Apis mellifera, SAMN00120366)	14,514,311	73%	15%	69,767
SAMN00120367	NA	nurse brain #5 (Apis mellifera, SAMN00120367)	11,451,917	67%	15%	63,042
SAMN00120368	NA	forager brain #1 (Apis mellifera, SAMN00120368)	15,738,118	72%	15%	66,017
SAMN00120369	NA	forager brain #2 (Apis mellifera, SAMN00120369)	14,679,014	73%	15%	69,046
SAMN00120370	NA	forager brain #3 (Apis mellifera, SAMN00120370)	13,917,711	71%	15%	64,554
SAMN00120371	NA	forager brain #4 (Apis mellifera, SAMN00120371)	25,440,947	72%	15%	70,709
SAMN00120372	NA	forager brain #5 (Apis mellifera, SAMN00120372)	28,566,976	72%	14%	72,130
SAMN00205597	NA	Apis florea ESTs (Apis florea, SAMN00205597)	331,043	55%	28%	20,754
SAMN00750149	NA	brain (Apis cerana cerana, SAMN00750149)	469,162	68%	42%	45,613
SAMN00996631	23254332	generic sample (Apis mellifera, SAMN00996631)	380,046,570	71%	21%	96,254
SAMN01985568	NA	Brain (Apis mellifera carnica, female, SAMN01985568)	175,796,462	82%	22%	96,804
SAMN01985569	NA	Brain (Apis mellifera carnica, female, SAMN01985569)	168,267,063	83%	22%	96,541
SAMN01985570	NA	Brain (Apis mellifera carnica, female, SAMN01985570)	178,934,269	83%	22%	96,741
SAMN01985571	NA	Brain (Apis mellifera carnica, male, SAMN01985571)	189,487,841	82%	22%	96,555
SAMN01985572	NA	Brain (Apis mellifera carnica, male, SAMN01985572)	162,754,857	82%	22%	95,927
SAMN01985573	NA	Brain (Apis mellifera carnica, male, SAMN01985573)	153,692,102	82%	22%	94,878
SAMN02117707	24479613	Whole body (Apis mellifera, SAMN02117707)	814,750	49%	26%	23,949
SAMN02117708	24479613	Abdomen (Apis mellifera, SAMN02117708)	622,279	51%	38%	23,974
SAMN02138536	24479613	Antennae (Apis mellifera, SAMN02138536)	1,135,230	74%	55%	33,525
SAMN02138537	24479613	embryo (Apis mellifera, SAMN02138537)	1,136,885	74%	54%	39,794
SAMN02138538	24479613	Brain and ovary (Apis mellifera, SAMN02138538)	1,556,239	68%	34%	54,147
SAMN02138539	24479613	Testes (Apis mellifera, SAMN02138539)	1,317,973	76%	58%	29,443
SAMN02138540	24479613	Larvae (Apis mellifera, SAMN02138540)	1,088,688	80%	60%	28,057
SAMN02592885	25553907,21857981	General Sample for Apis cerana (Apis cerana, SAMN02592885)	178,189,562	57%	22%	82,058
SAMN02717013	NA	2nd Thoracic Ganglia (Apis mellifera, 1-7 weeks, female, SAMN02717013)	212,137,150	75%	18%	97,266
SAMN02719773	NA	Muscle (Apis mellifera, 1-7 weeks, female, SAMN02719773)	234,805,842	77%	22%	82,196
SAMN02719774	NA	Malpighian tubule (Apis mellifera, 1-7 weeks, female, SAMN02719774)	276,240,608	76%	20%	87,409
SAMN02719775	NA	Hypopharyngeal gland (Apis mellifera, 1-7 weeks, female, SAMN02719775)	257,071,870	76%	28%	77,424
SAMN02719776	NA	Mandibular gland (Apis mellifera, 1-7 weeks, female, SAMN02719776)	250,245,302	72%	19%	86,184
SAMN02719777	NA	Antennae (Apis mellifera, 1-7 weeks, female, SAMN02719777)	205,659,392	60%	21%	95,500
SAMN02719778	NA	Nasonov gland (Apis mellifera, 1-7 weeks, female, SAMN02719778)	199,522,044	72%	22%	87,212
SAMN02727986	NA	Midgut (Apis mellifera, 1-7 weeks, female, SAMN02727986)	142,395,500	63%	19%	71,597
SAMN02727996	NA	Sting Gland (Apis mellifera, 1-7 weeks, female, SAMN02727996)	47,314,632	76%	20%	82,884
SAMN02728789	NA	Sting Gland (Apis mellifera, 1-7 weeks, female, SAMN02728789)	53,098,508	34%	17%	66,574
SAMN02728790	NA	Midgut (Apis mellifera, 1-7 weeks, female, SAMN02728790)	170,558,138	58%	19%	70,802
SAMN02728791	NA	Antenna (Apis mellifera, 1-7 weeks, female, SAMN02728791)	214,670,402	64%	21%	96,140
SAMN02728792	NA	Nasonov gland (Apis mellifera, 1-7 weeks, female, SAMN02728792)	215,789,000	62%	21%	85,361
SAMN02728794	NA	Mandibular gland (Apis mellifera, 1-7 weeks, female, SAMN02728794)	239,415,666	67%	20%	81,743
SAMN02728800	NA	Hypopharyngeal gland (Apis mellifera, 1-7 weeks, female, SAMN02728800)	255,061,692	72%	26%	71,946
SAMN02728801	NA	Malpighian tubule (Apis mellifera, 1-7 weeks, female, SAMN02728801)	274,744,652	71%	20%	86,283
SAMN02728802	NA	Muscle (Apis mellifera, 1-7 weeks, female, SAMN02728802)	199,042,116	55%	15%	75,149
SAMN02728803	NA	Second Thoracic Ganglia (Apis mellifera, 1-7 weeks, female, SAMN02728803)	203,222,838	74%	18%	96,407
SAMN04244121	NA	Brain (Apis mellifera carnica, female, SAMN04244121)	218,188,846	80%	22%	93,938
SAMN04244122	NA	Brain (Apis mellifera carnica, female, SAMN04244122)	220,797,918	82%	21%	95,305
SAMN04244123	NA	Brain (Apis mellifera carnica, female, SAMN04244123)	215,385,931	82%	22%	95,027
SAMN05199459	25553907,21857981	antennae (Apis cerana, SAMN05199459)	28,727,056	70%	19%	66,437
SAMN05199460	25553907,21857981	brain (Apis cerana, SAMN05199460)	28,067,672	74%	17%	67,094
SAMN05199461	25553907,21857981	fat body (Apis cerana, SAMN05199461)	124,626,606	70%	18%	76,720
SAMN05199462	25553907,21857981	gut (Apis cerana, SAMN05199462)	52,489,846	62%	20%	55,422
SAMN05199463	25553907,21857981	head (Apis cerana, SAMN05199463)	271,752	32%	84%	14,889
SAMN05199464	25553907,21857981	hypopharyngeal gland (Apis cerana, SAMN05199464)	28,143,550	75%	23%	45,018
SAMN05199465	25553907,21857981	venom gland (Apis cerana, SAMN05199465)	175,970,162	75%	18%	78,169
SAMN05199466	25553907,21857981	antennae (Apis mellifera, SAMN05199466)	25,981,706	48%	20%	70,505
SAMN05199467	25553907,21857981	brain (Apis mellifera, SAMN05199467)	27,286,036	70%	22%	74,000
SAMN05199468	25553907,21857981	hypopharyngeal gland (Apis mellifera, SAMN05199468)	27,922,064	76%	29%	47,526
SAMN05225359	NA	antennae (Apis florea, male, SAMN05225359)	53,370,366	67%	28%	72,348
SAMN05225360	NA	antennae (Apis florea, male, SAMN05225360)	52,214,296	68%	28%	72,086
SAMN05225361	NA	antennae (Apis florea, female, SAMN05225361)	65,788,708	67%	26%	73,913
SAMN05225362	NA	antennae (Apis florea, female, SAMN05225362)	44,751,494	67%	26%	70,967
SAMN06192067	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192067)	90,240,988	81%	23%	90,655
SAMN06192068	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192068)	105,607,542	72%	21%	87,648
SAMN06192069	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192069)	87,929,650	82%	22%	88,244
SAMN06192070	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192070)	148,447,384	49%	21%	88,399
SAMN06192071	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192071)	83,810,010	80%	22%	87,325
SAMN06192072	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192072)	100,160,218	76%	22%	88,366
SAMN06192073	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192073)	128,177,728	80%	21%	91,469
SAMN06192074	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192074)	98,132,370	81%	22%	87,828
SAMN06192075	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192075)	104,160,918	78%	23%	97,934
SAMN06192076	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192076)	101,848,792	81%	25%	96,185
SAMN06192077	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192077)	99,634,250	81%	23%	97,651
SAMN06192078	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192078)	59,057,944	80%	2%	48,329
SAMN06192079	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192079)	95,494,324	78%	22%	89,880
SAMN06192080	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192080)	60,314,752	89%	2%	48,962
SAMN06192081	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192081)	59,045,000	81%	2%	49,373
SAMN06192082	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192082)	56,668,426	89%	3%	58,246
SAMN06192083	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192083)	62,554,784	86%	4%	64,132
SAMN06192084	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192084)	59,872,588	87%	4%	60,756
SAMN06192085	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192085)	58,556,824	87%	3%	51,090
SAMN06192086	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192086)	75,071,086	88%	2%	56,008
SAMN06192087	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192087)	60,368,568	90%	2%	50,595
SAMN06192088	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192088)	55,776,806	87%	1%	43,949
SAMN06192089	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192089)	79,136,518	93%	0%	36,449
SAMN06192090	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192090)	56,359,504	87%	2%	45,306
SAMN06242032	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06242032)	12,480,712	3%	3%	343
SAMN06242033	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06242033)	12,569,905	3%	4%	317

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR035881	SRX016658	SRP001899	SAMN00008880	94,264	80%	58%
SRR035924	SRX016658	SRP001899	SAMN00008880	1,263,141	80%	59%
SRR063955	SRX025526	SRP003260	SAMN02117707	396,725	59%	28%
SRR063954	SRX025527	SRP003260	SAMN02117707	418,025	40%	22%
SRR063956	SRX025528	SRP003260	SAMN02117708	622,279	51%	38%
SRR063949	SRX025529	SRP003261	SAMN02138536	1,135,230	74%	55%
SRR063951	SRX025530	SRP003261	SAMN02138537	1,136,885	74%	54%
SRR063947	SRX025531	SRP003261	SAMN02138538	737,245	68%	36%
SRR063948	SRX025531	SRP003261	SAMN02138538	818,994	68%	31%
SRR063957	SRX025532	SRP003261	SAMN02138539	1,317,973	76%	58%
SRR063950	SRX025533	SRP003261	SAMN02138540	1,088,688	80%	60%
SRR071803	SRX030487	SRP003528	SAMN00113341	4,712,843	72%	14%
SRR071804	SRX030487	SRP003528	SAMN00113341	5,199,336	73%	15%
SRR071805	SRX030488	SRP003528	SAMN00120364	9,906,867	75%	14%
SRR071806	SRX030488	SRP003528	SAMN00120364	11,063,546	73%	14%
SRR071807	SRX030488	SRP003528	SAMN00120364	9,710,134	74%	14%
SRR071808	SRX030489	SRP003528	SAMN00120365	8,482,588	74%	14%
SRR071809	SRX030489	SRP003528	SAMN00120365	8,428,927	74%	14%
SRR071810	SRX030490	SRP003528	SAMN00120366	7,831,186	74%	15%
SRR071811	SRX030490	SRP003528	SAMN00120366	6,683,125	72%	15%
SRR071812	SRX030491	SRP003528	SAMN00120367	5,637,292	67%	15%
SRR071813	SRX030491	SRP003528	SAMN00120367	5,814,625	68%	15%
SRR071814	SRX030492	SRP003528	SAMN00120368	7,852,612	71%	15%
SRR071815	SRX030492	SRP003528	SAMN00120368	7,885,506	72%	15%
SRR071816	SRX030493	SRP003528	SAMN00120369	7,877,821	73%	15%
SRR071817	SRX030493	SRP003528	SAMN00120369	6,801,193	73%	15%
SRR071818	SRX030494	SRP003528	SAMN00120370	6,739,570	70%	15%
SRR071819	SRX030494	SRP003528	SAMN00120370	7,178,141	72%	15%
SRR071820	SRX030495	SRP003528	SAMN00120371	8,757,199	72%	15%
SRR071821	SRX030495	SRP003528	SAMN00120371	7,926,470	72%	15%
SRR071822	SRX030495	SRP003528	SAMN00120371	8,757,278	71%	15%
SRR071823	SRX030496	SRP003528	SAMN00120372	10,333,866	71%	14%
SRR071824	SRX030496	SRP003528	SAMN00120372	9,136,093	72%	15%
SRR071825	SRX030496	SRP003528	SAMN00120372	9,097,017	72%	15%
SRR098291	SRX040731	SRP005626	SAMN00205597	269,742	55%	29%
SRR098292	SRX040731	SRP005626	SAMN00205597	61,301	54%	25%
SRR361851	SRX104370	SRP009193	SAMN00750149	469,162	68%	42%
SRR498622	SRX148699	SRP013261	SAMN00996631	38,230,006	78%	21%
SRR499808	SRX148699	SRP013261	SAMN00996631	40,959,240	78%	21%
SRR499882	SRX148699	SRP013261	SAMN00996631	32,336,648	76%	20%
SRR499883	SRX148699	SRP013261	SAMN00996631	37,025,856	78%	21%
SRR499919	SRX148699	SRP013261	SAMN00996631	36,316,808	78%	21%
SRR499920	SRX148699	SRP013261	SAMN00996631	37,847,774	77%	21%
SRR499992	SRX148699	SRP013261	SAMN00996631	38,449,538	79%	21%
SRR499993	SRX148699	SRP013261	SAMN00996631	36,704,784	78%	21%
SRR499994	SRX148699	SRP013261	SAMN00996631	47,488,940	24%	21%
SRR499995	SRX148699	SRP013261	SAMN00996631	34,686,976	79%	21%
SRR789759	SRX253221	SRP019928	SAMN01985568	175,796,462	82%	22%
SRR789760	SRX253222	SRP019928	SAMN01985569	168,267,063	83%	22%
SRR789761	SRX253223	SRP019928	SAMN01985570	178,934,269	83%	22%
SRR789762	SRX253224	SRP019928	SAMN01985571	189,487,841	82%	22%
SRR789763	SRX253225	SRP019928	SAMN01985572	162,754,857	82%	22%
SRR789764	SRX253226	SRP019928	SAMN01985573	153,692,102	82%	22%
SRR2954344	SRX1420589	SRP019928	SAMN04244121	218,188,846	80%	22%
SRR2954345	SRX1420590	SRP019928	SAMN04244122	220,797,918	82%	21%
SRR2954346	SRX1420591	SRP019928	SAMN04244123	215,385,931	82%	22%
SRR1255541	SRX518065	SRP041189	SAMN02717013	68,678,890	75%	18%
SRR1255542	SRX518066	SRP041189	SAMN02717013	73,889,574	75%	18%
SRR1255543	SRX518067	SRP041189	SAMN02717013	69,568,686	74%	18%
SRR1255064	SRX518062	SRP041189	SAMN02719773	104,850,402	75%	23%
SRR1255065	SRX519341	SRP041189	SAMN02719773	68,997,582	79%	22%
SRR1255066	SRX519342	SRP041189	SAMN02719773	60,957,858	79%	21%
SRR1254954	SRX518060	SRP041189	SAMN02719774	102,186,182	78%	20%
SRR1254956	SRX518724	SRP041189	SAMN02719774	95,800,088	77%	20%
SRR1254957	SRX518725	SRP041189	SAMN02719774	78,254,338	72%	20%
SRR1254946	SRX518059	SRP041189	SAMN02719775	84,696,498	75%	28%
SRR1254947	SRX518716	SRP041189	SAMN02719775	71,604,084	76%	28%
SRR1254948	SRX518717	SRP041189	SAMN02719775	100,771,288	76%	27%
SRR1255009	SRX518061	SRP041189	SAMN02719776	56,342,908	74%	19%
SRR1255010	SRX519335	SRP041189	SAMN02719776	80,527,564	71%	19%
SRR1255011	SRX519336	SRP041189	SAMN02719776	113,374,830	72%	19%
SRR1239302	SRX518058	SRP041189	SAMN02719777	64,110,330	38%	22%
SRR1239303	SRX518341	SRP041189	SAMN02719777	73,361,990	69%	21%
SRR1239304	SRX518342	SRP041189	SAMN02719777	68,187,072	69%	21%
SRR1255151	SRX518063	SRP041189	SAMN02719778	59,131,754	74%	22%
SRR1255152	SRX519410	SRP041189	SAMN02719778	68,500,084	71%	21%
SRR1255153	SRX519423	SRP041189	SAMN02719778	71,890,206	70%	22%
SRR1239308	SRX517566	SRP041189	SAMN02727986	42,519,996	65%	18%
SRR1239309	SRX518336	SRP041189	SAMN02727986	45,329,928	62%	19%
SRR1239310	SRX518337	SRP041189	SAMN02727986	54,545,576	61%	19%
SRR1269199	SRX518064	SRP041189	SAMN02727996	47,314,632	76%	20%
SRR1255456	SRX519429	SRP041189	SAMN02728789	53,098,508	34%	17%
SRR1239311	SRX518338	SRP041189	SAMN02728790	51,534,456	49%	16%
SRR1239312	SRX518339	SRP041189	SAMN02728790	43,564,950	59%	19%
SRR1239313	SRX518340	SRP041189	SAMN02728790	75,458,732	64%	20%
SRR1239305	SRX518343	SRP041189	SAMN02728791	68,058,182	72%	22%
SRR1239306	SRX518344	SRP041189	SAMN02728791	67,841,628	71%	21%
SRR1239307	SRX518345	SRP041189	SAMN02728791	78,770,592	52%	21%
SRR1255154	SRX519424	SRP041189	SAMN02728792	58,745,720	47%	21%
SRR1255260	SRX519425	SRP041189	SAMN02728792	86,045,054	73%	21%
SRR1255326	SRX519427	SRP041189	SAMN02728792	70,998,226	62%	21%
SRR1255012	SRX519338	SRP041189	SAMN02728794	59,049,388	65%	20%
SRR1255013	SRX519339	SRP041189	SAMN02728794	63,975,208	72%	20%
SRR1255014	SRX519340	SRP041189	SAMN02728794	116,391,070	64%	21%
SRR1254950	SRX518718	SRP041189	SAMN02728800	68,152,256	68%	26%
SRR1254951	SRX518719	SRP041189	SAMN02728800	99,894,982	71%	26%
SRR1254952	SRX518720	SRP041189	SAMN02728800	87,014,454	75%	27%
SRR1254958	SRX518721	SRP041189	SAMN02728801	100,449,442	69%	19%
SRR1254959	SRX518722	SRP041189	SAMN02728801	88,821,448	76%	20%
SRR1254960	SRX518723	SRP041189	SAMN02728801	85,473,762	68%	20%
SRR1255068	SRX519343	SRP041189	SAMN02728802	72,782,026	48%	16%
SRR1255149	SRX519344	SRP041189	SAMN02728802	68,118,350	66%	15%
SRR1255150	SRX519345	SRP041189	SAMN02728802	58,141,740	49%	15%
SRR1255544	SRX519434	SRP041189	SAMN02728803	65,110,644	77%	19%
SRR1255545	SRX519436	SRP041189	SAMN02728803	74,756,862	75%	18%
SRR1255546	SRX519438	SRP041189	SAMN02728803	63,355,332	68%	17%
SRR1653580	SRX661188	SRP043101	SAMN02592885	77,898,496	27%	22%
SRR1653592	SRX661188	SRP043101	SAMN02592885	49,363,940	83%	23%
SRR1653605	SRX661188	SRP043101	SAMN02592885	50,927,126	79%	21%
SRR1380976	SRX747554	SRP043101	SAMN05199459	28,727,056	70%	19%
SRR1380970	SRX589473	SRP043101	SAMN05199460	28,067,672	74%	17%
SRR1388774	SRX747558	SRP043101	SAMN05199461	124,626,606	70%	18%
SRR1380984	SRX747556	SRP043101	SAMN05199462	52,489,846	62%	20%
SRR1408614	SRX602814	SRP043101	SAMN05199463	271,752	32%	84%
SRR1380979	SRX747555	SRP043101	SAMN05199464	28,143,550	75%	23%
SRR1406762	SRX747559	SRP043101	SAMN05199465	175,970,162	75%	18%
SRR1408090	SRX747561	SRP043101	SAMN05199466	25,981,706	48%	20%
SRR1386316	SRX747557	SRP043101	SAMN05199467	27,286,036	70%	22%
SRR1407793	SRX747560	SRP043101	SAMN05199468	27,922,064	76%	29%
SRR3657423	SRX1837096	SRP076380	SAMN05225359	53,370,366	67%	28%
SRR3657424	SRX1837097	SRP076380	SAMN05225360	52,214,296	68%	28%
SRR3657425	SRX1837098	SRP076380	SAMN05225361	65,788,708	67%	26%
SRR3657426	SRX1837099	SRP076380	SAMN05225362	44,751,494	67%	26%
SRR5136456	SRX2452410	SRP095846	SAMN06192067	90,240,988	81%	23%
SRR5136457	SRX2452411	SRP095846	SAMN06192068	105,607,542	72%	21%
SRR5136458	SRX2452412	SRP095846	SAMN06192069	87,929,650	82%	22%
SRR5136459	SRX2452413	SRP095846	SAMN06192070	148,447,384	49%	21%
SRR5136455	SRX2452409	SRP095846	SAMN06192071	83,810,010	80%	22%
SRR5136453	SRX2452407	SRP095846	SAMN06192072	100,160,218	76%	22%
SRR5136452	SRX2452406	SRP095846	SAMN06192073	128,177,728	80%	21%
SRR5136451	SRX2452405	SRP095846	SAMN06192074	98,132,370	81%	22%
SRR5136450	SRX2452404	SRP095846	SAMN06192075	104,160,918	78%	23%
SRR5136449	SRX2452403	SRP095846	SAMN06192076	101,848,792	81%	25%
SRR5136448	SRX2452402	SRP095846	SAMN06192077	99,634,250	81%	23%
SRR5136454	SRX2452408	SRP095846	SAMN06192079	95,494,324	78%	22%
SRR5188648	SRX2504569	SRP095846	SAMN06242032	12,480,712	3%	3%
SRR5188647	SRX2504568	SRP095846	SAMN06242033	12,569,905	3%	4%
SRR6047317	SRX3194368	SRP117804	SAMN06192078	59,057,944	80%	2%
SRR6047316	SRX3194367	SRP117804	SAMN06192080	60,314,752	89%	2%
SRR6047315	SRX3194366	SRP117804	SAMN06192081	59,045,000	81%	2%
SRR6047314	SRX3194365	SRP117804	SAMN06192082	56,668,426	89%	3%
SRR6047313	SRX3194364	SRP117804	SAMN06192083	62,554,784	86%	4%
SRR6047312	SRX3194363	SRP117804	SAMN06192084	59,872,588	87%	4%
SRR6047311	SRX3194362	SRP117804	SAMN06192085	58,556,824	87%	3%
SRR6047310	SRX3194361	SRP117804	SAMN06192086	75,071,086	88%	2%
SRR6047309	SRX3194360	SRP117804	SAMN06192087	60,368,568	90%	2%
SRR6047308	SRX3194359	SRP117804	SAMN06192088	55,776,806	87%	1%
SRR6047307	SRX3194358	SRP117804	SAMN06192089	79,136,518	93%	0%
SRR6047306	SRX3194357	SRP117804	SAMN06192090	56,359,504	87%	2%
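The per-sample read counts appear to be the sums of the per-run counts for each sample. The following sketch totals reads by sample from a few tab-separated run rows copied from the table above. The column positions follow the table's header row, and the function name is an assumption made for illustration.

```python
import csv
import io
from collections import defaultdict

# Four run rows copied from the by-run table (tab-separated columns:
# Run, Experiment, Project, Sample, Number of reads, % aligned, % with introns).
RUN_ROWS = (
    "SRR035881\tSRX016658\tSRP001899\tSAMN00008880\t94,264\t80%\t58%\n"
    "SRR035924\tSRX016658\tSRP001899\tSAMN00008880\t1,263,141\t80%\t59%\n"
    "SRR063955\tSRX025526\tSRP003260\tSAMN02117707\t396,725\t59%\t28%\n"
    "SRR063954\tSRX025527\tSRP003260\tSAMN02117707\t418,025\t40%\t22%\n"
)


def reads_per_sample(tsv_text: str) -> dict:
    """Sum the 'Number of reads' column for each sample accession."""
    totals = defaultdict(int)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not row:
            continue  # skip blank lines
        sample = row[3]
        reads = int(row[4].replace(",", ""))  # drop thousands separators
        totals[sample] += reads
    return dict(totals)


sample_totals = reads_per_sample(RUN_ROWS)
print(sample_totals)
```

For these two samples the totals agree with the read counts listed in the by-sample table.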

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Insecta GenBank	102,985	74,106 (71.96%)	74,106 (71.96%)	66.04%	63.54%
Insecta known RefSeq (NP_)	38,944	27,024 (69.39%)	27,024 (69.39%)	63.87%	56.75%
Apis mellifera high-quality model RefSeq (XP_)	8,880	8,801 (99.11%)	8,801 (99.11%)	92.15%	94.66%
Same-species GenBank	39	39 (100.00%)	39 (100.00%)	83.34%	93.08%
Same-species known RefSeq (NP_)	2	2 (100.00%)	2 (100.00%)	91.33%	96.66%

Comparison of the current and previous annotations

The annotation produced for this release (101) was compared to the annotation in the previous release (100) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.
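The report does not give the scoring formula used for the comparison. Purely as an illustration of the idea of reciprocal best matches, the sketch below scores a pair of features by the fraction of exon bases they share (a Jaccard index, which is an assumption and not the pipeline's actual score) and keeps the pairs that are each other's best match. All names, coordinates and the half-open interval convention are hypothetical.

```python
def exon_bases(exons):
    """Set of genomic positions covered by exons given as half-open (start, end)."""
    bases = set()
    for start, end in exons:
        bases.update(range(start, end))
    return bases


def overlap_score(exons_a, exons_b):
    """Shared exon bases divided by all exon bases (Jaccard index, assumed score)."""
    bases_a, bases_b = exon_bases(exons_a), exon_bases(exons_b)
    union = bases_a | bases_b
    return len(bases_a & bases_b) / len(union) if union else 0.0


def best_matches(source, candidates):
    """Map each source feature to its highest-scoring overlapping candidate."""
    result = {}
    for name, exons in source.items():
        scored = [(overlap_score(exons, other), cand) for cand, other in candidates.items()]
        score, cand = max(scored, default=(0.0, None))
        if score > 0:
            result[name] = cand
    return result


def reciprocal_best_matches(current, previous):
    """Pairs (current, previous) that are each other's best match."""
    cur_best = best_matches(current, previous)
    prev_best = best_matches(previous, current)
    return {(c, p) for c, p in cur_best.items() if prev_best.get(p) == c}


# Toy features; building position sets is fine here but not for whole genomes.
current = {"geneA": [(100, 200), (300, 400)], "geneB": [(1000, 1100)]}
previous = {"old1": [(100, 200), (310, 400)], "old2": [(5000, 5100)]}

pairs = reciprocal_best_matches(current, previous)
print(pairs)  # {('geneA', 'old1')}
```

In this toy example geneB has no counterpart in the previous set and old2 has none in the current set, loosely corresponding to the "New" and "Deprecated" categories in the table below.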

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Apis_dorsata_1.3 (Current) to Apis_dorsata_1.3 (Previous)
Identical	3%
Minor changes	66%
Major changes	13%
New	15%
Deprecated	10%
Other	2%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
