NCBI Apis florea Annotation Release 102

The RefSeq genome records for Apis florea were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Apis florea Annotation Release 102.

Annotation release ID: 102
Date of Entrez queries for transcripts and proteins: Nov 25 2019
Date of submission of annotation to the public databases: Dec 23 2019
Software version: 8.3

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Aflo_1.1	GCF_000184785.3	Baylor College of Medicine	12-07-2012	Reference	1 assembled chromosome; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	Aflo_1.1
Genes and pseudogenes	12,573
protein-coding	10,417
non-coding	2,113
transcribed pseudogenes	2
non-transcribed pseudogenes	41
genes with variants	3,635
immunoglobulin/T-cell receptor gene segments	0
other	0
mRNAs	18,615
fully-supported	17,902
with > 5% ab initio	297
partial	352
with filled gap(s)	97
known RefSeq (NM_)	5
model RefSeq (XM_)	18,610
non-coding RNAs	3,057
fully-supported	2,787
with > 5% ab initio	0
partial	1
with filled gap(s)	1
known RefSeq (NR_)	0
model RefSeq (XR_)	2,840
pseudo transcripts	2
fully-supported	2
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	2
CDSs	18,628
fully-supported	17,902
with > 5% ab initio	340
partial	339
with major correction(s)	2,127
known RefSeq (NP_)	5
model RefSeq (XP_)	18,623
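
The sub-rows of the table above can be checked against their totals. The following is a minimal sketch, assuming that the gene biotype rows partition "Genes and pseudogenes" and that the known and model RefSeq rows partition the mRNA and CDS totals. The rows with a count of 0 are omitted, "genes with variants" is treated as a property rather than a separate class, and the dictionary keys and function name are illustrative only.

```python
# Counts copied from the "Feature counts" table for Aflo_1.1.
counts = {
    "genes_and_pseudogenes": 12_573,
    "protein_coding": 10_417,
    "non_coding": 2_113,
    "transcribed_pseudogenes": 2,
    "non_transcribed_pseudogenes": 41,
    "mrnas": 18_615,
    "mrna_known": 5,
    "mrna_model": 18_610,
    "cdss": 18_628,
    "cds_known": 5,
    "cds_model": 18_623,
}


def check_feature_counts(c):
    """Return named checks; True means the sub-rows add up to the total."""
    gene_sum = (
        c["protein_coding"]
        + c["non_coding"]
        + c["transcribed_pseudogenes"]
        + c["non_transcribed_pseudogenes"]
    )
    return {
        "gene_total": gene_sum == c["genes_and_pseudogenes"],
        "mrna_total": c["mrna_known"] + c["mrna_model"] == c["mrnas"],
        "cds_total": c["cds_known"] + c["cds_model"] == c["cdss"],
    }


for name, ok in check_feature_counts(counts).items():
    print(f"{name}: {'consistent' if ok else 'MISMATCH'}")
```

Under these assumptions all three checks pass for the counts reported here.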

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	12,530	12,292	3,346	68	793,122
All transcripts	21,672	2,413	1,902	48	55,343
mRNA	18,615	2,588	2,046	199	55,343
misc_RNA	559	2,568	2,165	179	8,685
tRNA	215	73	73	60	85
lncRNA	2,233	1,186	858	48	8,572
snoRNA	14	120	91	68	216
snRNA	23	146	149	107	196
guide_RNA	1	128	128	128	128
rRNA	12	604	119	119	4,031
Single-exon transcripts	404	1,483	1,174	233	7,949
coding transcripts (NM_/XM_ )	402	1,484	1,174	233	7,949
non-coding transcripts (NR_/XR_ )	2	1,130	1,919	340	1,919
CDSs	18,628	1,844	1,401	156	54,654
Exons	90,183	335	211	1	11,856
in coding transcripts (NM_/XM_ )	83,636	329	210	1	11,856
in non-coding transcripts (NR_/XR_ )	8,876	373	216	2	7,912
Introns	74,772	2,249	145	30	574,220
in coding transcripts (NM_/XM_ )	70,694	2,183	137	30	574,220
in non-coding transcripts (NR_/XR_ )	6,361	3,056	299	30	413,579

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	1.74	1	1	50
Number of exons per transcript	7.81	6	1	174

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the Drosophila melanogaster known RefSeq proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 10,404 coding genes, 7,987 genes had a protein with an alignment covering 50% or more of the query and 2,460 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
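
The two coverage definitions above can be illustrated with a short calculation. This is a simplified sketch, not the pipeline's own code: it assumes a single contiguous alignment span with 1-based inclusive coordinates, whereas a real BLASTP hit may consist of several segments. The function name and the example coordinates are hypothetical.

```python
def coverage_percent(aligned_start, aligned_end, sequence_length):
    """Percent of a sequence included in an alignment span.

    Coordinates are assumed to be 1-based and inclusive.
    """
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    if not 1 <= aligned_start <= aligned_end <= sequence_length:
        raise ValueError("alignment span must lie within the sequence")
    aligned_length = aligned_end - aligned_start + 1
    return 100.0 * aligned_length / sequence_length


# Hypothetical hit: residues 21-380 of a 400-residue annotated protein (query)
# aligned to residues 101-460 of a 600-residue known RefSeq protein (target).
query_coverage = coverage_percent(21, 380, 400)
target_coverage = coverage_percent(101, 460, 600)
print(f"query coverage:  {query_coverage:.1f}%")   # 90.0%
print(f"target coverage: {target_coverage:.1f}%")  # 60.0%
```

In this made-up case the hit would count toward the "50% or more of the query" tally but not toward the "95% or more" tally.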

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: Drosophila melanogaster known RefSeq proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Aflo_1.1	GCF_000184785.3	5.16%	37.94%
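
As an illustration of what a "% masked" figure means, the sketch below computes the masked percentage of a sequence. It assumes the masked sequence is soft-masked, that is, masked bases are written in lowercase, which is a common convention but is not stated in this report. The function name and the toy sequence are hypothetical.

```python
def percent_masked(sequence):
    """Percent of bases that are lowercase (soft-masked) in a sequence."""
    if not sequence:
        return 0.0
    masked = sum(1 for base in sequence if base.islower())
    return 100.0 * masked / len(sequence)


# Toy example: 4 of 10 bases are lowercase.
print(f"{percent_masked('ACGTacgtAC'):.2f}%")  # 40.00%
```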

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	5	5 (100.00%)	5 (100.00%)	99.98%	100.00%
Same-species Genbank	5	5 (100.00%)	5 (100.00%)	99.38%	100.00%
Apis mellifera known RefSeq (NM_/NR_)	786	690 (87.79%)	555 (70.61%)	94.91%	95.52%
Apis mellifera Genbank	733	646 (88.13%)	377 (51.43%)	94.24%	94.68%
Apis mellifera EST	169,408	103,943 (61.36%)	89,713 (52.96%)	94.35%	97.21%
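
The percentages in the table above are consistent with each count being divided by the number of sequences retrieved from Entrez. The sketch below reproduces a few cells under that assumption. The function name is illustrative and is not part of any NCBI tool.

```python
def count_with_percent(count, retrieved):
    """Format a count and its share of the retrieved sequences, as in the table."""
    if retrieved <= 0:
        raise ValueError("retrieved must be positive")
    return f"{count:,} ({100.0 * count / retrieved:.2f}%)"


# Apis mellifera known RefSeq (NM_/NR_): 786 sequences retrieved.
print(count_with_percent(690, 786))  # 690 (87.79%)
print(count_with_percent(555, 786))  # 555 (70.61%)

# Apis mellifera EST: 169,408 sequences retrieved.
print(count_with_percent(103_943, 169_408))  # 103,943 (61.36%)
```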

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	8,831,268,567	73%	19%	117,402
SAMN00008880	NA	queen ovary (Apis mellifera, SAMN00008880)	1,357,405	77%	60%	56,635
SAMN00113341	NA	nurse brain #1 (Apis mellifera, SAMN00113341)	9,912,179	70%	14%	56,975
SAMN00120364	NA	nurse brain #2 (Apis mellifera, SAMN00120364)	30,680,547	71%	14%	70,594
SAMN00120365	NA	nurse brain #3 (Apis mellifera, SAMN00120365)	16,911,515	70%	14%	66,938
SAMN00120366	NA	nurse brain #4 (Apis mellifera, SAMN00120366)	14,514,311	70%	15%	66,566
SAMN00120367	NA	nurse brain #5 (Apis mellifera, SAMN00120367)	11,451,917	65%	15%	60,267
SAMN00120368	NA	forager brain #1 (Apis mellifera, SAMN00120368)	15,738,118	68%	14%	62,908
SAMN00120369	NA	forager brain #2 (Apis mellifera, SAMN00120369)	14,679,014	70%	15%	65,941
SAMN00120370	NA	forager brain #3 (Apis mellifera, SAMN00120370)	13,917,711	68%	15%	61,553
SAMN00120371	NA	forager brain #4 (Apis mellifera, SAMN00120371)	25,440,947	69%	15%	67,506
SAMN00120372	NA	forager brain #5 (Apis mellifera, SAMN00120372)	28,566,976	68%	14%	68,696
SAMN00205597	NA	Apis florea ESTs (Apis florea, SAMN00205597)	331,043	71%	24%	21,485
SAMN00750149	NA	brain (Apis cerana cerana, SAMN00750149)	469,162	64%	43%	44,572
SAMN00996631	23254332	generic sample (Apis mellifera, SAMN00996631)	380,046,570	69%	20%	90,523
SAMN01985568	NA	Brain (Apis mellifera carnica, female, SAMN01985568)	175,796,462	81%	21%	91,237
SAMN01985569	NA	Brain (Apis mellifera carnica, female, SAMN01985569)	168,267,063	82%	22%	90,902
SAMN01985570	NA	Brain (Apis mellifera carnica, female, SAMN01985570)	178,934,269	81%	22%	91,281
SAMN01985571	NA	Brain (Apis mellifera carnica, male, SAMN01985571)	189,487,841	81%	21%	90,958
SAMN01985572	NA	Brain (Apis mellifera carnica, male, SAMN01985572)	162,754,857	81%	21%	90,513
SAMN01985573	NA	Brain (Apis mellifera carnica, male, SAMN01985573)	153,692,102	81%	22%	89,539
SAMN02117707	24479613	Whole body (Apis mellifera, SAMN02117707)	814,750	62%	19%	22,910
SAMN02117708	24479613	Abdomen (Apis mellifera, SAMN02117708)	622,279	53%	30%	23,479
SAMN02138536	24479613	Antennae (Apis mellifera, SAMN02138536)	1,135,230	68%	55%	32,204
SAMN02138537	24479613	embryo (Apis mellifera, SAMN02138537)	1,136,885	70%	53%	38,509
SAMN02138538	24479613	Brain and ovary (Apis mellifera, SAMN02138538)	1,556,239	67%	32%	52,086
SAMN02138539	24479613	Testes (Apis mellifera, SAMN02138539)	1,317,973	73%	58%	28,374
SAMN02138540	24479613	Larvae (Apis mellifera, SAMN02138540)	1,088,688	78%	60%	27,323
SAMN02592885	25553907,21857981	General Sample for Apis cerana (Apis cerana, SAMN02592885)	178,189,562	55%	21%	78,472
SAMN02717013	NA	2nd Thoracic Ganglia (Apis mellifera, 1-7 weeks, female, SAMN02717013)	212,137,150	74%	17%	91,473
SAMN02719773	NA	Muscle (Apis mellifera, 1-7 weeks, female, SAMN02719773)	234,805,842	76%	22%	78,122
SAMN02719774	NA	Malpighian tubule (Apis mellifera, 1-7 weeks, female, SAMN02719774)	276,240,608	74%	20%	82,795
SAMN02719775	NA	Hypopharyngeal gland (Apis mellifera, 1-7 weeks, female, SAMN02719775)	257,071,870	72%	28%	73,803
SAMN02719776	NA	Mandibular gland (Apis mellifera, 1-7 weeks, female, SAMN02719776)	250,245,302	72%	18%	81,801
SAMN02719777	NA	Antennae (Apis mellifera, 1-7 weeks, female, SAMN02719777)	205,659,392	58%	20%	90,005
SAMN02719778	NA	Nasonov gland (Apis mellifera, 1-7 weeks, female, SAMN02719778)	199,522,044	70%	20%	82,478
SAMN02727986	NA	Midgut (Apis mellifera, 1-7 weeks, female, SAMN02727986)	142,395,500	64%	17%	68,942
SAMN02727996	NA	Sting Gland (Apis mellifera, 1-7 weeks, female, SAMN02727996)	47,314,632	73%	19%	78,422
SAMN02728789	NA	Sting Gland (Apis mellifera, 1-7 weeks, female, SAMN02728789)	53,098,508	29%	19%	63,958
SAMN02728790	NA	Midgut (Apis mellifera, 1-7 weeks, female, SAMN02728790)	170,558,138	67%	15%	68,039
SAMN02728791	NA	Antenna (Apis mellifera, 1-7 weeks, female, SAMN02728791)	214,670,402	62%	20%	90,693
SAMN02728792	NA	Nasonov gland (Apis mellifera, 1-7 weeks, female, SAMN02728792)	215,789,000	61%	20%	80,976
SAMN02728794	NA	Mandibular gland (Apis mellifera, 1-7 weeks, female, SAMN02728794)	239,415,666	65%	20%	77,923
SAMN02728800	NA	Hypopharyngeal gland (Apis mellifera, 1-7 weeks, female, SAMN02728800)	255,061,692	68%	27%	68,926
SAMN02728801	NA	Malpighian tubule (Apis mellifera, 1-7 weeks, female, SAMN02728801)	274,744,652	69%	19%	81,750
SAMN02728802	NA	Muscle (Apis mellifera, 1-7 weeks, female, SAMN02728802)	199,042,116	54%	15%	71,731
SAMN02728803	NA	Second Thoracic Ganglia (Apis mellifera, 1-7 weeks, female, SAMN02728803)	203,222,838	72%	17%	90,812
SAMN04244121	NA	Brain (Apis mellifera carnica, female, SAMN04244121)	218,188,846	78%	21%	88,871
SAMN04244122	NA	Brain (Apis mellifera carnica, female, SAMN04244122)	220,797,918	80%	21%	89,903
SAMN04244123	NA	Brain (Apis mellifera carnica, female, SAMN04244123)	215,385,931	81%	21%	89,765
SAMN05199459	25553907,21857981	antennae (Apis cerana, SAMN05199459)	28,727,056	67%	18%	64,036
SAMN05199460	25553907,21857981	brain (Apis cerana, SAMN05199460)	28,067,672	72%	16%	64,658
SAMN05199461	25553907,21857981	fat body (Apis cerana, SAMN05199461)	124,626,606	66%	18%	73,559
SAMN05199462	25553907,21857981	gut (Apis cerana, SAMN05199462)	52,489,846	58%	18%	53,858
SAMN05199463	25553907,21857981	head (Apis cerana, SAMN05199463)	271,752	35%	73%	14,371
SAMN05199464	25553907,21857981	hypopharyngeal gland (Apis cerana, SAMN05199464)	28,143,550	75%	22%	43,737
SAMN05199465	25553907,21857981	venom gland (Apis cerana, SAMN05199465)	175,970,162	69%	18%	74,970
SAMN05199466	25553907,21857981	antennae (Apis mellifera, SAMN05199466)	25,981,706	47%	19%	67,741
SAMN05199467	25553907,21857981	brain (Apis mellifera, SAMN05199467)	27,286,036	69%	22%	70,712
SAMN05199468	25553907,21857981	hypopharyngeal gland (Apis mellifera, SAMN05199468)	27,922,064	76%	31%	46,141
SAMN05225359	NA	antennae (Apis florea, male, SAMN05225359)	53,370,366	83%	28%	73,463
SAMN05225360	NA	antennae (Apis florea, male, SAMN05225360)	52,214,296	85%	28%	73,279
SAMN05225361	NA	antennae (Apis florea, female, SAMN05225361)	65,788,708	84%	26%	75,119
SAMN05225362	NA	antennae (Apis florea, female, SAMN05225362)	44,751,494	85%	26%	72,455
SAMN06192067	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192067)	90,240,988	81%	22%	84,422
SAMN06192068	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192068)	105,607,542	74%	20%	81,899
SAMN06192069	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192069)	87,929,650	83%	21%	82,409
SAMN06192070	29018616	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192070)	148,447,384	52%	19%	82,622
SAMN06192071	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192071)	83,810,010	81%	21%	81,514
SAMN06192072	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192072)	100,160,218	80%	20%	82,438
SAMN06192073	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192073)	128,177,728	82%	20%	85,214
SAMN06192074	29018616	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192074)	98,132,370	81%	21%	82,087
SAMN06192075	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192075)	104,160,918	76%	23%	91,203
SAMN06192076	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192076)	101,848,792	79%	25%	89,464
SAMN06192077	29018616	ovaries (Apis mellifera, virgin queen, SAMN06192077)	99,634,250	79%	23%	90,853
SAMN06192078	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192078)	59,057,944	79%	2%	45,018
SAMN06192079	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192079)	95,494,324	79%	21%	83,715
SAMN06192080	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192080)	60,314,752	88%	2%	45,763
SAMN06192081	29709513	ovaries (Apis mellifera, egg-laying recovered queen, SAMN06192081)	59,045,000	81%	2%	46,426
SAMN06192082	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192082)	56,668,426	88%	3%	54,792
SAMN06192083	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192083)	62,554,784	85%	4%	60,183
SAMN06192084	29709513	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06192084)	59,872,588	87%	3%	56,862
SAMN06192085	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192085)	58,556,824	88%	3%	47,898
SAMN06192086	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192086)	75,071,086	87%	2%	52,704
SAMN06192087	29709513	ovaries (Apis mellifera, normal egg-laying queen, SAMN06192087)	60,368,568	89%	2%	47,626
SAMN06192088	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192088)	55,776,806	87%	1%	40,520
SAMN06192089	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192089)	79,136,518	93%	0%	34,137
SAMN06192090	29709513	ovaries (Apis mellifera, virgin queen, SAMN06192090)	56,359,504	88%	2%	42,152
SAMN06242032	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06242032)	12,480,712	3%	10%	322
SAMN06242033	29018616	ovaries (Apis mellifera, egg-laying inhibited queen, SAMN06242033)	12,569,905	3%	11%	328

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR035881	SRX016658	SRP001899	SAMN00008880	94,264	77%	59%
SRR035924	SRX016658	SRP001899	SAMN00008880	1,263,141	77%	60%
SRR063955	SRX025526	SRP003260	SAMN02117707	396,725	54%	28%
SRR063954	SRX025527	SRP003260	SAMN02117707	418,025	69%	12%
SRR063956	SRX025528	SRP003260	SAMN02117708	622,279	53%	30%
SRR063949	SRX025529	SRP003261	SAMN02138536	1,135,230	68%	55%
SRR063951	SRX025530	SRP003261	SAMN02138537	1,136,885	70%	53%
SRR063947	SRX025531	SRP003261	SAMN02138538	737,245	67%	34%
SRR063948	SRX025531	SRP003261	SAMN02138538	818,994	67%	30%
SRR063957	SRX025532	SRP003261	SAMN02138539	1,317,973	73%	58%
SRR063950	SRX025533	SRP003261	SAMN02138540	1,088,688	78%	60%
SRR071803	SRX030487	SRP003528	SAMN00113341	4,712,843	69%	14%
SRR071804	SRX030487	SRP003528	SAMN00113341	5,199,336	70%	14%
SRR071805	SRX030488	SRP003528	SAMN00120364	9,906,867	71%	14%
SRR071806	SRX030488	SRP003528	SAMN00120364	11,063,546	70%	14%
SRR071807	SRX030488	SRP003528	SAMN00120364	9,710,134	71%	14%
SRR071808	SRX030489	SRP003528	SAMN00120365	8,482,588	70%	14%
SRR071809	SRX030489	SRP003528	SAMN00120365	8,428,927	71%	14%
SRR071810	SRX030490	SRP003528	SAMN00120366	7,831,186	71%	15%
SRR071811	SRX030490	SRP003528	SAMN00120366	6,683,125	69%	15%
SRR071812	SRX030491	SRP003528	SAMN00120367	5,637,292	64%	15%
SRR071813	SRX030491	SRP003528	SAMN00120367	5,814,625	65%	15%
SRR071814	SRX030492	SRP003528	SAMN00120368	7,852,612	68%	14%
SRR071815	SRX030492	SRP003528	SAMN00120368	7,885,506	69%	15%
SRR071816	SRX030493	SRP003528	SAMN00120369	7,877,821	70%	15%
SRR071817	SRX030493	SRP003528	SAMN00120369	6,801,193	70%	15%
SRR071818	SRX030494	SRP003528	SAMN00120370	6,739,570	67%	15%
SRR071819	SRX030494	SRP003528	SAMN00120370	7,178,141	69%	15%
SRR071820	SRX030495	SRP003528	SAMN00120371	8,757,199	69%	15%
SRR071821	SRX030495	SRP003528	SAMN00120371	7,926,470	69%	15%
SRR071822	SRX030495	SRP003528	SAMN00120371	8,757,278	68%	15%
SRR071823	SRX030496	SRP003528	SAMN00120372	10,333,866	68%	14%
SRR071824	SRX030496	SRP003528	SAMN00120372	9,136,093	69%	14%
SRR071825	SRX030496	SRP003528	SAMN00120372	9,097,017	69%	14%
SRR098291	SRX040731	SRP005626	SAMN00205597	269,742	72%	24%
SRR098292	SRX040731	SRP005626	SAMN00205597	61,301	69%	20%
SRR361851	SRX104370	SRP009193	SAMN00750149	469,162	64%	43%
SRR498622	SRX148699	SRP013261	SAMN00996631	38,230,006	75%	20%
SRR499808	SRX148699	SRP013261	SAMN00996631	40,959,240	76%	20%
SRR499882	SRX148699	SRP013261	SAMN00996631	32,336,648	74%	19%
SRR499883	SRX148699	SRP013261	SAMN00996631	37,025,856	77%	20%
SRR499919	SRX148699	SRP013261	SAMN00996631	36,316,808	77%	20%
SRR499920	SRX148699	SRP013261	SAMN00996631	37,847,774	75%	20%
SRR499992	SRX148699	SRP013261	SAMN00996631	38,449,538	77%	20%
SRR499993	SRX148699	SRP013261	SAMN00996631	36,704,784	75%	20%
SRR499994	SRX148699	SRP013261	SAMN00996631	47,488,940	23%	21%
SRR499995	SRX148699	SRP013261	SAMN00996631	34,686,976	77%	20%
SRR789759	SRX253221	SRP019928	SAMN01985568	175,796,462	81%	21%
SRR789760	SRX253222	SRP019928	SAMN01985569	168,267,063	82%	22%
SRR789761	SRX253223	SRP019928	SAMN01985570	178,934,269	81%	22%
SRR789762	SRX253224	SRP019928	SAMN01985571	189,487,841	81%	21%
SRR789763	SRX253225	SRP019928	SAMN01985572	162,754,857	81%	21%
SRR789764	SRX253226	SRP019928	SAMN01985573	153,692,102	81%	22%
SRR2954344	SRX1420589	SRP019928	SAMN04244121	218,188,846	78%	21%
SRR2954345	SRX1420590	SRP019928	SAMN04244122	220,797,918	80%	21%
SRR2954346	SRX1420591	SRP019928	SAMN04244123	215,385,931	81%	21%
SRR1255541	SRX518065	SRP041189	SAMN02717013	68,678,890	74%	17%
SRR1255542	SRX518066	SRP041189	SAMN02717013	73,889,574	74%	17%
SRR1255543	SRX518067	SRP041189	SAMN02717013	69,568,686	74%	17%
SRR1255064	SRX518062	SRP041189	SAMN02719773	104,850,402	72%	22%
SRR1255065	SRX519341	SRP041189	SAMN02719773	68,997,582	79%	21%
SRR1255066	SRX519342	SRP041189	SAMN02719773	60,957,858	79%	21%
SRR1254954	SRX518060	SRP041189	SAMN02719774	102,186,182	75%	20%
SRR1254956	SRX518724	SRP041189	SAMN02719774	95,800,088	76%	19%
SRR1254957	SRX518725	SRP041189	SAMN02719774	78,254,338	70%	20%
SRR1254946	SRX518059	SRP041189	SAMN02719775	84,696,498	72%	28%
SRR1254947	SRX518716	SRP041189	SAMN02719775	71,604,084	70%	29%
SRR1254948	SRX518717	SRP041189	SAMN02719775	100,771,288	72%	28%
SRR1255009	SRX518061	SRP041189	SAMN02719776	56,342,908	73%	18%
SRR1255010	SRX519335	SRP041189	SAMN02719776	80,527,564	73%	17%
SRR1255011	SRX519336	SRP041189	SAMN02719776	113,374,830	70%	18%
SRR1239302	SRX518058	SRP041189	SAMN02719777	64,110,330	37%	21%
SRR1239303	SRX518341	SRP041189	SAMN02719777	73,361,990	69%	20%
SRR1239304	SRX518342	SRP041189	SAMN02719777	68,187,072	67%	20%
SRR1255151	SRX518063	SRP041189	SAMN02719778	59,131,754	73%	21%
SRR1255152	SRX519410	SRP041189	SAMN02719778	68,500,084	69%	20%
SRR1255153	SRX519423	SRP041189	SAMN02719778	71,890,206	69%	20%
SRR1239308	SRX517566	SRP041189	SAMN02727986	42,519,996	64%	17%
SRR1239309	SRX518336	SRP041189	SAMN02727986	45,329,928	68%	16%
SRR1239310	SRX518337	SRP041189	SAMN02727986	54,545,576	60%	18%
SRR1269199	SRX518064	SRP041189	SAMN02727996	47,314,632	73%	19%
SRR1255456	SRX519429	SRP041189	SAMN02728789	53,098,508	29%	19%
SRR1239311	SRX518338	SRP041189	SAMN02728790	51,534,456	74%	9%
SRR1239312	SRX518339	SRP041189	SAMN02728790	43,564,950	65%	16%
SRR1239313	SRX518340	SRP041189	SAMN02728790	75,458,732	64%	19%
SRR1239305	SRX518343	SRP041189	SAMN02728791	68,058,182	69%	21%
SRR1239306	SRX518344	SRP041189	SAMN02728791	67,841,628	69%	20%
SRR1239307	SRX518345	SRP041189	SAMN02728791	78,770,592	50%	20%
SRR1255154	SRX519424	SRP041189	SAMN02728792	58,745,720	46%	19%
SRR1255260	SRX519425	SRP041189	SAMN02728792	86,045,054	71%	20%
SRR1255326	SRX519427	SRP041189	SAMN02728792	70,998,226	61%	20%
SRR1255012	SRX519338	SRP041189	SAMN02728794	59,049,388	64%	19%
SRR1255013	SRX519339	SRP041189	SAMN02728794	63,975,208	71%	20%
SRR1255014	SRX519340	SRP041189	SAMN02728794	116,391,070	63%	20%
SRR1254950	SRX518718	SRP041189	SAMN02728800	68,152,256	65%	27%
SRR1254951	SRX518719	SRP041189	SAMN02728800	99,894,982	67%	27%
SRR1254952	SRX518720	SRP041189	SAMN02728800	87,014,454	72%	28%
SRR1254958	SRX518721	SRP041189	SAMN02728801	100,449,442	67%	19%
SRR1254959	SRX518722	SRP041189	SAMN02728801	88,821,448	74%	19%
SRR1254960	SRX518723	SRP041189	SAMN02728801	85,473,762	65%	20%
SRR1255068	SRX519343	SRP041189	SAMN02728802	72,782,026	48%	15%
SRR1255149	SRX519344	SRP041189	SAMN02728802	68,118,350	66%	14%
SRR1255150	SRX519345	SRP041189	SAMN02728802	58,141,740	49%	15%
SRR1255544	SRX519434	SRP041189	SAMN02728803	65,110,644	75%	18%
SRR1255545	SRX519436	SRP041189	SAMN02728803	74,756,862	74%	17%
SRR1255546	SRX519438	SRP041189	SAMN02728803	63,355,332	67%	17%
SRR1653580	SRX661188	SRP043101	SAMN02592885	77,898,496	26%	21%
SRR1653592	SRX661188	SRP043101	SAMN02592885	49,363,940	79%	22%
SRR1653605	SRX661188	SRP043101	SAMN02592885	50,927,126	75%	21%
SRR1380976	SRX747554	SRP043101	SAMN05199459	28,727,056	67%	18%
SRR1380970	SRX589473	SRP043101	SAMN05199460	28,067,672	72%	16%
SRR1388774	SRX747558	SRP043101	SAMN05199461	124,626,606	66%	18%
SRR1380984	SRX747556	SRP043101	SAMN05199462	52,489,846	58%	18%
SRR1408614	SRX602814	SRP043101	SAMN05199463	271,752	35%	73%
SRR1380979	SRX747555	SRP043101	SAMN05199464	28,143,550	75%	22%
SRR1406762	SRX747559	SRP043101	SAMN05199465	175,970,162	69%	18%
SRR1408090	SRX747561	SRP043101	SAMN05199466	25,981,706	47%	19%
SRR1386316	SRX747557	SRP043101	SAMN05199467	27,286,036	69%	22%
SRR1407793	SRX747560	SRP043101	SAMN05199468	27,922,064	76%	31%
SRR3657423	SRX1837096	SRP076380	SAMN05225359	53,370,366	83%	28%
SRR3657424	SRX1837097	SRP076380	SAMN05225360	52,214,296	85%	28%
SRR3657425	SRX1837098	SRP076380	SAMN05225361	65,788,708	84%	26%
SRR3657426	SRX1837099	SRP076380	SAMN05225362	44,751,494	85%	26%
SRR5136456	SRX2452410	SRP095846	SAMN06192067	90,240,988	81%	22%
SRR5136457	SRX2452411	SRP095846	SAMN06192068	105,607,542	74%	20%
SRR5136458	SRX2452412	SRP095846	SAMN06192069	87,929,650	83%	21%
SRR5136459	SRX2452413	SRP095846	SAMN06192070	148,447,384	52%	19%
SRR5136455	SRX2452409	SRP095846	SAMN06192071	83,810,010	81%	21%
SRR5136453	SRX2452407	SRP095846	SAMN06192072	100,160,218	80%	20%
SRR5136452	SRX2452406	SRP095846	SAMN06192073	128,177,728	82%	20%
SRR5136451	SRX2452405	SRP095846	SAMN06192074	98,132,370	81%	21%
SRR5136450	SRX2452404	SRP095846	SAMN06192075	104,160,918	76%	23%
SRR5136449	SRX2452403	SRP095846	SAMN06192076	101,848,792	79%	25%
SRR5136448	SRX2452402	SRP095846	SAMN06192077	99,634,250	79%	23%
SRR5136454	SRX2452408	SRP095846	SAMN06192079	95,494,324	79%	21%
SRR5188648	SRX2504569	SRP095846	SAMN06242032	12,480,712	3%	10%
SRR5188647	SRX2504568	SRP095846	SAMN06242033	12,569,905	3%	11%
SRR6047317	SRX3194368	SRP117804	SAMN06192078	59,057,944	79%	2%
SRR6047316	SRX3194367	SRP117804	SAMN06192080	60,314,752	88%	2%
SRR6047315	SRX3194366	SRP117804	SAMN06192081	59,045,000	81%	2%
SRR6047314	SRX3194365	SRP117804	SAMN06192082	56,668,426	88%	3%
SRR6047313	SRX3194364	SRP117804	SAMN06192083	62,554,784	85%	4%
SRR6047312	SRX3194363	SRP117804	SAMN06192084	59,872,588	87%	3%
SRR6047311	SRX3194362	SRP117804	SAMN06192085	58,556,824	88%	3%
SRR6047310	SRX3194361	SRP117804	SAMN06192086	75,071,086	87%	2%
SRR6047309	SRX3194360	SRP117804	SAMN06192087	60,368,568	89%	2%
SRR6047308	SRX3194359	SRP117804	SAMN06192088	55,776,806	87%	1%
SRR6047307	SRX3194358	SRP117804	SAMN06192089	79,136,518	93%	0%
SRR6047306	SRX3194357	SRP117804	SAMN06192090	56,359,504	88%	2%

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Insecta GenBank	103,472	74,630 (72.13%)	74,630 (72.13%)	66.09%	63.38%
Insecta known RefSeq (NP_)	38,941	27,135 (69.68%)	27,135 (69.68%)	63.98%	56.53%
Apis mellifera high-quality model RefSeq (XP_)	8,880	8,817 (99.29%)	8,817 (99.29%)	91.51%	94.61%
Same-species GenBank	4	4 (100.00%)	4 (100.00%)	89.85%	97.85%
Same-species known RefSeq (NP_)	5	5 (100.00%)	5 (100.00%)	79.00%	94.58%

Comparison of the current and previous annotations

The annotation produced for this release (102) was compared to the annotation in the previous release (101) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Aflo_1.1 (Current) to Aflo_1.1 (Previous)
Identical	5%
Minor changes	67%
Major changes	11%
New	13%
Deprecated	8%
Other	3%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
