NCBI Pan paniscus Annotation Release 104

The RefSeq genome records for Pan paniscus were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: How well the current and previous assemblies align to each other
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Pan paniscus Annotation Release 104.

Annotation release ID: 104
Date of Entrez queries for transcripts and proteins: May 19 2020
Date of submission of annotation to the public databases: Jun 5 2020
Software version: 8.4

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
Mhudiblu_PPA_v0	GCF_013052645.1	University of Washington	05-15-2020	Reference	25 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and length of annotated features are provided below for each assembly.

Feature counts

Feature	Mhudiblu_PPA_v0
Genes and pseudogenes	38,313
  protein-coding	22,366
  non-coding	9,066
  transcribed pseudogenes	158
  non-transcribed pseudogenes	6,578
  genes with variants	13,991
  immunoglobulin/T-cell receptor gene segments	145
  other	0
mRNAs	70,833
  fully-supported	69,964
  with > 5% ab initio	531
  partial	134
  with filled gap(s)	0
  known RefSeq (NM_)	48
  model RefSeq (XM_)	70,785
non-coding RNAs	14,488
  fully-supported	11,330
  with > 5% ab initio	0
  partial	0
  with filled gap(s)	0
  known RefSeq (NR_)	166
  model RefSeq (XR_)	13,870
pseudo transcripts	165
  fully-supported	134
  with > 5% ab initio	0
  partial	0
  with filled gap(s)	0
  known RefSeq (NR_)	3
  model RefSeq (XR_)	162
CDSs	70,991
  fully-supported	69,964
  with > 5% ab initio	614
  partial	139
  with major correction(s)	974
  known RefSeq (NP_)	61
  model RefSeq (XP_)	70,785
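
As a quick consistency check on the counts above, the gene categories other than "genes with variants" appear to add up to the "Genes and pseudogenes" total, and the known and model mRNA counts add up to the mRNA total. The sketch below hard-codes the numbers from the table; the variable names are illustrative, and treating these categories as disjoint is an assumption inferred from the arithmetic rather than something the report states.

```python
# Counts copied from the Feature counts table for Mhudiblu_PPA_v0.
gene_categories = {
    "protein-coding": 22_366,
    "non-coding": 9_066,
    "transcribed pseudogenes": 158,
    "non-transcribed pseudogenes": 6_578,
    "immunoglobulin/T-cell receptor gene segments": 145,
    "other": 0,
}
reported_gene_total = 38_313  # "Genes and pseudogenes"

# "genes with variants" (13,991) is left out: it is assumed to overlap
# the categories above rather than form a separate class.
gene_sum = sum(gene_categories.values())
print("gene categories sum:", gene_sum, gene_sum == reported_gene_total)

# Known (NM_) plus model (XM_) mRNAs versus the reported mRNA total.
mrna_sum = 48 + 70_785
print("mRNA sum:", mrna_sum, mrna_sum == 70_833)
```
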

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	31,432	46,604	13,385	48	2,610,200
All transcripts	85,321	3,320	2,688	18	105,255
  mRNA	70,833	3,669	3,002	153	105,255
  misc_RNA	3,547	2,792	2,246	101	16,305
  miRNA	170	22	22	18	25
  tRNA	446	74	73	59	84
  lncRNA	7,625	1,728	1,268	80	23,387
  snoRNA	1,129	111	104	55	330
  snRNA	1,508	109	107	59	199
  guide_RNA	45	156	136	83	420
  rRNA	18	352	119	119	1,869
Single-exon transcripts	2,514	1,684	1,236	153	12,616
  coding transcripts (NM_/XM_)	2,509	1,685	1,239	153	12,616
  non-coding transcripts (NR_/XR_)	5	1,066	612	359	2,747
CDSs	70,846	1,931	1,434	75	103,878
Exons	292,285	378	146	1	24,900
  in coding transcripts (NM_/XM_)	265,180	366	144	1	24,900
  in non-coding transcripts (NR_/XR_)	43,383	388	147	2	14,299
Introns	252,740	7,936	1,812	30	1,168,934
  in coding transcripts (NM_/XM_)	233,145	7,565	1,756	30	1,168,934
  in non-coding transcripts (NR_/XR_)	35,301	9,501	2,081	31	609,877

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	2.74	1	1	50
Number of exons per transcript	11.27	8	1	322

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 22,353 coding genes, 21,387 genes had a protein with an alignment covering 50% or more of the query and 18,961 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
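
The coverage definitions above reduce to a simple ratio. The following is a minimal sketch, not the pipeline's own code: the function name `coverage_percent` and the example alignment lengths are hypothetical, while the gene counts are the ones quoted in this section.

```python
def coverage_percent(aligned_length: int, sequence_length: int) -> float:
    """Percentage of a sequence's length that is included in an alignment."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * aligned_length / sequence_length


# Hypothetical alignment: 380 residues of a 400-residue annotated protein
# (query) aligned to 380 residues of a 500-residue Swiss-Prot protein (target).
# Real alignments can span different lengths on query and target because of gaps.
query_cov = coverage_percent(380, 400)
target_cov = coverage_percent(380, 500)
print(f"query coverage {query_cov:.1f}%, target coverage {target_cov:.1f}%")

# Share of coding genes reaching each query-coverage threshold (counts from the text).
coding_genes = 22_353
share_50 = 100.0 * 21_387 / coding_genes
share_95 = 100.0 * 18_961 / coding_genes
print(f"{share_50:.1f}% of genes at >=50%, {share_95:.1f}% at >=95% query coverage")
```

In this made-up case the same alignment gives a high query coverage but a lower target coverage, which is why the two measures are reported separately.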

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker for each assembly. RepeatMasker results are only used for organisms for which a comprehensive repeat library is available.

For this annotation run, transcripts and proteins were aligned to the genome masked with RepeatMasker only.

Assembly name	Assembly accession	% Masked with RepeatMasker	% Masked with WindowMasker
Mhudiblu_PPA_v0	GCF_013052645.1	45.85%	39.73%
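
For readers who want to estimate a masked percentage themselves, one common approach is to count lowercase bases in a soft-masked FASTA file. The sketch below assumes soft-masking (masked bases in lowercase) and counts every sequence character, including N, in the denominator; the report does not state how its percentages were computed, so both choices are assumptions, and `masked_percent` is an illustrative name.

```python
from typing import Iterable


def masked_percent(fasta_lines: Iterable[str]) -> float:
    """Percentage of bases that are lowercase (soft-masked) in FASTA text."""
    masked = 0
    total = 0
    for line in fasta_lines:
        if line.startswith(">"):  # skip header lines
            continue
        seq = line.strip()
        total += len(seq)
        masked += sum(1 for base in seq if base.islower())
    if total == 0:
        raise ValueError("no sequence found")
    return 100.0 * masked / total


# Tiny made-up example, not real bonobo sequence.
example = [">chr_demo", "ACGTacgtAC", "ggttNNACGT"]
print(f"{masked_percent(example):.2f}% masked")
```

For a real assembly the same function can be fed an open file handle, since it only iterates over lines.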

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Depending on the other evidence available, long 454 reads (with average length above 250 nt) may be aligned as traditional evidence and reported in the Transcript alignments section or aligned with RNA-Seq reads and reported in the RNA-Seq alignments section.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species known RefSeq (NM_/NR_)	218	218 (100.00%)	217 (99.54%)	99.96%	100.00%
Same-species GenBank	191	171 (89.53%)	162 (84.82%)	99.43%	99.72%
Homo sapiens known RefSeq (NM_/NR_)	74,670	74,173 (99.33%)	64,781 (86.76%)	98.90%	99.33%
Homo sapiens GenBank	322,433	276,363 (85.71%)	194,899 (60.45%)	97.08%	92.92%
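
The percentages in parentheses in the table above are relative to the number of sequences retrieved from Entrez. A short sketch that recomputes them from the counts follows; the `rows` list and the `percent` helper are illustrative names, and the counts are copied from the table.

```python
# (source, retrieved from Entrez, aligned by Splign, passed to Gnomon)
rows = [
    ("Same-species known RefSeq (NM_/NR_)", 218, 218, 217),
    ("Same-species GenBank", 191, 171, 162),
    ("Homo sapiens known RefSeq (NM_/NR_)", 74_670, 74_173, 64_781),
    ("Homo sapiens GenBank", 322_433, 276_363, 194_899),
]


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    return round(100.0 * part / whole, 2)


for source, retrieved, aligned, passed in rows:
    print(f"{source}: aligned {percent(aligned, retrieved)}%, "
          f"passed to Gnomon {percent(passed, retrieved)}%")
```
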

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	4,928,038,220	67%	21%	416,989
SAMEA5858368	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858368)	31,005,746	49%	26%	171,256
SAMEA5858369	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858369)	30,117,640	59%	27%	177,589
SAMEA5858370	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858370)	35,294,986	57%	26%	178,439
SAMEA5858371	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858371)	33,354,856	55%	26%	176,338
SAMEA5858372	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858372)	34,463,786	56%	28%	179,130
SAMEA5858373	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858373)	33,140,374	55%	28%	177,579
SAMEA5858374	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858374)	36,317,322	57%	28%	181,565
SAMEA5858375	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858375)	29,489,318	57%	28%	173,575
SAMEA5858376	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858376)	32,149,328	55%	28%	174,288
SAMEA5858377	NA	prefrontal cortex (Pan paniscus, 15, male, SAMEA5858377)	29,288,332	58%	27%	174,379
SAMN00632220	22012392	Brain, prefrontal cortex (Pan paniscus, Female, SAMN00632220)	34,332,540	74%	21%	177,751
SAMN00632221	22012392	Brain, prefrontal cortex (Pan paniscus, Female, SAMN00632221)	24,777,783	74%	9%	146,968
SAMN00632222	22012392	Brain, prefrontal cortex (Pan paniscus, Male, SAMN00632222)	38,196,822	70%	12%	171,316
SAMN00632223	22012392	Cerebellum (Pan paniscus, Female, SAMN00632223)	30,345,120	69%	9%	151,662
SAMN00632224	22012392	Cerebellum (Pan paniscus, Male, SAMN00632224)	34,467,310	60%	12%	161,701
SAMN00632225	22012392	Heart (Pan paniscus, Female, SAMN00632225)	29,650,645	69%	13%	133,539
SAMN00632226	22012392	Heart (Pan paniscus, Male, SAMN00632226)	26,025,889	76%	6%	120,633
SAMN00632227	22012392	Kidney (Pan paniscus, Female, SAMN00632227)	30,139,364	69%	10%	147,760
SAMN00632228	22012392	Kidney (Pan paniscus, Male, SAMN00632228)	25,901,079	76%	9%	142,657
SAMN00632229	22012392	Liver (Pan paniscus, Female, SAMN00632229)	28,491,592	71%	15%	117,300
SAMN00632230	22012392	Liver (Pan paniscus, Male, SAMN00632230)	20,161,205	65%	13%	107,474
SAMN00632231	22012392	Testis (Pan paniscus, Male, SAMN00632231)	16,151,902	59%	18%	151,133
SAMN02189949	24153179,27487209,28430982	iPSC, (Pan paniscus, SAMN02189949)	30,297,954	89%	30%	160,010
SAMN02189950	24153179,27487209,28430982	iPSC, (Pan paniscus, SAMN02189950)	36,053,276	89%	31%	170,815
SAMN02189951	24153179,27487209,28430982	iPSC, (Pan paniscus, SAMN02189951)	145,284,828	90%	31%	215,567
SAMN02189952	24153179,27487209,28430982	iPSC, (Pan paniscus, SAMN02189952)	27,397,776	86%	30%	156,643
SAMN02353876	24631741	iPS, (Pan paniscus, SAMN02353876)	9,747,304	78%	14%	116,947
SAMN02353898	24631741	iPS, (Pan paniscus, SAMN02353898)	8,495,653	74%	14%	109,122
SAMN04546399	NA	placenta (Pan paniscus, SAMN04546399)	87,082,582	31%	26%	122,981
SAMN07814255	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814255)	20,944,429	89%	18%	110,545
SAMN07814256	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814256)	26,127,878	90%	18%	113,119
SAMN07814257	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814257)	20,528,640	90%	19%	104,871
SAMN07814258	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814258)	20,027,099	91%	19%	104,695
SAMN07814259	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814259)	19,296,079	91%	19%	104,393
SAMN07814260	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814260)	16,178,247	90%	18%	98,161
SAMN07814261	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814261)	19,638,029	90%	19%	106,432
SAMN07814262	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814262)	18,277,423	90%	18%	103,138
SAMN07814263	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814263)	21,454,733	90%	18%	109,402
SAMN07814264	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814264)	20,573,560	90%	18%	110,300
SAMN07814265	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814265)	22,522,956	90%	18%	112,853
SAMN07814266	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814266)	17,930,444	87%	16%	99,843
SAMN07814267	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814267)	21,552,968	88%	19%	112,948
SAMN07814268	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814268)	23,188,217	85%	18%	45,950
SAMN07814444	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814444)	31,217,802	92%	21%	73,130
SAMN07814445	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814445)	20,288,657	90%	18%	106,263
SAMN07814446	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814446)	22,854,851	90%	19%	111,808
SAMN07814447	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814447)	26,374,761	90%	18%	114,298
SAMN07814475	31649152	primary dermal fibroblasts, (Pan paniscus, SAMN07814475)	4,483,731	60%	18%	68,892
SAMN11165805	NA	BA28: 1 Supramarginal (BA40) (Pan paniscus, SAMN11165805)	38,621,270	71%	25%	179,837
SAMN11165806	NA	BC22: 30 Cerebellar Grey Matter (Pan paniscus, SAMN11165806)	31,021,648	55%	14%	138,753
SAMN11165807	NA	BB22: 30 Cerebellar Grey Matter (Pan paniscus, SAMN11165807)	33,122,852	63%	23%	163,138
SAMN11165808	NA	BA22: 30 Cerebellar Grey Matter (Pan paniscus, SAMN11165808)	36,865,746	69%	28%	183,610
SAMN11165809	NA	BC20: 28 Cerebellar White Matter (Pan paniscus, SAMN11165809)	32,704,030	56%	16%	128,813
SAMN11165810	NA	BB20: 28 Cerebellar White Matter (Pan paniscus, SAMN11165810)	28,975,830	56%	16%	135,636
SAMN11165811	NA	BA20: 28 Cerebellar White Matter (Pan paniscus, SAMN11165811)	36,108,548	68%	23%	163,602
SAMN11165812	NA	BC18: 22 Hypothalamus (Pan paniscus, SAMN11165812)	37,921,460	59%	15%	142,891
SAMN11165813	NA	BB18: 22 Hypothalamus (Pan paniscus, SAMN11165813)	30,626,882	54%	27%	170,954
SAMN11165814	NA	BA18: 22 Hypothalamus (Pan paniscus, SAMN11165814)	42,344,838	67%	21%	185,840
SAMN11165815	NA	BC17: 23 Thalamus (Pan paniscus, SAMN11165815)	31,020,684	57%	15%	137,993
SAMN11165816	NA	BB17: 23 Thalamus (Pan paniscus, SAMN11165816)	39,380,074	71%	27%	176,603
SAMN11165817	NA	BA17: 23 Thalamus (Pan paniscus, SAMN11165817)	33,048,394	68%	22%	170,320
SAMN11165818	NA	BC16: 25 Corpus Callosum Posterior (Pan paniscus, SAMN11165818)	34,309,100	54%	19%	140,729
SAMN11165819	NA	BB16: 25 Corpus Callosum Posterior (Pan paniscus, SAMN11165819)	42,593,130	54%	17%	132,568
SAMN11165820	NA	BA16: 25 Corpus Callosum Posterior (Pan paniscus, SAMN11165820)	27,547,936	53%	15%	126,008
SAMN11165821	NA	BC15: 24 Corpus Callosum Anterior (Pan paniscus, SAMN11165821)	36,029,820	51%	15%	148,693
SAMN11165822	NA	BB15: 24 Corpus Callosum Anterior (Pan paniscus, SAMN11165822)	32,282,048	61%	23%	154,554
SAMN11165823	NA	BA15: 24 Corpus Callosum Anterior (Pan paniscus, SAMN11165823)	48,459,588	72%	23%	192,166
SAMN11165824	NA	BC14: 14 Inferior Temporal (BA20) (Pan paniscus, SAMN11165824)	38,775,898	64%	10%	129,183
SAMN11165825	NA	BB14: 14 Inferior Temporal (BA20) (Pan paniscus, SAMN11165825)	27,476,696	67%	28%	161,427
SAMN11165826	NA	BA14: 14 Inferior Temporal (BA20) (Pan paniscus, SAMN11165826)	29,131,670	65%	22%	161,083
SAMN11165827	NA	BC12: 2 2ary Auditory (BA22) (Pan paniscus, SAMN11165827)	31,929,280	63%	10%	129,751
SAMN11165828	NA	BB12: 2 2ary Auditory (BA22) (Pan paniscus, SAMN11165828)	32,996,960	68%	25%	168,129
SAMN11165829	NA	BA12: 2 2ary Auditory (BA22) (Pan paniscus, SAMN11165829)	94,272,626	67%	24%	203,430
SAMN11165830	NA	BC11: 18 Ventrolateral Prefrontal (BA44) (Pan paniscus, SAMN11165830)	34,294,306	53%	11%	125,575
SAMN11165831	NA	BB11: 18 Ventrolateral Prefrontal (BA44) (Pan paniscus, SAMN11165831)	36,007,136	61%	25%	160,838
SAMN11165832	NA	BA11: 18 Ventrolateral Prefrontal (BA44) (Pan paniscus, SAMN11165832)	34,931,774	69%	26%	177,996
SAMN11165910	NA	BB4: 12 Cingulate Anterior (BA32) (Pan paniscus, SAMN11165910)	35,445,860	63%	22%	163,369
SAMN11165911	NA	BA4: 12 Cingulate Anterior (BA32) (Pan paniscus, SAMN11165911)	35,457,004	72%	26%	181,029
SAMN11165912	NA	BC3: 17 Orbitofrontal (BA11) (Pan paniscus, SAMN11165912)	34,525,428	67%	13%	139,454
SAMN11165913	NA	BB3: 17 Orbitofrontal (BA11) (Pan paniscus, SAMN11165913)	32,092,890	67%	25%	173,437
SAMN11165914	NA	BA3: 17 Orbitofrontal (BA11) (Pan paniscus, SAMN11165914)	33,829,448	72%	25%	177,878
SAMN11165915	NA	BC2: 16 Dorsolateral Prefrontal (BA9) (Pan paniscus, SAMN11165915)	39,231,050	54%	13%	137,343
SAMN11165916	NA	BB2: 16 Dorsolateral Prefrontal (BA9) (Pan paniscus, SAMN11165916)	28,394,546	62%	23%	150,601
SAMN11165917	NA	BA2: 16 Dorsolateral Prefrontal (BA9) (Pan paniscus, SAMN11165917)	39,336,866	69%	27%	182,890
SAMN11165918	NA	BC1: 13 Prefrontal (BA10) (Pan paniscus, SAMN11165918)	29,974,768	66%	15%	139,151
SAMN11165919	NA	BB1: 13 Prefrontal (BA10) (Pan paniscus, SAMN11165919)	39,871,132	61%	17%	162,137
SAMN11165920	NA	BA1: 13 Prefrontal (BA10) (Pan paniscus, SAMN11165920)	38,212,944	70%	26%	187,757
SAMN11165921	NA	BC10: 10 2ary Visual (BA18/19) (Pan paniscus, SAMN11165921)	40,896,792	62%	7%	118,536
SAMN11165922	NA	BB10: 10 2ary Visual (BA18/19) (Pan paniscus, SAMN11165922)	32,703,660	69%	25%	164,711
SAMN11165923	NA	BA10: 10 2ary Visual (BA18/19) (Pan paniscus, SAMN11165923)	37,210,998	71%	25%	175,227
SAMN11165924	NA	BC9: 9 1ary Visual (BA17) (Pan paniscus, SAMN11165924)	31,228,482	61%	9%	116,352
SAMN11165925	NA	BB9: 9 1ary Visual (BA17) (Pan paniscus, SAMN11165925)	27,141,614	66%	25%	162,009
SAMN11165926	NA	BA9: 9 1ary Visual (BA17) (Pan paniscus, SAMN11165926)	37,499,828	69%	24%	174,575
SAMN11165927	NA	BC8: 4 Cingulate Posterior (BA31) (Pan paniscus, SAMN11165927)	42,349,424	65%	10%	142,617
SAMN11165928	NA	BB8: 4 Cingulate Posterior (BA31) (Pan paniscus, SAMN11165928)	35,741,454	67%	26%	169,282
SAMN11165929	NA	BA8: 4 Cingulate Posterior (BA31) (Pan paniscus, SAMN11165929)	31,403,600	63%	21%	152,151
SAMN11165930	NA	BC7: 3 Precuneus (BA7) (Pan paniscus, SAMN11165930)	35,997,994	49%	7%	107,569
SAMN11165931	NA	BB7: 3 Precuneus (BA7) (Pan paniscus, SAMN11165931)	43,109,516	65%	24%	179,391
SAMN11165932	NA	BA7: 3 Precuneus (BA7) (Pan paniscus, SAMN11165932)	36,294,670	72%	25%	177,115
SAMN11165933	NA	BC6: 5 Premotor (BA6) (Pan paniscus, SAMN11165933)	30,570,898	51%	10%	117,396
SAMN11165934	NA	BB6: 5 Premotor (BA6) (Pan paniscus, SAMN11165934)	28,597,274	58%	20%	112,755
SAMN11165935	NA	BA6: 5 Premotor (BA6) (Pan paniscus, SAMN11165935)	36,040,088	73%	24%	177,082
SAMN11165936	NA	BC5: 11 Cingulate Anterior (BA24) (Pan paniscus, SAMN11165936)	37,250,598	68%	12%	144,011
SAMN11165937	NA	BB5: 11 Cingulate Anterior (BA24) (Pan paniscus, SAMN11165937)	31,598,080	69%	27%	162,494
SAMN11165938	NA	BA5: 11 Cingulate Anterior (BA24) (Pan paniscus, SAMN11165938)	32,314,680	67%	25%	172,584
SAMN11165939	NA	BC4: 12 Cingulate Anterior (BA32) (Pan paniscus, SAMN11165939)	34,885,490	48%	14%	123,550
SAMN11165940	NA	BC28: 1 Supramarginal (BA40) (Pan paniscus, SAMN11165940)	34,196,784	65%	11%	137,487
SAMN11165941	NA	BB28: 1 Supramarginal (BA40) (Pan paniscus, SAMN11165941)	37,275,136	72%	28%	178,103
SAMN11165942	NA	BC29: 8 1ary Motor (BA4) (Pan paniscus, SAMN11165942)	34,656,112	68%	13%	145,159
SAMN11165943	NA	BB29: 8 1ary Motor (BA4) (Pan paniscus, SAMN11165943)	37,804,724	63%	25%	172,228
SAMN11165944	NA	BA29: 8 1ary Motor (BA4) (Pan paniscus, SAMN11165944)	39,064,654	70%	24%	178,638
SAMN11165972	NA	BC41: 33 Nucleus Accumbens (Pan paniscus, SAMN11165972)	36,975,384	60%	12%	133,710
SAMN11165973	NA	BB41: 33 Nucleus Accumbens (Pan paniscus, SAMN11165973)	36,005,418	71%	30%	186,212
SAMN11165974	NA	BA41: 33 Nucleus Accumbens (Pan paniscus, SAMN11165974)	30,274,848	74%	26%	172,438
SAMN11165975	NA	BC40: 27 Globus Pallidus (Pan paniscus, SAMN11165975)	49,065,672	52%	12%	126,799
SAMN11165976	NA	BB40: 27 Globus Pallidus (Pan paniscus, SAMN11165976)	34,222,784	64%	26%	167,267
SAMN11165977	NA	BA40: 27 Globus Pallidus (Pan paniscus, SAMN11165977)	35,633,504	67%	22%	163,322
SAMN11165978	NA	BC39: 26 Internal Capsule (Pan paniscus, SAMN11165978)	42,235,262	55%	16%	146,557
SAMN11165979	NA	BB39: 26 Internal Capsule (Pan paniscus, SAMN11165979)	34,888,660	65%	29%	163,871
SAMN11165980	NA	BA39: 26 Internal Capsule (Pan paniscus, SAMN11165980)	36,833,390	64%	21%	149,203
SAMN11165981	NA	BC38: 31 Putamen (Pan paniscus, SAMN11165981)	36,234,122	54%	10%	124,972
SAMN11165982	NA	BB38: 31 Putamen (Pan paniscus, SAMN11165982)	39,513,378	65%	26%	174,557
SAMN11165983	NA	BA38: 31 Putamen (Pan paniscus, SAMN11165983)	34,102,552	71%	23%	166,552
SAMN11165984	NA	BC37: 32 Caudate (Pan paniscus, SAMN11165984)	33,006,208	57%	10%	128,750
SAMN11165985	NA	BB37: 32 Caudate (Pan paniscus, SAMN11165985)	29,011,398	70%	27%	175,767
SAMN11165986	NA	BA37: 32 Caudate (Pan paniscus, SAMN11165986)	32,276,942	70%	22%	163,578
SAMN11165987	NA	BC36: 20 Entorhinal Cortex (Pan paniscus, SAMN11165987)	42,395,038	51%	9%	115,960
SAMN11165988	NA	BB36: 20 Entorhinal Cortex (Pan paniscus, SAMN11165988)	35,502,796	70%	29%	171,493
SAMN11165989	NA	BA36: 20 Entorhinal Cortex (Pan paniscus, SAMN11165989)	34,034,130	70%	23%	166,222
SAMN11165990	NA	BC33: 21 Hippocampus (Pan paniscus, SAMN11165990)	36,319,576	51%	9%	115,006
SAMN11165991	NA	BB33: 21 Hippocampus (Pan paniscus, SAMN11165991)	31,855,884	72%	27%	180,801
SAMN11165992	NA	BA33: 21 Hippocampus (Pan paniscus, SAMN11165992)	38,928,648	67%	24%	174,086
SAMN11165993	NA	BC32: 15 Insular Posterior Cortex (Pan paniscus, SAMN11165993)	37,269,032	57%	9%	122,561
SAMN11165994	NA	BB32: 15 Insular Posterior Cortex (Pan paniscus, SAMN11165994)	39,232,794	67%	26%	171,730
SAMN11165995	NA	BA32: 15 Insular Posterior Cortex (Pan paniscus, SAMN11165995)	39,409,078	74%	25%	178,134
SAMN11165996	NA	BC31: 7 1ary Auditory (BA41/42) (Pan paniscus, SAMN11165996)	35,217,380	64%	11%	121,924
SAMN11165997	NA	BB31: 7 1ary Auditory (BA41/42) (Pan paniscus, SAMN11165997)	42,760,922	62%	23%	165,222
SAMN11165998	NA	BA31: 7 1ary Auditory (BA41/42) (Pan paniscus, SAMN11165998)	35,072,304	74%	26%	174,362
SAMN11165999	NA	BC30: 6 1ary Somatosensory (BA3/1/2) (Pan paniscus, SAMN11165999)	40,688,976	66%	12%	151,453
SAMN11166000	NA	BB30: 6 1ary Somatosensory (BA3/1/2) (Pan paniscus, SAMN11166000)	30,985,612	72%	29%	171,155
SAMN11166001	NA	BA30: 6 1ary Somatosensory (BA3/1/2) (Pan paniscus, SAMN11166001)	35,001,002	68%	23%	173,182
SAMN11166027	NA	BC43: 29 Substantia Nigra (Pan paniscus, SAMN11166027)	26,737,420	49%	7%	84,615
SAMN11166028	NA	BA43: 29 Substantia Nigra (Pan paniscus, SAMN11166028)	37,593,362	72%	25%	171,144
SAMN11166029	NA	BC42: 19 Amygdala (Pan paniscus, SAMN11166029)	27,847,582	56%	7%	95,519
SAMN11166030	NA	BB42: 19 Amygdala (Pan paniscus, SAMN11166030)	37,197,124	67%	27%	170,711
SAMN11166031	NA	BA42: 19 Amygdala (Pan paniscus, SAMN11166031)	37,622,512	68%	21%	168,060
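
The per-sample rows above are tab-separated, with thousands separators in the counts and a trailing percent sign on the rates. The following is a minimal parsing sketch under that assumption; `parse_sample_row` and its dictionary keys are hypothetical names, and the example row is copied from the table.

```python
def parse_sample_row(line: str) -> dict:
    """Parse one tab-separated row of the per-sample RNA-Seq table."""
    sample, publication, track, reads, aligned, with_introns, introns = line.split("\t")
    return {
        "sample": sample,
        "publication": None if publication == "NA" else publication,
        "track": track,
        "reads": int(reads.replace(",", "")),
        "percent_aligned": int(aligned.rstrip("%")),
        "percent_with_introns": int(with_introns.rstrip("%")),
        "introns": int(introns.replace(",", "")),
    }


row = parse_sample_row(
    "SAMN04546399\tNA\tplacenta (Pan paniscus, SAMN04546399)\t87,082,582\t31%\t26%\t122,981"
)
# The percentage in the table is rounded to a whole number, so this is an
# estimate of the number of aligned reads rather than an exact count.
estimated_aligned = round(row["reads"] * row["percent_aligned"] / 100)
print(row["sample"], estimated_aligned)
```
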

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
ERR3473997	ERX3495639	ERP116771	SAMEA5858368	31,005,746	49%	26%
ERR3473998	ERX3495640	ERP116771	SAMEA5858369	30,117,640	59%	27%
ERR3473999	ERX3495641	ERP116771	SAMEA5858370	35,294,986	57%	26%
ERR3474000	ERX3495642	ERP116771	SAMEA5858371	33,354,856	55%	26%
ERR3474001	ERX3495643	ERP116771	SAMEA5858372	34,463,786	56%	28%
ERR3474002	ERX3495644	ERP116771	SAMEA5858373	33,140,374	55%	28%
ERR3474003	ERX3495645	ERP116771	SAMEA5858374	36,317,322	57%	28%
ERR3474004	ERX3495646	ERP116771	SAMEA5858375	29,489,318	57%	28%
ERR3474005	ERX3495647	ERP116771	SAMEA5858376	32,149,328	55%	28%
ERR3474006	ERX3495648	ERP116771	SAMEA5858377	29,288,332	58%	27%
SRR306826	SRX081970	SRP007412	SAMN00632220	34,332,540	74%	21%
SRR306827	SRX081971	SRP007412	SAMN00632221	24,777,783	74%	9%
SRR306828	SRX081972	SRP007412	SAMN00632222	38,196,822	70%	12%
SRR306829	SRX081973	SRP007412	SAMN00632223	30,345,120	69%	9%
SRR306830	SRX081974	SRP007412	SAMN00632224	34,467,310	60%	12%
SRR306831	SRX081975	SRP007412	SAMN00632225	29,650,645	69%	13%
SRR306832	SRX081976	SRP007412	SAMN00632226	26,025,889	76%	6%
SRR306833	SRX081977	SRP007412	SAMN00632227	30,139,364	69%	10%
SRR306834	SRX081978	SRP007412	SAMN00632228	25,901,079	76%	9%
SRR306835	SRX081979	SRP007412	SAMN00632229	28,491,592	71%	15%
SRR306836	SRX081980	SRP007412	SAMN00632230	20,161,205	65%	13%
SRR306837	SRX081981	SRP007412	SAMN00632231	16,151,902	59%	18%
SRR873626	SRX290735	SRP023550	SAMN02189949	30,297,954	89%	30%
SRR873627	SRX290736	SRP023550	SAMN02189950	36,053,276	89%	31%
SRR873628	SRX290737	SRP023550	SAMN02189951	145,284,828	90%	31%
SRR873629	SRX290738	SRP023550	SAMN02189952	27,397,776	86%	30%
SRR976175	SRX348495	SRP029888	SAMN02353876	9,747,304	78%	14%
SRR976174	SRX348494	SRP029888	SAMN02353898	8,495,653	74%	14%
SRR3222427	SRX1629202	SRP071663	SAMN04546399	87,082,582	31%	26%
SRR6190112	SRX3300139	SRP120495	SAMN07814255	20,944,429	89%	18%
SRR6190111	SRX3300138	SRP120495	SAMN07814256	26,127,878	90%	18%
SRR6190110	SRX3300137	SRP120495	SAMN07814257	20,528,640	90%	19%
SRR6190109	SRX3300136	SRP120495	SAMN07814258	20,027,099	91%	19%
SRR6190108	SRX3300135	SRP120495	SAMN07814259	19,296,079	91%	19%
SRR6190107	SRX3300134	SRP120495	SAMN07814260	16,178,247	90%	18%
SRR6190106	SRX3300133	SRP120495	SAMN07814261	19,638,029	90%	19%
SRR6190105	SRX3300132	SRP120495	SAMN07814262	18,277,423	90%	18%
SRR6190104	SRX3300131	SRP120495	SAMN07814263	21,454,733	90%	18%
SRR6190103	SRX3300130	SRP120495	SAMN07814264	20,573,560	90%	18%
SRR6190102	SRX3300129	SRP120495	SAMN07814265	22,522,956	90%	18%
SRR6190101	SRX3300128	SRP120495	SAMN07814266	17,930,444	87%	16%
SRR6190100	SRX3300127	SRP120495	SAMN07814267	21,552,968	88%	19%
SRR6190099	SRX3300126	SRP120495	SAMN07814268	23,188,217	85%	18%
SRR6190116	SRX3300143	SRP120495	SAMN07814444	31,217,802	92%	21%
SRR6190115	SRX3300142	SRP120495	SAMN07814445	20,288,657	90%	18%
SRR6190114	SRX3300141	SRP120495	SAMN07814446	22,854,851	90%	19%
SRR6190113	SRX3300140	SRP120495	SAMN07814447	26,374,761	90%	18%
SRR6190198	SRX3300225	SRP120495	SAMN07814475	4,483,731	60%	18%
SRR8750596	SRX5541530	SRP188822	SAMN11165805	38,621,270	71%	25%
SRR8750595	SRX5541529	SRP188822	SAMN11165806	31,021,648	55%	14%
SRR8750452	SRX5541528	SRP188822	SAMN11165807	33,122,852	63%	23%
SRR8750451	SRX5541527	SRP188822	SAMN11165808	36,865,746	69%	28%
SRR8750450	SRX5541526	SRP188822	SAMN11165809	32,704,030	56%	16%
SRR8750449	SRX5541525	SRP188822	SAMN11165810	28,975,830	56%	16%
SRR8750448	SRX5541524	SRP188822	SAMN11165811	36,108,548	68%	23%
SRR8750447	SRX5541523	SRP188822	SAMN11165812	37,921,460	59%	15%
SRR8750446	SRX5541522	SRP188822	SAMN11165813	30,626,882	54%	27%
SRR8750445	SRX5541521	SRP188822	SAMN11165814	42,344,838	67%	21%
SRR8750444	SRX5541520	SRP188822	SAMN11165815	31,020,684	57%	15%
SRR8750443	SRX5541519	SRP188822	SAMN11165816	39,380,074	71%	27%
SRR8750442	SRX5541518	SRP188822	SAMN11165817	33,048,394	68%	22%
SRR8750441	SRX5541517	SRP188822	SAMN11165818	34,309,100	54%	19%
SRR8750440	SRX5541516	SRP188822	SAMN11165819	42,593,130	54%	17%
SRR8750439	SRX5541515	SRP188822	SAMN11165820	27,547,936	53%	15%
SRR8750438	SRX5541514	SRP188822	SAMN11165821	36,029,820	51%	15%
SRR8750437	SRX5541513	SRP188822	SAMN11165822	32,282,048	61%	23%
SRR8750436	SRX5541512	SRP188822	SAMN11165823	48,459,588	72%	23%
SRR8750435	SRX5541511	SRP188822	SAMN11165824	38,775,898	64%	10%
SRR8750434	SRX5541510	SRP188822	SAMN11165825	27,476,696	67%	28%
SRR8750433	SRX5541509	SRP188822	SAMN11165826	29,131,670	65%	22%
SRR8750432	SRX5541508	SRP188822	SAMN11165827	31,929,280	63%	10%
SRR8750431	SRX5541507	SRP188822	SAMN11165828	32,996,960	68%	25%
SRR8750430	SRX5541506	SRP188822	SAMN11165829	94,272,626	67%	24%
SRR8750429	SRX5541505	SRP188822	SAMN11165830	34,294,306	53%	11%
SRR8750428	SRX5541504	SRP188822	SAMN11165831	36,007,136	61%	25%
SRR8750427	SRX5541503	SRP188822	SAMN11165832	34,931,774	69%	26%
SRR8750407	SRX5541333	SRP188822	SAMN11165910	35,445,860	63%	22%
SRR8750406	SRX5541332	SRP188822	SAMN11165911	35,457,004	72%	26%
SRR8750405	SRX5541331	SRP188822	SAMN11165912	34,525,428	67%	13%
SRR8750404	SRX5541330	SRP188822	SAMN11165913	32,092,890	67%	25%
SRR8750403	SRX5541329	SRP188822	SAMN11165914	33,829,448	72%	25%
SRR8750402	SRX5541328	SRP188822	SAMN11165915	39,231,050	54%	13%
SRR8750401	SRX5541327	SRP188822	SAMN11165916	28,394,546	62%	23%
SRR8750400	SRX5541326	SRP188822	SAMN11165917	39,336,866	69%	27%
SRR8750399	SRX5541325	SRP188822	SAMN11165918	29,974,768	66%	15%
SRR8750398	SRX5541324	SRP188822	SAMN11165919	39,871,132	61%	17%
SRR8750397	SRX5541323	SRP188822	SAMN11165920	38,212,944	70%	26%
SRR8750426	SRX5541502	SRP188822	SAMN11165921	40,896,792	62%	7%
SRR8750425	SRX5541501	SRP188822	SAMN11165922	32,703,660	69%	25%
SRR8750424	SRX5541500	SRP188822	SAMN11165923	37,210,998	71%	25%
SRR8750423	SRX5541499	SRP188822	SAMN11165924	31,228,482	61%	9%
SRR8750422	SRX5541498	SRP188822	SAMN11165925	27,141,614	66%	25%
SRR8750421	SRX5541497	SRP188822	SAMN11165926	37,499,828	69%	24%
SRR8750420	SRX5541496	SRP188822	SAMN11165927	42,349,424	65%	10%
SRR8750419	SRX5541495	SRP188822	SAMN11165928	35,741,454	67%	26%
SRR8750418	SRX5541494	SRP188822	SAMN11165929	31,403,600	63%	21%
SRR8750417	SRX5541493	SRP188822	SAMN11165930	35,997,994	49%	7%
SRR8750416	SRX5541492	SRP188822	SAMN11165931	43,109,516	65%	24%
SRR8750415	SRX5541491	SRP188822	SAMN11165932	36,294,670	72%	25%
SRR8750414	SRX5541490	SRP188822	SAMN11165933	30,570,898	51%	10%
SRR8750413	SRX5541489	SRP188822	SAMN11165934	28,597,274	58%	20%
SRR8750412	SRX5541338	SRP188822	SAMN11165935	36,040,088	73%	24%
SRR8750411	SRX5541337	SRP188822	SAMN11165936	37,250,598	68%	12%
SRR8750410	SRX5541336	SRP188822	SAMN11165937	31,598,080	69%	27%
SRR8750409	SRX5541335	SRP188822	SAMN11165938	32,314,680	67%	25%
SRR8750408	SRX5541334	SRP188822	SAMN11165939	34,885,490	48%	14%
SRR8750598	SRX5541532	SRP188822	SAMN11165940	34,196,784	65%	11%
SRR8750597	SRX5541531	SRP188822	SAMN11165941	37,275,136	72%	28%
SRR8750601	SRX5541535	SRP188822	SAMN11165942	34,656,112	68%	13%
SRR8750600	SRX5541534	SRP188822	SAMN11165943	37,804,724	63%	25%
SRR8750599	SRX5541533	SRP188822	SAMN11165944	39,064,654	70%	24%
SRR8750631	SRX5541565	SRP188822	SAMN11165972	36,975,384	60%	12%
SRR8750630	SRX5541564	SRP188822	SAMN11165973	36,005,418	71%	30%
SRR8750629	SRX5541563	SRP188822	SAMN11165974	30,274,848	74%	26%
SRR8750628	SRX5541562	SRP188822	SAMN11165975	49,065,672	52%	12%
SRR8750627	SRX5541561	SRP188822	SAMN11165976	34,222,784	64%	26%
SRR8750626	SRX5541560	SRP188822	SAMN11165977	35,633,504	67%	22%
SRR8750625	SRX5541559	SRP188822	SAMN11165978	42,235,262	55%	16%
SRR8750624	SRX5541558	SRP188822	SAMN11165979	34,888,660	65%	29%
SRR8750623	SRX5541557	SRP188822	SAMN11165980	36,833,390	64%	21%
SRR8750622	SRX5541556	SRP188822	SAMN11165981	36,234,122	54%	10%
SRR8750621	SRX5541555	SRP188822	SAMN11165982	39,513,378	65%	26%
SRR8750620	SRX5541554	SRP188822	SAMN11165983	34,102,552	71%	23%
SRR8750619	SRX5541553	SRP188822	SAMN11165984	33,006,208	57%	10%
SRR8750618	SRX5541552	SRP188822	SAMN11165985	29,011,398	70%	27%
SRR8750617	SRX5541551	SRP188822	SAMN11165986	32,276,942	70%	22%
SRR8750616	SRX5541550	SRP188822	SAMN11165987	42,395,038	51%	9%
SRR8750615	SRX5541549	SRP188822	SAMN11165988	35,502,796	70%	29%
SRR8750614	SRX5541548	SRP188822	SAMN11165989	34,034,130	70%	23%
SRR8750613	SRX5541547	SRP188822	SAMN11165990	36,319,576	51%	9%
SRR8750612	SRX5541546	SRP188822	SAMN11165991	31,855,884	72%	27%
SRR8750611	SRX5541545	SRP188822	SAMN11165992	38,928,648	67%	24%
SRR8750610	SRX5541544	SRP188822	SAMN11165993	37,269,032	57%	9%
SRR8750609	SRX5541543	SRP188822	SAMN11165994	39,232,794	67%	26%
SRR8750608	SRX5541542	SRP188822	SAMN11165995	39,409,078	74%	25%
SRR8750607	SRX5541541	SRP188822	SAMN11165996	35,217,380	64%	11%
SRR8750606	SRX5541540	SRP188822	SAMN11165997	42,760,922	62%	23%
SRR8750605	SRX5541539	SRP188822	SAMN11165998	35,072,304	74%	26%
SRR8750604	SRX5541538	SRP188822	SAMN11165999	40,688,976	66%	12%
SRR8750603	SRX5541537	SRP188822	SAMN11166000	30,985,612	72%	29%
SRR8750602	SRX5541536	SRP188822	SAMN11166001	35,001,002	68%	23%
SRR8750636	SRX5541570	SRP188822	SAMN11166027	26,737,420	49%	7%
SRR8750635	SRX5541569	SRP188822	SAMN11166028	37,593,362	72%	25%
SRR8750634	SRX5541568	SRP188822	SAMN11166029	27,847,582	56%	7%
SRR8750633	SRX5541567	SRP188822	SAMN11166030	37,197,124	67%	27%
SRR8750632	SRX5541566	SRP188822	SAMN11166031	37,622,512	68%	21%

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Primates GenBank	21,436	14,905 (69.53%)	14,905 (69.53%)	80.29%	91.58%
Primates known RefSeq (NP_)	14,549	11,795 (81.07%)	11,795 (81.07%)	87.46%	93.58%
Same-species GenBank	80	46 (57.50%)	46 (57.50%)	82.03%	95.03%
Same-species known RefSeq (NP_)	49	37 (75.51%)	37 (75.51%)	86.73%	95.54%
Homo sapiens GenBank	144,553	83,595 (57.83%)	83,595 (57.83%)	81.74%	84.56%
Homo sapiens known RefSeq (NP_)	57,310	43,446 (75.81%)	43,446 (75.81%)	89.20%	92.02%

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

	First Pass	Total
Mhudiblu_PPA_v0 (Current) coverage	89.52%	90.89%
panpan1.1 (Previous) coverage	98.99%	99.08%
Percent identity	99.69%	99.59%

Comparison of the current and previous annotations

The annotation produced for this release (104) was compared to the annotation in the previous release (103) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	Mhudiblu_PPA_v0 (Current) to panpan1.1 (Previous)
Identical	15%
Minor changes	47%
Major changes	18%
New	20%
Deprecated	9%
Other	1%
Download the report	tabular, Genome Workbench

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
