NCBI Corvus hawaiiensis Annotation Release 100

The RefSeq genome records for Corvus hawaiiensis were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
BUSCO results: Annotation completeness assessed with BUSCO
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Corvus hawaiiensis Annotation Release 100.

Annotation release ID: 100
Date of Entrez queries for transcripts and proteins: May 17 2022
Date of submission of annotation to the public databases: May 19 2022
Software version: 9.0

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
bCorHaw1.pri.cur	GCF_020740725.1	Vertebrate Genomes Project	11-04-2021	Reference	44 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	bCorHaw1.pri.cur
Genes and pseudogenes	21,647
protein-coding	16,414
non-coding	5,035
Transcribed pseudogenes	0
Non-transcribed pseudogenes	145
genes with variants	9,315
Immunoglobulin/T-cell receptor gene segments	38
other	15
mRNAs	43,134
fully-supported	42,003
with > 5% ab initio	591
partial	180
with filled gap(s)	0
known RefSeq (NM_)	0
model RefSeq (XM_)	43,134
non-coding RNAs	8,731
fully-supported	8,088
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	8,459
pseudo transcripts	0
fully-supported	0
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	0
CDSs	43,185
fully-supported	42,003
with > 5% ab initio	676
partial	182
with major correction(s)	644
known RefSeq (NP_)	0
model RefSeq (XP_)	43,147

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	21,464	30,146	11,120	49	1,366,951
All transcripts	51,865	4,013	3,207	49	95,534
mRNA	43,134	4,264	3,455	189	95,534
misc_RNA	1,460	4,102	3,068	125	36,095
tRNA	270	74	73	64	87
lncRNA	6,631	2,732	1,547	78	26,846
snoRNA	185	110	96	49	323
snRNA	37	140	141	61	191
rRNA	133	461	119	119	4,296
Single-exon transcripts	523	1,846	1,365	273	17,382
coding transcripts (NM_/XM_ )	523	1,846	1,365	273	17,382
CDSs	43,147	2,332	1,656	96	94,350
Exons	229,947	351	135	1	26,553
in coding transcripts (NM_/XM_ )	209,904	319	134	1	26,553
in non-coding transcripts (NR_/XR_ )	28,324	544	146	2	22,062
Introns	207,023	3,938	942	30	700,122
in coding transcripts (NM_/XM_ )	192,566	3,808	917	30	700,122
in non-coding transcripts (NR_/XR_ )	22,488	4,924	1,246	30	353,252

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	2.43	1	1	50
Number of exons per transcript	13.43	10	1	257

BUSCO analysis of gene annotation

BUSCO v4.1.4 was run in "protein" mode with the passeriformes_odb10 lineage dataset on the annotated gene set, using the longest protein of each gene. Results are reported for the gene set from the primary assembly unit and are presented in BUSCO notation.

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 16,401 coding genes, 15,747 genes (96.0%) had a protein with an alignment covering 50% or more of the query, and 10,748 (65.5%) had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
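
The two coverage definitions above can be sketched in Python as follows. This is only an illustration, not the pipeline's own code: the function name coverage_percent and the example coordinates are hypothetical, and the sketch assumes a single alignment segment with 1-based inclusive coordinates (a hit made of several segments would need its segments merged first).

```python
def coverage_percent(aln_start: int, aln_end: int, seq_length: int) -> float:
    """Percentage of a sequence included in an alignment (1-based, inclusive)."""
    if seq_length <= 0:
        raise ValueError("seq_length must be positive")
    aligned = abs(aln_end - aln_start) + 1
    return 100.0 * aligned / seq_length


# Hypothetical hit: a 400-aa annotated protein (query) aligned over residues
# 11-390 to residues 1-380 of a 500-aa Swiss-Prot protein (target).
query_cov = coverage_percent(11, 390, 400)   # 380 aligned residues out of 400
target_cov = coverage_percent(1, 380, 500)   # 380 aligned residues out of 500

print(f"query coverage:  {query_cov:.1f}%")
print(f"target coverage: {target_cov:.1f}%")
```

In this made-up example the hit would count toward the "95% or more of the query" threshold but not toward a 95% target threshold, which is why the two measures are reported separately.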

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker (if calculated), for each assembly. RepeatMasker results are only calculated for organisms with complete Dfam HMM model collections.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with WindowMasker
bCorHaw1.pri.cur	GCF_020740725.1	22.76%

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign, minimap2, or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Transcript alignments

No transcript evidence was used in this annotation.

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Publication	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	NA	Aggregate of all aligned samples	2,847,991,870	86%	21%	270,884
SAMN00002706	NA	Brain transcriptome from Corvus corone corone (Corvus corone corone, SAMN00002706)	213,818	64%	24%	17,100
SAMN00002707	NA	Brain transcriptome from Corvus cornix (Corvus cornix, SAMN00002707)	173,207	62%	23%	14,980
SAMN00004737	NA	Generic sample from Corvus brachyrhynchos (Corvus brachyrhynchos, SAMN00004737)	368,661	73%	9%	10,334
SAMN02194710	24948738,28360231	forebrain (Corvus cornix cornix, male, SAMN02194710)	44,976,482	88%	19%	174,338
SAMN02194711	24948738,28360231	forebrain (Corvus cornix cornix, male, SAMN02194711)	26,560,656	81%	19%	159,292
SAMN02194712	24948738,28360231	forebrain (Corvus cornix cornix, male, SAMN02194712)	38,662,502	88%	18%	171,443
SAMN02194713	24948738,28360231	forebrain (Corvus cornix cornix, male, SAMN02194713)	67,965,152	88%	19%	186,817
SAMN02195187	24948738,28360231	forebrain (Corvus corone corone, male, SAMN02195187)	50,648,394	87%	19%	178,587
SAMN02195191	24948738,28360231	forebrain (Corvus corone corone, male, SAMN02195191)	67,293,992	87%	18%	185,274
SAMN02997663	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997663)	82,745,058	89%	24%	186,637
SAMN02997664	24948738,28360231	liver (Corvus cornix cornix, male, SAMN02997664)	29,668,710	85%	30%	136,156
SAMN02997665	24948738,28360231	brain (Corvus cornix cornix, male, SAMN02997665)	37,928,428	88%	18%	169,778
SAMN02997666	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997666)	34,618,288	89%	24%	164,136
SAMN02997667	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997667)	23,405,004	88%	23%	152,665
SAMN02997668	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997668)	66,627,324	89%	23%	184,116
SAMN02997693	24948738,28360231	gonads (Corvus cornix cornix, male, SAMN02997693)	38,271,060	87%	21%	184,739
SAMN02997698	24948738,28360231	gonads (Corvus cornix cornix, male, SAMN02997698)	38,147,840	87%	21%	182,414
SAMN02997708	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997708)	32,823,194	83%	24%	164,682
SAMN02997723	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997723)	31,249,158	83%	23%	162,088
SAMN02997737	24948738,28360231	liver (Corvus cornix cornix, male, SAMN02997737)	40,598,094	87%	32%	140,669
SAMN02997741	24948738,28360231	heart (Corvus cornix cornix, male, SAMN02997741)	42,938,150	48%	20%	142,225
SAMN02997743	24948738,28360231	Eye (Corvus corone corone, male, SAMN02997743)	29,788,944	83%	22%	176,929
SAMN02997744	24948738,28360231	heart (Corvus corone corone, male, SAMN02997744)	22,671,816	61%	25%	132,362
SAMN02997745	24948738,28360231	spleen (Corvus corone corone, male, SAMN02997745)	40,774,640	81%	23%	164,568
SAMN02997746	24948738,28360231	skin (Corvus corone corone, male, SAMN02997746)	17,085,134	83%	24%	149,825
SAMN02997747	24948738,28360231	eye (Corvus cornix cornix, male, SAMN02997747)	53,471,644	82%	23%	170,170
SAMN02997748	24948738,28360231	spleen (Corvus cornix cornix, male, SAMN02997748)	17,048,880	83%	22%	163,151
SAMN02997749	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997749)	33,030,418	83%	24%	165,196
SAMN02997750	24948738,28360231	skin (Corvus cornix cornix, male, SAMN02997750)	44,784,148	83%	24%	172,739
SAMN02997751	24948738,28360231	eye (Corvus cornix cornix, male, SAMN02997751)	60,009,926	86%	20%	186,079
SAMN03890325	24948738,28360231	brain hypothalamus and pituitary (Corvus corone corone, male, SAMN03890325)	35,959,772	85%	20%	173,452
SAMN03890326	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890326)	32,417,990	90%	20%	167,158
SAMN03890327	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890327)	31,768,218	87%	20%	173,074
SAMN03890328	24948738,28360231	brain hypothalamus and pituitary (Corvus corone corone, male, SAMN03890328)	33,519,514	84%	19%	154,585
SAMN03890329	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890329)	36,435,836	88%	22%	169,271
SAMN03890330	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890330)	29,689,722	89%	20%	165,000
SAMN03890331	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890331)	32,715,524	85%	20%	156,352
SAMN03890332	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890332)	33,322,464	88%	20%	168,300
SAMN03890333	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890333)	24,334,816	89%	20%	158,812
SAMN03890334	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890334)	29,691,992	90%	20%	168,484
SAMN03890335	24948738,28360231	skin (head pool crown) (Corvus cornix cornix, male, SAMN03890335)	41,153,312	90%	25%	171,633
SAMN03890336	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890336)	35,173,230	86%	23%	172,785
SAMN03890337	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890337)	30,664,584	89%	21%	161,937
SAMN03890338	24948738,28360231	skin (head pool throat) (Corvus cornix cornix, male, SAMN03890338)	38,599,564	90%	24%	168,698
SAMN03890339	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890339)	34,173,126	89%	20%	168,371
SAMN03890340	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890340)	33,325,108	84%	19%	162,785
SAMN03890341	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890341)	29,794,690	88%	21%	156,711
SAMN03890342	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890342)	75,228,684	88%	19%	194,780
SAMN03890343	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890343)	36,139,028	90%	20%	170,013
SAMN03890344	24948738,28360231	brain hypothalamus and pituitary (Corvus corone corone, male, SAMN03890344)	36,897,772	86%	20%	172,751
SAMN03890345	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890345)	33,731,420	85%	21%	162,530
SAMN03890346	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890346)	30,657,642	88%	22%	171,174
SAMN03890347	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890347)	33,458,532	89%	20%	169,503
SAMN03890348	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890348)	33,686,372	88%	20%	167,425
SAMN03890349	24948738,28360231	Eye (Corvus corone corone, male, SAMN03890349)	27,555,254	89%	21%	164,451
SAMN03890350	24948738,28360231	brain (hypothalamus and pituitary) (Corvus cornix cornix, male, SAMN03890350)	39,745,574	83%	7%	142,726
SAMN03890351	24948738,28360231	brain hypothalamus and pituitary (Corvus corone corone, male, SAMN03890351)	31,803,378	78%	23%	154,962
SAMN03890353	24948738,28360231	eye (Corvus cornix cornix, male, SAMN03890353)	31,213,096	90%	20%	166,203
SAMN13045714	NA	Brain (Corvus woodfordi, SAMN13045714)	64,793,632	87%	19%	168,023
SAMN13045715	NA	Brain (Corvus woodfordi, SAMN13045715)	57,949,306	82%	16%	160,750
SAMN13045716	NA	Brain (Corvus woodfordi, SAMN13045716)	62,665,644	90%	19%	170,720
SAMN13045717	NA	Brain (Corvus woodfordi, SAMN13045717)	68,507,602	89%	19%	173,078
SAMN13045718	NA	Brain (Corvus woodfordi, SAMN13045718)	113,402,828	89%	19%	181,403
SAMN13045719	NA	Brain (Corvus woodfordi, SAMN13045719)	61,157,854	88%	20%	176,529
SAMN13045720	NA	Liver (Corvus woodfordi, SAMN13045720)	59,765,668	85%	26%	131,289
SAMN13429824	NA	Pooled Tissue (Corvus splendens, SAMN13429824)	119,559,034	83%	31%	196,783
SAMN18673286	NA	Lung (Corvus kubaryi, Fledgling, female, SAMN18673286)	37,776,080	90%	19%	133,698
SAMN18673287	NA	Lung (Corvus kubaryi, Fledgling, female, SAMN18673287)	26,528,576	91%	23%	150,063
SAMN18673288	NA	Lung (Corvus kubaryi, Fledgling, male, SAMN18673288)	29,611,998	90%	22%	152,631
SAMN18673289	NA	Lung (Corvus kubaryi, Fledgling, male, SAMN18673289)	33,483,436	92%	25%	153,154
SAMN18673290	NA	Lung (Corvus kubaryi, Fledgling, SAMN18673290)	25,385,276	90%	22%	144,431

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
SRR019143	SRX006755	SRP000770	SAMN00002706	213,818	64%	24%
SRR019144	SRX006756	SRP000770	SAMN00002707	173,207	62%	23%
SRR029463	SRX012416	SRP001362	SAMN00004737	176,634	73%	9%
SRR029464	SRX012416	SRP001362	SAMN00004737	192,027	74%	9%
SRR946432	SRX305877	SRP022901	SAMN02194710	15,288,546	86%	19%
SRR946450	SRX305877	SRP022901	SAMN02194710	29,687,936	88%	18%
SRR947870	SRX330887	SRP022901	SAMN02194711	26,560,656	81%	19%
SRR2107373	SRX1101170	SRP022901	SAMN02194712	38,662,502	88%	18%
SRR947851	SRX330889	SRP022901	SAMN02194713	22,833,192	88%	19%
SRR947852	SRX330889	SRP022901	SAMN02194713	45,131,960	88%	19%
SRR947861	SRX330899	SRP022901	SAMN02195187	33,115,964	88%	19%
SRR947871	SRX330899	SRP022901	SAMN02195187	17,532,430	87%	19%
SRR947793	SRX330903	SRP022901	SAMN02195191	38,966,782	87%	18%
SRR947865	SRX330903	SRP022901	SAMN02195191	28,327,210	87%	19%
SRR1947479	SRX974044	SRP022901	SAMN02997663	38,744,488	88%	23%
SRR1947480	SRX974045	SRP022901	SAMN02997663	44,000,570	90%	24%
SRR1947478	SRX974043	SRP022901	SAMN02997664	29,668,710	85%	30%
SRR1947476	SRX974041	SRP022901	SAMN02997665	37,928,428	88%	18%
SRR1947475	SRX974040	SRP022901	SAMN02997666	34,618,288	89%	24%
SRR1947474	SRX974039	SRP022901	SAMN02997667	23,405,004	88%	23%
SRR1947472	SRX974037	SRP022901	SAMN02997668	29,693,918	88%	23%
SRR1947473	SRX974038	SRP022901	SAMN02997668	36,933,406	89%	23%
SRR1947446	SRX974011	SRP022901	SAMN02997693	38,271,060	87%	21%
SRR1947440	SRX974005	SRP022901	SAMN02997698	38,147,840	87%	21%
SRR1947429	SRX973994	SRP022901	SAMN02997708	32,823,194	83%	24%
SRR1947413	SRX973978	SRP022901	SAMN02997723	31,249,158	83%	23%
SRR1947400	SRX973965	SRP022901	SAMN02997737	40,598,094	87%	32%
SRR1947395	SRX973960	SRP022901	SAMN02997741	42,938,150	48%	20%
SRR1947392	SRX973957	SRP022901	SAMN02997743	29,788,944	83%	22%
SRR1947391	SRX973956	SRP022901	SAMN02997744	22,671,816	61%	25%
SRR1947390	SRX973955	SRP022901	SAMN02997745	40,774,640	81%	23%
SRR1947389	SRX973954	SRP022901	SAMN02997746	17,085,134	83%	24%
SRR1947388	SRX973953	SRP022901	SAMN02997747	53,471,644	82%	23%
SRR1947387	SRX973952	SRP022901	SAMN02997748	17,048,880	83%	22%
SRR1947385	SRX973950	SRP022901	SAMN02997749	33,030,418	83%	24%
SRR1947383	SRX973948	SRP022901	SAMN02997750	44,784,148	83%	24%
SRR2107327	SRX1101085	SRP022901	SAMN02997751	35,128,714	89%	20%
SRR1928171	SRX967626	SRP022901	SAMN02997751	24,881,212	83%	20%
SRR2106734	SRX1100637	SRP022901	SAMN03890325	35,959,772	85%	20%
SRR2106742	SRX1100645	SRP022901	SAMN03890326	32,417,990	90%	20%
SRR2106747	SRX1100646	SRP022901	SAMN03890327	31,768,218	87%	20%
SRR2106753	SRX1100647	SRP022901	SAMN03890328	33,519,514	84%	19%
SRR2106809	SRX1100649	SRP022901	SAMN03890329	36,435,836	88%	22%
SRR2106782	SRX1100650	SRP022901	SAMN03890330	29,689,722	89%	20%
SRR2106792	SRX1100651	SRP022901	SAMN03890331	32,715,524	85%	20%
SRR2106802	SRX1100652	SRP022901	SAMN03890332	33,322,464	88%	20%
SRR2106828	SRX1100653	SRP022901	SAMN03890333	24,334,816	89%	20%
SRR2106836	SRX1100654	SRP022901	SAMN03890334	29,691,992	90%	20%
SRR2106849	SRX1100655	SRP022901	SAMN03890335	41,153,312	90%	25%
SRR2106888	SRX1100657	SRP022901	SAMN03890336	35,173,230	86%	23%
SRR2106913	SRX1100674	SRP022901	SAMN03890337	30,664,584	89%	21%
SRR2106916	SRX1100675	SRP022901	SAMN03890338	38,599,564	90%	24%
SRR2106917	SRX1100676	SRP022901	SAMN03890339	34,173,126	89%	20%
SRR2106918	SRX1100677	SRP022901	SAMN03890340	33,325,108	84%	19%
SRR2107130	SRX1100881	SRP022901	SAMN03890341	29,794,690	88%	21%
SRR2106863	SRX1100656	SRP022901	SAMN03890342	39,562,498	88%	19%
SRR2107220	SRX1100943	SRP022901	SAMN03890342	35,666,186	89%	20%
SRR2107326	SRX1101078	SRP022901	SAMN03890343	36,139,028	90%	20%
SRR2107328	SRX1101092	SRP022901	SAMN03890344	36,897,772	86%	20%
SRR2107331	SRX1101098	SRP022901	SAMN03890345	33,731,420	85%	21%
SRR2107332	SRX1101102	SRP022901	SAMN03890346	30,657,642	88%	22%
SRR2107333	SRX1101106	SRP022901	SAMN03890347	33,458,532	89%	20%
SRR2107334	SRX1101112	SRP022901	SAMN03890348	33,686,372	88%	20%
SRR2107343	SRX1101125	SRP022901	SAMN03890349	27,555,254	89%	21%
SRR2107362	SRX1101143	SRP022901	SAMN03890350	39,745,574	83%	7%
SRR2107364	SRX1101152	SRP022901	SAMN03890351	31,803,378	78%	23%
SRR2107367	SRX1101160	SRP022901	SAMN03890353	31,213,096	90%	20%
SRR10303799	SRX7016255	SRP226123	SAMN13045714	42,075,356	87%	19%
SRR10303788	SRX7016266	SRP226123	SAMN13045714	22,718,276	87%	19%
SRR10303795	SRX7016259	SRP226123	SAMN13045715	20,498,016	82%	16%
SRR10303790	SRX7016264	SRP226123	SAMN13045715	37,451,290	82%	16%
SRR10303798	SRX7016256	SRP226123	SAMN13045716	40,734,822	90%	19%
SRR10303787	SRX7016267	SRP226123	SAMN13045716	21,930,822	90%	19%
SRR10303793	SRX7016261	SRP226123	SAMN13045717	44,488,438	89%	19%
SRR10303786	SRX7016268	SRP226123	SAMN13045717	24,019,164	89%	19%
SRR10303794	SRX7016260	SRP226123	SAMN13045718	42,520,944	89%	19%
SRR10303789	SRX7016265	SRP226123	SAMN13045718	70,881,884	89%	19%
SRR10303797	SRX7016257	SRP226123	SAMN13045719	21,527,574	88%	20%
SRR10303792	SRX7016262	SRP226123	SAMN13045719	39,630,280	88%	20%
SRR10303796	SRX7016258	SRP226123	SAMN13045720	20,988,012	85%	26%
SRR10303791	SRX7016263	SRP226123	SAMN13045720	38,777,656	85%	26%
SRR10560921	SRX7242543	SRP233405	SAMN13429824	119,559,034	83%	31%
SRR14671872	SRX11010351	SRP321711	SAMN18673286	37,776,080	90%	19%
SRR14671871	SRX11010352	SRP321711	SAMN18673287	26,528,576	91%	23%
SRR14671870	SRX11010353	SRP321711	SAMN18673288	29,611,998	90%	22%
SRR14671869	SRX11010354	SRP321711	SAMN18673289	33,483,436	92%	25%
SRR14671873	SRX11010350	SRP321711	SAMN18673290	25,385,276	90%	22%
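
For samples with more than one run, the read counts in the by-sample table match the sum of the corresponding rows in the by-run table. The report does not state how the by-sample figures are derived, so the following Python sketch is only a consistency check under that assumption. It uses four rows copied from the by-run table, and the variable names are illustrative.

```python
from collections import defaultdict

# (run, sample, number of reads), copied from the by-run table
runs = [
    ("SRR029463", "SAMN00004737", 176_634),
    ("SRR029464", "SAMN00004737", 192_027),
    ("SRR946432", "SAMN02194710", 15_288_546),
    ("SRR946450", "SAMN02194710", 29_687_936),
]

# Sum the reads of all runs that belong to the same sample.
reads_per_sample = defaultdict(int)
for _run, sample, n_reads in runs:
    reads_per_sample[sample] += n_reads

for sample, total in sorted(reads_per_sample.items()):
    print(f"{sample}\t{total:,}")
```

The totals for these two samples agree with the by-sample rows for SAMN00004737 (368,661 reads) and SAMN02194710 (44,976,482 reads).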

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Xenopus known RefSeq (NP_)	19,219	17,796 (92.60%)	17,796 (92.60%)	69.92%	78.99%
Aves GenBank	15,256	14,325 (93.90%)	14,325 (93.90%)	71.92%	83.48%
Aves known RefSeq (NP_)	9,966	9,761 (97.94%)	9,761 (97.94%)	77.40%	85.39%
Columba livia high-quality model RefSeq (XP_)	8,292	8,178 (98.63%)	8,178 (98.63%)	77.64%	85.74%
Gallus gallus high-quality model RefSeq (XP_)	9,986	9,632 (96.46%)	9,632 (96.46%)	76.50%	83.17%
Parus major high-quality model RefSeq (XP_)	11,979	11,904 (99.37%)	11,904 (99.37%)	80.57%	86.67%
Homo sapiens known RefSeq (NP_)	64,209	55,672 (86.70%)	55,672 (86.70%)	71.19%	76.58%
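
The percentages in parentheses are consistent with the aligned count divided by the number of sequences retrieved from Entrez. The Python sketch below recomputes them for three rows of the table. The helper parse_count_percent is hypothetical and assumes the cell format "17,796 (92.60%)" shown above.

```python
import re
from typing import Tuple

# Matches cells such as "17,796 (92.60%)": a count followed by a percentage.
CELL = re.compile(r"^([\d,]+) \((\d+(?:\.\d+)?)%\)$")


def parse_count_percent(cell: str) -> Tuple[int, float]:
    """Split a 'count (percent%)' table cell into its two numbers."""
    match = CELL.match(cell.strip())
    if match is None:
        raise ValueError(f"unexpected cell format: {cell!r}")
    return int(match.group(1).replace(",", "")), float(match.group(2))


# (source, sequences retrieved, aligned cell), copied from the table
rows = [
    ("Xenopus known RefSeq (NP_)", "19,219", "17,796 (92.60%)"),
    ("Parus major high-quality model RefSeq (XP_)", "11,979", "11,904 (99.37%)"),
    ("Homo sapiens known RefSeq (NP_)", "64,209", "55,672 (86.70%)"),
]

for source, retrieved, aligned_cell in rows:
    n_retrieved = int(retrieved.replace(",", ""))
    n_aligned, reported = parse_count_percent(aligned_cell)
    recomputed = 100 * n_aligned / n_retrieved
    print(f"{source}: reported {reported:.2f}%, recomputed {recomputed:.2f}%")
```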

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
BUSCO: Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. Molecular Biology and Evolution 2021, 38(10):4647-4654
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
Minimap2: Li H. Bioinformatics 2018, 34(18):3094-3100
