NCBI Tachysurus fulvidraco Annotation Release 101

The RefSeq genome records for Tachysurus fulvidraco were annotated by the NCBI Eukaryotic Genome Annotation Pipeline, an automated pipeline that annotates genes, transcripts and proteins on draft and finished genome assemblies. This report presents statistics on the annotation products, the input data used in the pipeline and intermediate alignment results.

The annotation products are available in the sequence databases and on the FTP site.

This report provides:

Annotation Release information: The name of the release, important dates, the software version
Assemblies: A brief description of the annotated assembly(ies)
Gene and feature statistics: The counts and characteristics of the annotated features
BUSCO results: Annotation completeness assessed with BUSCO
Alignment of the annotated proteins to a set of high-quality proteins: The number of annotated proteins with hits to a set of high-quality proteins
Masking of genomic sequence: How much of the genome was masked
Transcript and protein alignments: The number and type of evidence retrieved from public databases and used for gene prediction
Similarity of current and previous assembly: The similarity of the current and previous assembly
Comparison of the current and previous annotations: What proportion of the genes changed in this annotation

For more information on the annotation process, please visit the NCBI Eukaryotic Genome Annotation Pipeline page.

Annotation Release information

This annotation should be referred to as NCBI Tachysurus fulvidraco Annotation Release 101.

Annotation release ID: 101
Date of Entrez queries for transcripts and proteins: Apr 14 2022
Date of submission of annotation to the public databases: Apr 20 2022
Software version: 9.0

Assemblies

The following assemblies were included in this annotation run:

Assembly name	Assembly accession	Submitter	Assembly date	Reference/Alternate	Assembly content
HZAU_PFXX_2.0	GCF_022655615.1	Huazhong Agricultural University	03-21-2022	Reference	26 assembled chromosomes; unplaced scaffolds

Gene and feature statistics

Counts and lengths of annotated features are provided below for each assembly.

Feature counts

Feature	HZAU_PFXX_2.0
Genes and pseudogenes	33,922
protein-coding	23,998
non-coding	7,725
Transcribed pseudogenes	0
Non-transcribed pseudogenes	1,785
genes with variants	12,401
Immunoglobulin/T-cell receptor gene segments	405
other	9
mRNAs	55,521
fully-supported	54,571
with > 5% ab initio	374
partial	115
with filled gap(s)	3
known RefSeq (NM_)	0
model RefSeq (XM_)	55,521
non-coding RNAs	10,712
fully-supported	6,188
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	9,298
pseudo transcripts	0
fully-supported	0
with > 5% ab initio	0
partial	0
with filled gap(s)	0
known RefSeq (NR_)	0
model RefSeq (XR_)	0
CDSs	55,926
fully-supported	54,571
with > 5% ab initio	439
partial	134
with major correction(s)	182
known RefSeq (NP_)	0
model RefSeq (XP_)	55,521
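
As a quick consistency check on the table above, the gene categories can be summed and compared with the "Genes and pseudogenes" total. This is a minimal sketch: the values are copied from the table, while the names `gene_counts`, `reported_total` and `check_total` are illustrative. It assumes that "genes with variants" is an overlapping attribute rather than a separate category, which the report does not state explicitly.

```python
# Gene counts copied from the "Feature counts" table for HZAU_PFXX_2.0.
# Assumption: "genes with variants" overlaps the other rows and is excluded.
gene_counts = {
    "protein-coding": 23_998,
    "non-coding": 7_725,
    "Transcribed pseudogenes": 0,
    "Non-transcribed pseudogenes": 1_785,
    "Immunoglobulin/T-cell receptor gene segments": 405,
    "other": 9,
}
reported_total = 33_922  # "Genes and pseudogenes" row


def check_total(parts: dict[str, int], total: int) -> bool:
    """Return True when the category counts add up to the reported total."""
    return sum(parts.values()) == total


if __name__ == "__main__":
    print(check_total(gene_counts, reported_total))  # prints True
```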

Detailed reports

The counts below do not include pseudogenes.

Feature lengths

Feature	Count	Mean length (bp)	Median length (bp)	Min length (bp)	Max length (bp)
Genes	31,732	15,146	6,237	56	1,781,460
All transcripts	66,233	3,245	2,569	56	99,464
mRNA	55,521	3,650	2,910	231	99,464
misc_RNA	2,103	2,885	2,499	159	14,596
tRNA	1,414	75	73	68	87
lncRNA	4,093	1,392	1,015	73	13,248
snoRNA	206	130	128	62	321
snRNA	295	143	156	56	192
rRNA	2,592	125	119	118	4,325
Single-exon transcripts	868	1,693	1,294	240	13,989
coding transcripts (NM_/XM_ )	868	1,693	1,294	240	13,989
CDSs	55,521	2,284	1,566	99	98,208
Exons	306,987	291	141	2	21,511
in coding transcripts (NM_/XM_ )	292,362	284	140	2	21,511
in non-coding transcripts (NR_/XR_ )	28,672	308	142	2	9,068
Introns	273,339	1,899	448	30	990,799
in coding transcripts (NM_/XM_ )	263,552	1,876	445	30	990,799
in non-coding transcripts (NR_/XR_ )	23,537	2,041	484	30	580,409

Transcripts per gene, exons per transcript

	Mean	Median	Min	Max
Number of transcripts per gene	2.14	1	1	50
Number of exons per transcript	13.03	9	1	237

BUSCO analysis of gene annotation

BUSCO v4.1.4 was run in "protein" mode on the annotated gene set, using the longest protein per gene and the actinopterygii_odb10 lineage dataset. Results are reported for the gene set from the primary assembly unit and are presented in BUSCO notation.

Alignment of the annotated proteins to a set of high-quality proteins

The final set of annotated proteins was searched with BLASTP against the UniProtKB/Swiss-Prot curated proteins, using the annotated proteins as the query and the high-quality proteins as the target. Out of 23,998 coding genes, 21,952 had a protein with an alignment covering 50% or more of the query, and 11,381 had an alignment covering 95% or more of the query.

Definition of query and target coverage. The query coverage is the percentage of the annotated protein length that is included in the alignment. The target coverage is the percentage of the target length that is included in the alignment.
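
The two coverage definitions can be written as a short calculation. The sketch below is illustrative only: the function names and the 500- and 600-residue example proteins are hypothetical, and it simplifies by using a single aligned length for both sequences, whereas gaps can make the aligned span differ between query and target. The last two lines express the gene counts quoted in this section as shares of the 23,998 coding genes.

```python
def coverage_percent(aligned_length: int, sequence_length: int) -> float:
    """Percentage of a sequence that is included in an alignment."""
    if sequence_length <= 0:
        raise ValueError("sequence_length must be positive")
    return 100.0 * aligned_length / sequence_length


def count_at_or_above(coverages: list[float], threshold: float) -> int:
    """Number of coverage values at or above the threshold."""
    return sum(1 for value in coverages if value >= threshold)


# Hypothetical alignment: 450 residues of a 500-residue annotated protein
# (query) aligned to a 600-residue Swiss-Prot protein (target).
query_cov = coverage_percent(450, 500)   # 90.0
target_cov = coverage_percent(450, 600)  # 75.0

# Counts from the report as shares of the 23,998 coding genes.
share_50 = 100.0 * 21_952 / 23_998  # about 91.47
share_95 = 100.0 * 11_381 / 23_998  # about 47.42
```

With these definitions, a protein can pass the 50% query threshold while falling short on target coverage, which is why the two measures are reported separately.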

Below is a cumulative graph displaying the number of genes with alignments above a given query or target coverage threshold. For comparison, corresponding statistics for other organisms annotated by the NCBI eukaryotic annotation pipeline were added to the graph.

Query: annotated proteins
Target: UniProtKB/Swiss-Prot curated proteins

Masking of genomic sequence

Transcript and protein alignments are performed on the repeat-masked genome. Below are the percentages of genomic sequence masked by WindowMasker and RepeatMasker (if calculated), for each assembly. RepeatMasker results are only calculated for organisms with complete Dfam HMM model collections.

For this annotation run, transcripts and proteins were aligned to the genome masked with WindowMasker only.

Assembly name	Assembly accession	% Masked with WindowMasker
HZAU_PFXX_2.0	GCF_022655615.1	34.26%
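
A masked percentage like the one above can be approximated from a soft-masked FASTA file. The sketch below assumes the common convention that masked bases are written in lowercase; the report does not say how its figure was computed (for example, whether gap characters are counted), so this is not necessarily the pipeline's method. The function name and the toy input are illustrative.

```python
from typing import Iterable


def masked_percent(fasta_lines: Iterable[str]) -> float:
    """Percentage of bases that are lowercase (soft-masked) in FASTA text."""
    masked = 0
    total = 0
    for line in fasta_lines:
        line = line.strip()
        if not line or line.startswith(">"):
            continue  # skip blank lines and sequence headers
        total += len(line)
        masked += sum(1 for base in line if base.islower())
    if total == 0:
        raise ValueError("no sequence found")
    return 100.0 * masked / total


# Toy input, not real genome data: 4 of 16 bases are lowercase.
example = [">toy_scaffold", "ACGTacgtACGT", "ACGT"]
print(f"{masked_percent(example):.2f}%")  # prints 25.00%
```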

Transcript and protein alignments

The annotation pipeline relies heavily on alignments of experimental evidence for gene prediction. Below are the sets of transcripts and proteins that were retrieved from Entrez, aligned to the genome by Splign, minimap2, or ProSplign and passed to Gnomon, NCBI's gene prediction software.

Transcript alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by Splign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	571	569 (99.65%)	545 (95.45%)	99.46%	98.23%
Same-species EST	127	123 (96.85%)	113 (88.98%)	99.54%	99.76%

RNA-Seq alignments

The following RNA-Seq reads from the Sequence Read Archive were also used for gene prediction:

Alignment statistics, by sample (SAME, SAMN, SAMD, DRS)

Sample Id	Track name	Number of reads	Percent aligned reads	Percent of aligned reads with introns	Number of introns
All	Aggregate of all aligned samples	7,555,674,389	86%	50%	337,315
SAMD00406531	skin (Tachysurus fulvidraco, SAMD00406531)	33,687,020	90%	63%	206,279
SAMD00406532	skin (Tachysurus fulvidraco, SAMD00406532)	46,452,802	90%	63%	214,621
SAMD00406533	skin (Tachysurus fulvidraco, SAMD00406533)	45,929,958	89%	62%	206,779
SAMD00406534	skin (Tachysurus fulvidraco, SAMD00406534)	47,822,402	77%	58%	209,324
SAMD00406535	skin (Tachysurus fulvidraco, SAMD00406535)	54,775,310	90%	61%	214,566
SAMD00406536	skin (Tachysurus fulvidraco, SAMD00406536)	70,168,008	90%	61%	217,967
SAMN02384151	General Sample for Pelteobagrus fulvidraco (Tachysurus fulvidraco, 4 years old, SAMN02384151)	1,202,933	76%	41%	164,714
SAMN03142394	skin, liver, spleen, kidney, head-kidney, and muscle (Tachysurus fulvidraco, years, SAMN03142394)	86,047,170	88%	41%	221,081
SAMN03142395	anal fin (Tachysurus fulvidraco, years, SAMN03142395)	228,068,567	87%	35%	250,358
SAMN03497639	venom gland (Tachysurus fulvidraco, SAMN03497639)	44,243,514	77%	24%	191,176
SAMN04517535	testis (Tachysurus fulvidraco, male, SAMN04517535)	87,268,036	87%	39%	261,794
SAMN06765420	spleen of the normal fish (Tachysurus fulvidraco, two-year-old, pooled male and female, SAMN06765420)	25,485,026	86%	42%	207,195
SAMN06765421	spleen of the fish infected with E. ictaluri (Tachysurus fulvidraco, two-year-old, pooled male and female, SAMN06765421)	25,700,595	87%	44%	206,823
SAMN10790437	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790437)	45,566,650	88%	59%	201,488
SAMN10790438	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790438)	48,509,818	87%	57%	215,838
SAMN10790439	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790439)	50,642,386	87%	57%	208,680
SAMN10790440	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790440)	45,047,268	87%	57%	209,630
SAMN10790441	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790441)	33,789,114	87%	56%	197,263
SAMN10790442	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790442)	50,415,682	87%	56%	204,522
SAMN10790443	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790443)	56,646,574	87%	55%	207,085
SAMN10790444	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790444)	49,652,854	87%	55%	207,456
SAMN10790445	sperm (Tachysurus fulvidraco, 2 years, male, SAMN10790445)	43,650,726	85%	53%	187,971
SAMN12138640	liver (Tachysurus fulvidraco, 1 year, male, SAMN12138640)	78,071,786	87%	48%	187,708
SAMN12138641	liver (Tachysurus fulvidraco, 1 year, male, SAMN12138641)	65,426,346	86%	47%	194,306
SAMN12905434	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905434)	67,221,512	88%	51%	213,304
SAMN12905436	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905436)	60,918,976	88%	51%	193,389
SAMN12905438	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905438)	57,812,956	89%	55%	204,300
SAMN12905440	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905440)	53,341,062	88%	53%	200,382
SAMN12905442	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905442)	52,391,528	86%	49%	189,552
SAMN12905444	intestine (Tachysurus fulvidraco, pooled male and female, SAMN12905444)	64,949,770	89%	54%	206,241
SAMN16830887	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830887)	60,386,892	94%	56%	211,509
SAMN16830888	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830888)	63,608,634	94%	56%	217,711
SAMN16830889	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830889)	65,439,092	94%	55%	218,345
SAMN16830890	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830890)	66,953,676	88%	49%	237,021
SAMN16830891	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830891)	56,855,198	87%	47%	229,968
SAMN16830892	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830892)	65,476,654	86%	49%	232,733
SAMN16830893	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830893)	65,859,346	93%	54%	233,191
SAMN16830894	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830894)	66,059,828	94%	55%	220,460
SAMN16830895	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830895)	58,008,810	87%	49%	229,080
SAMN16830896	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830896)	64,412,334	87%	49%	227,359
SAMN16830897	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830897)	53,868,720	74%	53%	221,828
SAMN16830898	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830898)	59,756,950	87%	48%	225,282
SAMN16830899	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830899)	65,815,682	92%	54%	235,859
SAMN16830900	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830900)	69,950,122	92%	55%	239,741
SAMN16830901	gonad (Tachysurus fulvidraco, 65 DPF, female, SAMN16830901)	65,551,886	94%	56%	222,766
SAMN16830902	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830902)	49,771,980	73%	52%	227,725
SAMN16830903	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830903)	64,737,982	74%	51%	227,888
SAMN16830904	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830904)	59,062,464	75%	54%	224,387
SAMN16830905	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830905)	60,001,798	80%	52%	234,145
SAMN16830906	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830906)	71,990,608	89%	50%	229,259
SAMN16830907	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830907)	63,642,792	88%	51%	231,445
SAMN16830908	gonad (Tachysurus fulvidraco, 65 DPF, male, SAMN16830908)	61,781,736	89%	51%	234,122
SAMN17255927	Liver (Tachysurus fulvidraco, one year, SAMN17255927)	54,936,058	90%	54%	154,438
SAMN17255928	Liver (Tachysurus fulvidraco, one year, SAMN17255928)	43,228,912	90%	55%	146,242
SAMN17255929	Liver (Tachysurus fulvidraco, one year, SAMN17255929)	46,031,806	91%	53%	148,453
SAMN17255930	Liver (Tachysurus fulvidraco, one year, SAMN17255930)	52,607,134	89%	54%	150,549
SAMN17255931	Liver (Tachysurus fulvidraco, one year, SAMN17255931)	55,406,810	88%	52%	146,410
SAMN17255932	Liver (Tachysurus fulvidraco, one year, SAMN17255932)	75,135,220	89%	52%	159,898
SAMN17255933	Liver (Tachysurus fulvidraco, one year, SAMN17255933)	41,583,944	90%	53%	137,424
SAMN17255934	Liver (Tachysurus fulvidraco, one year, SAMN17255934)	57,448,692	90%	52%	144,791
SAMN17255935	Liver (Tachysurus fulvidraco, one year, SAMN17255935)	52,921,372	91%	57%	151,576
SAMN17255936	Liver (Tachysurus fulvidraco, one year, SAMN17255936)	55,483,338	90%	56%	148,336
SAMN17255937	Liver (Tachysurus fulvidraco, one year, SAMN17255937)	48,944,908	89%	56%	150,992
SAMN17255938	Liver (Tachysurus fulvidraco, one year, SAMN17255938)	49,276,672	89%	57%	147,539
SAMN17255939	Liver (Tachysurus fulvidraco, one year, SAMN17255939)	41,039,960	91%	60%	136,521
SAMN17255940	Liver (Tachysurus fulvidraco, one year, SAMN17255940)	58,398,948	90%	59%	149,122
SAMN17255941	Liver (Tachysurus fulvidraco, one year, SAMN17255941)	57,317,136	91%	60%	146,346
SAMN17255942	Liver (Tachysurus fulvidraco, one year, SAMN17255942)	59,029,490	91%	61%	153,428
SAMN17255943	Liver (Tachysurus fulvidraco, one year, SAMN17255943)	56,526,354	91%	61%	151,671
SAMN17255944	Liver (Tachysurus fulvidraco, one year, SAMN17255944)	68,236,198	90%	60%	150,952
SAMN17255945	Liver (Tachysurus fulvidraco, one year, SAMN17255945)	58,687,178	92%	61%	153,390
SAMN17255946	Liver (Tachysurus fulvidraco, one year, SAMN17255946)	50,884,676	92%	61%	147,943
SAMN17255947	Liver (Tachysurus fulvidraco, one year, SAMN17255947)	51,765,446	92%	60%	137,674
SAMN17255948	Liver (Tachysurus fulvidraco, one year, SAMN17255948)	52,949,400	91%	59%	145,903
SAMN17255949	Liver (Tachysurus fulvidraco, one year, SAMN17255949)	52,490,710	91%	61%	155,463
SAMN17255950	Liver (Tachysurus fulvidraco, one year, SAMN17255950)	52,183,800	91%	60%	153,920
SAMN17255951	Spleen (Tachysurus fulvidraco, one year, SAMN17255951)	45,380,016	84%	36%	191,744
SAMN17255952	Spleen (Tachysurus fulvidraco, one year, SAMN17255952)	45,850,978	82%	41%	188,287
SAMN17255953	Spleen (Tachysurus fulvidraco, one year, SAMN17255953)	45,583,062	84%	39%	199,658
SAMN17255954	Spleen (Tachysurus fulvidraco, one year, SAMN17255954)	48,533,786	87%	48%	177,645
SAMN17255955	Spleen (Tachysurus fulvidraco, one year, SAMN17255955)	53,383,662	85%	43%	177,040
SAMN17255956	Spleen (Tachysurus fulvidraco, one year, SAMN17255956)	49,168,408	84%	41%	199,240
SAMN17255957	Spleen (Tachysurus fulvidraco, one year, SAMN17255957)	49,919,934	87%	42%	178,193
SAMN17255958	Spleen (Tachysurus fulvidraco, one year, SAMN17255958)	44,062,228	81%	36%	165,180
SAMN17255959	Spleen (Tachysurus fulvidraco, one year, SAMN17255959)	48,035,714	87%	45%	193,809
SAMN17255960	Spleen (Tachysurus fulvidraco, one year, SAMN17255960)	55,864,424	85%	37%	177,336
SAMN17255961	Spleen (Tachysurus fulvidraco, one year, SAMN17255961)	54,005,782	87%	42%	183,504
SAMN17255962	Spleen (Tachysurus fulvidraco, one year, SAMN17255962)	53,100,862	87%	39%	188,205
SAMN17255963	Spleen (Tachysurus fulvidraco, one year, SAMN17255963)	57,916,960	91%	31%	160,074
SAMN17255964	Spleen (Tachysurus fulvidraco, one year, SAMN17255964)	54,085,020	91%	21%	145,266
SAMN17255965	Spleen (Tachysurus fulvidraco, one year, SAMN17255965)	53,750,388	89%	51%	184,929
SAMN17255966	Spleen (Tachysurus fulvidraco, one year, SAMN17255966)	50,416,070	85%	38%	171,742
SAMN17255967	Spleen (Tachysurus fulvidraco, one year, SAMN17255967)	62,082,818	89%	44%	192,465
SAMN17255968	Spleen (Tachysurus fulvidraco, one year, SAMN17255968)	52,141,374	89%	48%	193,470
SAMN17255969	Spleen (Tachysurus fulvidraco, one year, SAMN17255969)	57,689,906	89%	50%	202,528
SAMN17255970	Spleen (Tachysurus fulvidraco, one year, SAMN17255970)	61,287,914	88%	48%	195,596
SAMN17255971	Spleen (Tachysurus fulvidraco, one year, SAMN17255971)	53,832,336	89%	47%	192,714
SAMN17255972	Spleen (Tachysurus fulvidraco, one year, SAMN17255972)	53,802,982	88%	43%	174,492
SAMN17255973	Spleen (Tachysurus fulvidraco, one year, SAMN17255973)	59,436,462	88%	44%	198,090
SAMN17255974	Spleen (Tachysurus fulvidraco, one year, SAMN17255974)	54,652,912	87%	43%	192,052
SAMN18642354	brain (Tachysurus fulvidraco, female, SAMN18642354)	49,008,450	89%	40%	227,469
SAMN18642355	brain (Tachysurus fulvidraco, female, SAMN18642355)	51,731,540	88%	37%	225,912
SAMN18642356	brain (Tachysurus fulvidraco, female, SAMN18642356)	50,036,114	88%	36%	225,791
SAMN18642357	brain (Tachysurus fulvidraco, male, SAMN18642357)	55,788,696	88%	39%	229,400
SAMN18642358	brain (Tachysurus fulvidraco, male, SAMN18642358)	54,904,252	88%	38%	230,690
SAMN18642359	brain (Tachysurus fulvidraco, male, SAMN18642359)	51,271,884	88%	38%	228,571
SAMN18642360	muscle (Tachysurus fulvidraco, female, SAMN18642360)	54,493,498	90%	57%	170,071
SAMN18642361	muscle (Tachysurus fulvidraco, female, SAMN18642361)	50,316,676	91%	56%	181,902
SAMN18642363	muscle (Tachysurus fulvidraco, male, SAMN18642363)	56,380,650	93%	55%	166,592
SAMN18642364	muscle (Tachysurus fulvidraco, male, SAMN18642364)	54,767,266	93%	59%	164,389
SAMN18642365	muscle (Tachysurus fulvidraco, male, SAMN18642365)	52,945,922	93%	58%	166,668
SAMN18827158	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827158)	59,539,818	70%	50%	196,676
SAMN18827159	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827159)	62,831,114	69%	48%	200,676
SAMN18827160	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827160)	102,674,116	69%	49%	211,343
SAMN18827161	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827161)	58,093,250	68%	48%	198,144
SAMN18827162	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827162)	61,258,786	70%	49%	191,831
SAMN18827163	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827163)	61,721,752	68%	47%	198,462
SAMN18827164	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827164)	56,024,176	69%	49%	202,220
SAMN18827165	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827165)	59,598,366	69%	50%	198,506
SAMN18827166	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827166)	52,742,924	75%	47%	205,765
SAMN18827167	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827167)	59,290,736	68%	49%	212,506
SAMN18827168	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827168)	53,909,448	69%	49%	200,192
SAMN18827169	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827169)	60,356,458	68%	45%	191,985
SAMN18827170	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827170)	51,746,322	69%	50%	199,340
SAMN18827171	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827171)	54,154,468	68%	50%	196,154
SAMN18827172	adult, head-kidney (Tachysurus fulvidraco, 1, SAMN18827172)	60,549,348	70%	53%	201,937
SAMN20348489	testis (Tachysurus fulvidraco, male, SAMN20348489)	122,351,280	89%	48%	255,352
SAMN20348708	testis (Tachysurus fulvidraco, male, SAMN20348708)	142,023,718	87%	47%	254,721
SAMN20348744	ovary (Tachysurus fulvidraco, female, SAMN20348744)	138,765,668	93%	56%	207,984

Alignment statistics, by run (ERR, SRR, DRR)

Run	Experiment	Project	Sample	Number of reads	Percent aligned reads	Percent of aligned reads with introns
DRR319703	DRX309051	DRP007725	SAMD00406531	33,687,020	90%	63%
DRR319704	DRX309052	DRP007725	SAMD00406532	46,452,802	90%	63%
DRR319705	DRX309053	DRP007725	SAMD00406533	45,929,958	89%	62%
DRR319706	DRX309054	DRP007725	SAMD00406534	47,822,402	77%	58%
DRR319707	DRX309055	DRP007725	SAMD00406535	54,775,310	90%	61%
DRR319708	DRX309056	DRP007725	SAMD00406536	70,168,008	90%	61%
SRR1017548	SRX368267	SRP032172	SAMN02384151	1,202,933	76%	41%
SRR1630912	SRX743689	SRP049265	SAMN03142394	39,631,124	93%	42%
SRR1630913	SRX743689	SRP049265	SAMN03142394	46,416,046	84%	41%
SRR1630902	SRX743686	SRP049265	SAMN03142395	23,884,671	88%	32%
SRR1630908	SRX743686	SRP049265	SAMN03142395	20,000,000	84%	33%
SRR1630906	SRX743688	SRP049265	SAMN03142395	51,138,782	87%	32%
SRR1630909	SRX743688	SRP049265	SAMN03142395	44,970,338	84%	33%
SRR1630910	SRX743688	SRP049265	SAMN03142395	48,229,490	92%	38%
SRR1630911	SRX743688	SRP049265	SAMN03142395	39,845,286	85%	41%
SRR2002564	SRX1014127	SRP057554	SAMN03497639	44,243,514	77%	24%
SRR3198582	SRX1605866	SRP070973	SAMN04517535	87,268,036	87%	39%
SRR5458551	SRX2746578	SRP104276	SAMN06765420	25,485,026	86%	42%
SRR5458550	SRX2746577	SRP104276	SAMN06765421	25,700,595	87%	44%
SRR8648133	SRX5445916	SRP181791	SAMN10790437	45,566,650	88%	59%
SRR8648134	SRX5445915	SRP181791	SAMN10790438	48,509,818	87%	57%
SRR8648131	SRX5445918	SRP181791	SAMN10790439	50,642,386	87%	57%
SRR8648132	SRX5445917	SRP181791	SAMN10790440	45,047,268	87%	57%
SRR8648137	SRX5445912	SRP181791	SAMN10790441	33,789,114	87%	56%
SRR8648138	SRX5445911	SRP181791	SAMN10790442	50,415,682	87%	56%
SRR8648135	SRX5445914	SRP181791	SAMN10790443	56,646,574	87%	55%
SRR8648136	SRX5445913	SRP181791	SAMN10790444	49,652,854	87%	55%
SRR8648130	SRX5445919	SRP181791	SAMN10790445	43,650,726	85%	53%
SRR9595711	SRX6361424	SRP211948	SAMN12138640	78,071,786	87%	48%
SRR9595712	SRX6361423	SRP211948	SAMN12138641	65,426,346	86%	47%
SRR10226806	SRX6946356	SRP224106	SAMN12905434	67,221,512	88%	51%
SRR10226804	SRX6946358	SRP224106	SAMN12905436	60,918,976	88%	51%
SRR10226803	SRX6946359	SRP224106	SAMN12905438	57,812,956	89%	55%
SRR10226808	SRX6946360	SRP224106	SAMN12905440	53,341,062	88%	53%
SRR10226807	SRX6946355	SRP224106	SAMN12905442	52,391,528	86%	49%
SRR10226805	SRX6946357	SRP224106	SAMN12905444	64,949,770	89%	54%
SRR13089906	SRX9535807	SRP293339	SAMN16830887	60,386,892	94%	56%
SRR13089905	SRX9535808	SRP293339	SAMN16830888	63,608,634	94%	56%
SRR13089894	SRX9535819	SRP293339	SAMN16830889	65,439,092	94%	55%
SRR13089891	SRX9535822	SRP293339	SAMN16830890	66,953,676	88%	49%
SRR13089890	SRX9535823	SRP293339	SAMN16830891	56,855,198	87%	47%
SRR13089889	SRX9535824	SRP293339	SAMN16830892	65,476,654	86%	49%
SRR13089888	SRX9535825	SRP293339	SAMN16830893	65,859,346	93%	54%
SRR13089887	SRX9535826	SRP293339	SAMN16830894	66,059,828	94%	55%
SRR13089886	SRX9535827	SRP293339	SAMN16830895	58,008,810	87%	49%
SRR13089885	SRX9535828	SRP293339	SAMN16830896	64,412,334	87%	49%
SRR13089904	SRX9535809	SRP293339	SAMN16830897	53,868,720	74%	53%
SRR13089903	SRX9535810	SRP293339	SAMN16830898	59,756,950	87%	48%
SRR13089902	SRX9535811	SRP293339	SAMN16830899	65,815,682	92%	54%
SRR13089901	SRX9535812	SRP293339	SAMN16830900	69,950,122	92%	55%
SRR13089900	SRX9535813	SRP293339	SAMN16830901	65,551,886	94%	56%
SRR13089899	SRX9535814	SRP293339	SAMN16830902	49,771,980	73%	52%
SRR13089898	SRX9535815	SRP293339	SAMN16830903	64,737,982	74%	51%
SRR13089897	SRX9535816	SRP293339	SAMN16830904	59,062,464	75%	54%
SRR13089896	SRX9535817	SRP293339	SAMN16830905	60,001,798	80%	52%
SRR13089895	SRX9535818	SRP293339	SAMN16830906	71,990,608	89%	50%
SRR13089893	SRX9535820	SRP293339	SAMN16830907	63,642,792	88%	51%
SRR13089892	SRX9535821	SRP293339	SAMN16830908	61,781,736	89%	51%
SRR13385461	SRX9804701	SRP300845	SAMN17255927	54,936,058	90%	54%
SRR13385460	SRX9804702	SRP300845	SAMN17255928	43,228,912	90%	55%
SRR13385441	SRX9804720	SRP300845	SAMN17255929	46,031,806	91%	53%
SRR13385430	SRX9804731	SRP300845	SAMN17255930	52,607,134	89%	54%
SRR13385419	SRX9804742	SRP300845	SAMN17255931	55,406,810	88%	52%
SRR13385454	SRX9804708	SRP300845	SAMN17255932	75,135,220	89%	52%
SRR13385453	SRX9804709	SRP300845	SAMN17255933	41,583,944	90%	53%
SRR13385452	SRX9804710	SRP300845	SAMN17255934	57,448,692	90%	52%
SRR13385451	SRX9804711	SRP300845	SAMN17255935	52,921,372	91%	57%
SRR13385449	SRX9804712	SRP300845	SAMN17255936	55,483,338	90%	56%
SRR13385459	SRX9804703	SRP300845	SAMN17255937	48,944,908	89%	56%
SRR13385458	SRX9804704	SRP300845	SAMN17255938	49,276,672	89%	57%
SRR13385450	SRX9804705	SRP300845	SAMN17255939	41,039,960	91%	60%
SRR13385448	SRX9804713	SRP300845	SAMN17255940	58,398,948	90%	59%
SRR13385447	SRX9804714	SRP300845	SAMN17255941	57,317,136	91%	60%
SRR13385446	SRX9804715	SRP300845	SAMN17255942	59,029,490	91%	61%
SRR13385445	SRX9804716	SRP300845	SAMN17255943	56,526,354	91%	61%
SRR13385444	SRX9804717	SRP300845	SAMN17255944	68,236,198	90%	60%
SRR13385443	SRX9804718	SRP300845	SAMN17255945	58,687,178	92%	61%
SRR13385442	SRX9804719	SRP300845	SAMN17255946	50,884,676	92%	61%
SRR13385440	SRX9804721	SRP300845	SAMN17255947	51,765,446	92%	60%
SRR13385439	SRX9804722	SRP300845	SAMN17255948	52,949,400	91%	59%
SRR13385438	SRX9804723	SRP300845	SAMN17255949	52,490,710	91%	61%
SRR13385437	SRX9804724	SRP300845	SAMN17255950	52,183,800	91%	60%
SRR13385436	SRX9804725	SRP300845	SAMN17255951	45,380,016	84%	36%
SRR13385435	SRX9804726	SRP300845	SAMN17255952	45,850,978	82%	41%
SRR13385434	SRX9804727	SRP300845	SAMN17255953	45,583,062	84%	39%
SRR13385433	SRX9804728	SRP300845	SAMN17255954	48,533,786	87%	48%
SRR13385432	SRX9804729	SRP300845	SAMN17255955	53,383,662	85%	43%
SRR13385431	SRX9804730	SRP300845	SAMN17255956	49,168,408	84%	41%
SRR13385429	SRX9804732	SRP300845	SAMN17255957	49,919,934	87%	42%
SRR13385428	SRX9804733	SRP300845	SAMN17255958	44,062,228	81%	36%
SRR13385427	SRX9804734	SRP300845	SAMN17255959	48,035,714	87%	45%
SRR13385426	SRX9804735	SRP300845	SAMN17255960	55,864,424	85%	37%
SRR13385425	SRX9804736	SRP300845	SAMN17255961	54,005,782	87%	42%
SRR13385424	SRX9804737	SRP300845	SAMN17255962	53,100,862	87%	39%
SRR13385423	SRX9804738	SRP300845	SAMN17255963	57,916,960	91%	31%
SRR13385422	SRX9804739	SRP300845	SAMN17255964	54,085,020	91%	21%
SRR13385421	SRX9804740	SRP300845	SAMN17255965	53,750,388	89%	51%
SRR13385420	SRX9804741	SRP300845	SAMN17255966	50,416,070	85%	38%
SRR13385418	SRX9804743	SRP300845	SAMN17255967	62,082,818	89%	44%
SRR13385417	SRX9804744	SRP300845	SAMN17255968	52,141,374	89%	48%
SRR13385416	SRX9804745	SRP300845	SAMN17255969	57,689,906	89%	50%
SRR13385415	SRX9804746	SRP300845	SAMN17255970	61,287,914	88%	48%
SRR13385414	SRX9804747	SRP300845	SAMN17255971	53,832,336	89%	47%
SRR13385457	SRX9804748	SRP300845	SAMN17255972	53,802,982	88%	43%
SRR13385456	SRX9804706	SRP300845	SAMN17255973	59,436,462	88%	44%
SRR13385455	SRX9804707	SRP300845	SAMN17255974	54,652,912	87%	43%
SRR14213624	SRX10580112	SRP314481	SAMN18642354	49,008,450	89%	40%
SRR14213623	SRX10580113	SRP314481	SAMN18642355	51,731,540	88%	37%
SRR14213620	SRX10580116	SRP314481	SAMN18642356	50,036,114	88%	36%
SRR14213619	SRX10580117	SRP314481	SAMN18642357	55,788,696	88%	39%
SRR14213618	SRX10580118	SRP314481	SAMN18642358	54,904,252	88%	38%
SRR14213617	SRX10580119	SRP314481	SAMN18642359	51,271,884	88%	38%
SRR14213616	SRX10580120	SRP314481	SAMN18642360	54,493,498	90%	57%
SRR14213615	SRX10580121	SRP314481	SAMN18642361	50,316,676	91%	56%
SRR14213614	SRX10580122	SRP314481	SAMN18642363	56,380,650	93%	55%
SRR14213622	SRX10580114	SRP314481	SAMN18642364	54,767,266	93%	59%
SRR14213621	SRX10580115	SRP314481	SAMN18642365	52,945,922	93%	58%
SRR14306667	SRX10662066	SRP315953	SAMN18827158	59,539,818	70%	50%
SRR14306666	SRX10662067	SRP315953	SAMN18827159	62,831,114	69%	48%
SRR14306660	SRX10662073	SRP315953	SAMN18827160	102,674,116	69%	49%
SRR14306659	SRX10662074	SRP315953	SAMN18827161	58,093,250	68%	48%
SRR14306658	SRX10662075	SRP315953	SAMN18827162	61,258,786	70%	49%
SRR14306657	SRX10662076	SRP315953	SAMN18827163	61,721,752	68%	47%
SRR14306656	SRX10662077	SRP315953	SAMN18827164	56,024,176	69%	49%
SRR14306655	SRX10662078	SRP315953	SAMN18827165	59,598,366	69%	50%
SRR14306654	SRX10662079	SRP315953	SAMN18827166	52,742,924	75%	47%
SRR14306653	SRX10662080	SRP315953	SAMN18827167	59,290,736	68%	49%
SRR14306665	SRX10662068	SRP315953	SAMN18827168	53,909,448	69%	49%
SRR14306664	SRX10662069	SRP315953	SAMN18827169	60,356,458	68%	45%
SRR14306663	SRX10662070	SRP315953	SAMN18827170	51,746,322	69%	50%
SRR14306662	SRX10662071	SRP315953	SAMN18827171	54,154,468	68%	50%
SRR14306661	SRX10662072	SRP315953	SAMN18827172	60,549,348	70%	53%
SRR17999931	SRX14155568	SRP329957	SAMN20348489	40,721,218	89%	48%
SRR17999930	SRX14155569	SRP329957	SAMN20348489	39,790,532	89%	48%
SRR17999929	SRX14155570	SRP329957	SAMN20348489	41,839,530	89%	48%
SRR17999849	SRX14155486	SRP329957	SAMN20348708	46,292,628	88%	47%
SRR17999848	SRX14155487	SRP329957	SAMN20348708	48,980,416	88%	47%
SRR17999847	SRX14155488	SRP329957	SAMN20348708	46,750,674	86%	47%
SRR17999808	SRX14155445	SRP329957	SAMN20348744	45,001,046	93%	56%
SRR17999807	SRX14155446	SRP329957	SAMN20348744	45,993,308	92%	55%
SRR17999806	SRX14155447	SRP329957	SAMN20348744	47,771,314	93%	56%
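
The by-sample read counts appear to be the sums of the per-run counts for each BioSample. The sketch below checks this for one sample, using the three runs listed for SAMN20348489. The helper name `reads_per_sample` is illustrative, and the percentage columns are left out because the report does not state how they are combined across runs.

```python
from collections import defaultdict

# (run, sample, number of reads) for one sample, copied from the by-run table.
runs = [
    ("SRR17999931", "SAMN20348489", 40_721_218),
    ("SRR17999930", "SAMN20348489", 39_790_532),
    ("SRR17999929", "SAMN20348489", 41_839_530),
]


def reads_per_sample(rows):
    """Sum the read counts of all runs that belong to the same sample."""
    totals = defaultdict(int)
    for _run, sample, n_reads in rows:
        totals[sample] += n_reads
    return dict(totals)


totals = reads_per_sample(runs)
print(f"{totals['SAMN20348489']:,}")  # prints 122,351,280
```

The result matches the 122,351,280 reads reported for SAMN20348489 in the by-sample table.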

Protein alignments

Source	Number of sequences retrieved from Entrez	Number (%) of sequences aligned by ProSplign	Number (%) of sequences passed to Gnomon	Average % identity	Average % coverage
Same-species GenBank	563	559 (99.29%)	559 (99.29%)	77.52%	85.13%
Pygocentrus nattereri high-quality model RefSeq (XP_)	19,896	19,552 (98.27%)	19,552 (98.27%)	70.11%	80.80%
Actinopterygii GenBank	89,106	83,800 (94.05%)	83,800 (94.05%)	69.09%	80.63%
Actinopterygii known RefSeq (NP_)	25,472	24,241 (95.17%)	24,241 (95.17%)	69.01%	80.05%
Danio rerio high-quality model RefSeq (XP_)	7,717	7,430 (96.28%)	7,430 (96.28%)	67.61%	76.83%
Ictalurus punctatus high-quality model RefSeq (XP_)	17,014	16,855 (99.07%)	16,855 (99.07%)	74.12%	83.68%
Oreochromis niloticus high-quality model RefSeq (XP_)	19,546	18,725 (95.80%)	18,725 (95.80%)	66.37%	76.14%
Homo sapiens known RefSeq (NP_)	63,852	53,306 (83.48%)	53,306 (83.48%)	67.53%	71.37%
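
The "Number (%)" columns in the alignment tables give each count as a percentage of the sequences retrieved from Entrez. The sketch below reproduces two rows of the table above. The function name is illustrative, and formatting to two decimal places is an assumption based on how the table is printed.

```python
def number_with_percent(count: int, retrieved: int) -> str:
    """Format a count as 'N (P%)' relative to the number retrieved."""
    return f"{count:,} ({100.0 * count / retrieved:.2f}%)"


# Values copied from the "Same-species GenBank" and "Homo sapiens" rows.
print(number_with_percent(559, 563))        # prints 559 (99.29%)
print(number_with_percent(53_306, 63_852))  # prints 53,306 (83.48%)
```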

Assembly-assembly alignments of current to previous assembly

When the assembly changes between two rounds of annotation, genes in the current and the previous annotation are mapped to each other using the genomic alignments of the current assembly to the previous assembly so that gene identifiers can be preserved. The success of the remapping depends largely on how well the two assembly versions align to each other.

Below are the percent coverage of one assembly by the other and the average percent identity of the alignments. The 'First pass' alignments are reciprocal best hits, while the 'Total' alignments also include 'Second pass' or non-reciprocal best alignments. For more information about the assembly-assembly alignment process, please visit the NCBI Genome Remapping Service page.

	First Pass	Total
HZAU_PFXX_2.0 (Current) Coverage	92.16%	94.18%
REGT01 (Previous) Coverage	92.75%	94.55%
Percent Identity	93.24%	93.30%
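
The distinction between reciprocal best hits and non-reciprocal best alignments can be illustrated with a toy example. This is a sketch of the general idea only, not the NCBI remapping implementation: the region names, the scores and both function names are hypothetical, and the real 'Second pass' may select alignments differently.

```python
def best_hits(scores: dict[tuple[str, str], float], side: int) -> dict[str, str]:
    """Best-scoring partner for every region on one side (0 or 1) of the pairs."""
    best: dict[str, tuple[str, float]] = {}
    for pair, score in scores.items():
        key, partner = pair[side], pair[1 - side]
        if key not in best or score > best[key][1]:
            best[key] = (partner, score)
    return {key: partner for key, (partner, _score) in best.items()}


def split_passes(scores):
    """Split pairs into reciprocal best hits and remaining one-way best hits."""
    cur_best = best_hits(scores, 0)   # current region -> best previous region
    prev_best = best_hits(scores, 1)  # previous region -> best current region
    first = {(c, p) for c, p in cur_best.items() if prev_best.get(p) == c}
    one_way = set(cur_best.items()) | {(c, p) for p, c in prev_best.items()}
    return first, one_way - first


# Hypothetical alignment scores between regions of two assembly versions.
toy_scores = {("cur1", "prev1"): 95.0, ("cur2", "prev1"): 80.0, ("cur2", "prev2"): 60.0}
first_pass, second_pass = split_passes(toy_scores)
```

Here `cur1` and `prev1` choose each other and form the only reciprocal pair. The two pairs involving `cur2` are best hits in one direction only, so they land in the second set.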

Comparison of the current and previous annotations

The annotation produced for this release (101) was compared to the annotation in the previous release (100) for each assembly annotated in both releases. Scores for current and previous gene and transcript features were calculated based on overlap in exon sequence and matches in exon boundaries. Pairs of current and previous features were categorized based on these scores, whether they are reciprocal best matches, and changes in attributes (gene biotype, completeness, etc.). If the assembly was updated between the two releases, alignments between the current and the previous assembly were used to match the current and previous gene and transcript features in mapped regions.

The table below summarizes the changes in the gene set for each assembly as a percent of the number of genes in the current annotation release, and provides links to the details of the comparison in tabular format and in a Genome Workbench project.

	HZAU_PFXX_2.0 (Current) to ASM372403v1 (Previous)
Identical	4%
Minor changes	60%
Major changes	11%
New	24%
Deprecated	13%
Other	2%
Download the report	tabular, Genome Workbench
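
The comparison relies on scores derived from overlap in exon sequence and matches in exon boundaries. The sketch below shows one possible way to compute such quantities for a pair of features. It is not the scoring used by the pipeline: the function names, the overlap definition (shared bases over the union) and the coordinates are assumptions made for illustration, and no category thresholds are implied.

```python
Exon = tuple[int, int]  # half-open interval [start, end) on the same strand


def exon_overlap_fraction(current: list[Exon], previous: list[Exon]) -> float:
    """Shared exonic bases divided by the bases covered by either feature."""
    # Position sets keep the sketch short; real data would use interval arithmetic.
    cur_bases = {pos for start, end in current for pos in range(start, end)}
    prev_bases = {pos for start, end in previous for pos in range(start, end)}
    union = cur_bases | prev_bases
    return len(cur_bases & prev_bases) / len(union) if union else 0.0


def shared_boundaries(current: list[Exon], previous: list[Exon]) -> int:
    """Number of exon start/end coordinates present in both features."""
    cur_edges = {edge for exon in current for edge in exon}
    prev_edges = {edge for exon in previous for edge in exon}
    return len(cur_edges & prev_edges)


# Hypothetical coordinates: the second exon is 50 bases longer in the current feature.
cur = [(100, 200), (300, 450)]
prev = [(100, 200), (300, 400)]
print(exon_overlap_fraction(cur, prev))  # prints 0.8
print(shared_boundaries(cur, prev))      # prints 3
```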

References

RefSeq: Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, Farrell CM, Hart J, Landrum MJ, McGarvey KM, Murphy MR, O'Leary NA, Pujar S, Rajput B, Rangwala SH, Riddick LD, Shkeda A, Sun H, Tamez P, Tully RE, Wallin C, Webb D, Weber J, Wu W, Dicuccio M, Kitts P, Maglott DR, Murphy TD, Ostell JM. Nucleic Acids Research 2014, 42(Database issue):D756-63
BUSCO: Manni M, Berkeley MR, Seppey M, Simão FA, Zdobnov EM. Molecular Biology and Evolution 2021, 38(10):4647-4654
RepeatMasker: Smit AFA, Hubley R, Green P. RepeatMasker Open-3.0. 1996–2004. http://www.repeatmasker.org
WindowMasker: Morgulis A, Gertz EM, Schäffer AA, Agarwala R. Bioinformatics 2006, 22(2):134-41
Splign: Kapustin Y, Souvorov A, Tatusova T, Lipman D. Biology Direct 2008, 3:20
Minimap2: Li H. Bioinformatics 2018, 34(18):3094-3100
