Thamnophis_sirtalis-6.0

Organism name:
Thamnophis sirtalis (snakes)
Isolate:
EDBJR-23777
Sex:
female
BioSample:
SAMN03759628
BioProject:
PRJNA189551
Submitter:
The Genome Institute at Washington University School of Medicine (WUGSC)
Date:
2015/06/26
Synonyms:
thaSir1
Assembly level:
Scaffold
Genome representation:
full
RefSeq category:
representative genome
GenBank assembly accession:
GCA_001077635.2 (latest)
RefSeq assembly accession:
GCF_001077635.1 (latest)
RefSeq assembly and GenBank assembly identical:
yes
WGS Project:
LFLD01
Assembly method:
ALLPATHS-LG v. May 2015
Genome coverage:
72x
Sequencing technology:
Illumina

IDs: 472161 [UID] 2207688 [GenBank] 2303698 [RefSeq]

There are 3 assemblies for this organism

Comment

Background: DNA for shotgun sequencing of the common garter snake, Thamnophis sirtalis, was derived from an adult female (Sample ID UTA R 62823; field tag EDBJR-23777) collected by Brian Gall in June 2010. The specific locality is Coffin Butte Rd., ... Benton Co. Oregon, 44 degrees 41 57.02 N, 123 degrees 13 18.06 W. Elevation 229 ft. The Willamette valley specimens are typically assigned to T. s. concinnus. The genomic DNA was extracted from skeletal muscle by the laboratory of Michael Pfender, Department of Biological Sciences, University of Notre Dame.
An estimated genome size of 1.5Gb was used for the construction of libraries. Total assembled sequence coverage of Illumina instrument reads was 72X, including 43X fragments and 29X long insert reads. The combined sequence reads were assembled using the ALLPATHS-LG software (Gnerre 2011). Post-assembly improvements included merging (GAA.pl, Yao 2011) the assembly with several other ALLPATHS-LG assemblies built from different datasets. The contigs of the merged 6.0 assembly were reordered by L_RNA_scaffolder (Xue 2013), and SSPACE (Boetzer 2010) was used to further improve scaffolding. Finally, a custom script was used to close gaps. This 6.0 version has been screened and cleaned of contaminating contigs, and all contigs 200bp or smaller were removed.
The assembly is made up of a total of 7930 scaffolds, including singletons (single-contig scaffolds), with an N50 scaffold length of 516,389bp. The N50 contig length is 10,447bp. This assembly spans 1.44Gb including gaps and singleton scaffolds. A CEGMA (Parra 2009) assessment of the assembly revealed that 74.19% of the garter snake core eukaryote genes (CEGs) are complete, while 89.92% of the CEGs at least partially mapped to the assembly. The assembled sequences total 1.12Gb, which is almost 400Mb less than the estimated genome size. The small assembled genome size is likely a result of a large number of repetitive sequences failing to assemble.
The garter snake assembly was analyzed for repeats with Repeat Modeler (Smit and Hubley), which identified only 22.52% of the assembly as repeated sequence. The repeat content appears to have been underestimated, because of the failure of as much as 25% of the genome to assemble.
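The size comparison in the comment can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the submitter's pipeline: the constant and function names are illustrative, and the inputs are the 1.5Gb genome size estimate and the total contig bases quoted in this record.

```python
# Figures quoted in this record; the names are illustrative only.
ESTIMATED_GENOME_BP = 1_500_000_000  # 1.5 Gb estimate used for library construction
ASSEMBLED_BP = 1_122_756_122         # "Total contig bases" from the comment statistics


def assembly_shortfall(estimated_bp, assembled_bp):
    """Return (missing bases, missing fraction of the estimated genome)."""
    missing = estimated_bp - assembled_bp
    return missing, missing / estimated_bp


missing_bp, missing_fraction = assembly_shortfall(ESTIMATED_GENOME_BP, ASSEMBLED_BP)
print(f"Unassembled: {missing_bp:,} bp ({missing_fraction:.1%} of the estimate)")
```

Under these inputs the shortfall is a little under 400Mb, roughly a quarter of the estimated genome, which matches the "almost 400Mb" and "as much as 25%" statements in the comment.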
 This work was supported by an NHGRI grant to Richard K. Wilson. For questions regarding this T. sirtalis 6.0 assembly please contact Wesley C. Warren, wwarren@watson.wustl.edu, at The McDonnell Genome Institute.
 DNA samples can be obtained from:
 Michael Pfender, Department of Biological Sciences, University of Notre Dame, 109B Galvin Life Sciences, Notre Dame, IN 46556

 Sequence and Assembly Credits:

 Source DNA - Michael Pfender, Department of Biological Sciences, University of Notre Dame
 Genome Sequence - The McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO.
 Sequence Assembly - The McDonnell Genome Institute, Washington University School of Medicine, St Louis, MO.
 It is requested that users of this Thamnophis sirtalis 6.0 assembly acknowledge Richard K. Wilson and the McDonnell Genome Institute, Washington University School of Medicine, in any publications that result from use of this sequence assembly.
 Assembly Statistics
 *** Contiguity: Contig ***
 Total contig number: 175994
 Total contig bases: 1122756122 bp
 Average contig length: 6380 bp
 Maximum contig length: 171669 bp
 N50 contig length: 10447 bp
 N50 contig number: 28683
 *** Contiguity: Supercontig ***
 Total supercontig number: 7947
 Average supercontig length: 141280 bp
 Maximum supercontig length: 3234367 bp
 N50 supercontig length: 516389 bp
 N50 supercontig number: 626
 *** Scaffold Distribution ***
 Scaffolds > 1M: 179
 Scaffold 250K--1M: 1257
 Scaffold 100K--250K: 1078
 Scaffold 10--100K: 1814
 Scaffold 5--10K: 414
 Scaffold 2--5K: 836
 Scaffold 0--2K: 2369

Global statistics

Total sequence length: 1,424,897,867
Total ungapped length: 1,122,701,795
Gaps between scaffolds: 0
Number of scaffolds: 7,930
Scaffold N50: 647,592
Scaffold L50: 639
Number of contigs: 175,977
Contig N50: 10,447
Contig L50: 28,683
Total number of chromosomes and plasmids: 0
Number of component sequences (WGS or clone): 175,977
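
The N50 and L50 values above follow the usual definition: with sequences sorted from longest to shortest, N50 is the length of the sequence at which the running total first reaches half of the summed length, and L50 is how many sequences that takes. The sketch below illustrates that definition only. The function name and the toy lengths are made up for illustration, and the real scaffold lengths would have to come from the full sequence report.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a collection of sequence lengths."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        # The first sequence that brings the running total to at least
        # half of the summed length defines both N50 and L50.
        if running >= half_total:
            return length, count
    raise ValueError("lengths must be non-empty")


# Toy lengths, not taken from this assembly.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(f"N50 = {n50}, L50 = {l50}")
```

Applied to the scaffold lengths of this assembly, the same calculation would be expected to give the Scaffold N50 and Scaffold L50 reported above.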

Global assembly definition

Assembly Unit Name
Primary Assembly
The primary assembly unit does not have any assembled chromosomes or linkage groups.
Please download the full sequence report for information on the scaffolds.

Assembly statistics

Molecule  Total Length   Scaffold Count  Ungapped Length  Scaffold N50  Spanned Gaps  Unspanned Gaps
unplaced  1,424,897,867  7,930           1,122,701,795    647,592       168,047       0
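
The figures in this table can be cross-checked against each other. The sketch below assumes that every join between adjacent contigs within a scaffold is counted as one spanned gap, so that spanned gaps equal contigs minus scaffolds when there are no unspanned gaps. The variable names are illustrative and the values are copied from this record.

```python
# Values copied from the statistics in this record.
total_length = 1_424_897_867
ungapped_length = 1_122_701_795
scaffolds = 7_930
contigs = 175_977
spanned_gaps = 168_047

# Bases that sit in gaps rather than in sequenced contigs.
gap_bases = total_length - ungapped_length

# Assumption: each within-scaffold contig join is one spanned gap.
expected_spanned_gaps = contigs - scaffolds

print(f"Gap bases: {gap_bases:,} ({gap_bases / total_length:.1%} of total length)")
print(f"Expected spanned gaps: {expected_spanned_gaps:,}")
print(f"Matches reported value: {expected_spanned_gaps == spanned_gaps}")
```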