This project provides the genome assembly of the Tasmanian devil (Sarcophilus harrisii). DNA was extracted from a fibroblast cell line established from a healthy female animal, provided by Elizabeth Murchison and Max Stammnitz (Department of Veterinary Medicine, University of Cambridge, UK). The assembly is provided by the Wellcome Sanger Institute and University of Cambridge team of the Vertebrate Genomes Project (https://www.sanger.ac.uk/science/data/vertebrate-genomes-sequencing).
Data citation
For use of these resources, please cite: The evolution of two transmissible cancers in Tasmanian devils (Stammnitz et al. 2023, Science 380:6642)
Comment
The assembly mSarHar1.11 is based on ~88x ONT data, including 10x ultra-long reads sequenced at Oxford Nanopore Technologies, and on 60x 10X Genomics Chromium data, BioNano data, ~50x Illumina HiSeq XTen data and 60x Dovetail Hi-C data generated at the Wellcome Sanger Institute. The assembly process comprised the following steps, in order: initial ONT assembly generation with WTDBG, 10X-based scaffolding with scaff10x, BioNano hybrid scaffolding, Hi-C-based scaffolding with scaffHiC, WTDBG2/Racon polishing, and two rounds of FreeBayes polishing. Finally, the assembly was analysed and manually improved using gEVAL, and haplotigs were removed. Chromosome-scale scaffolds were confirmed using the Hi-C data. Chromosomes are named according to established convention, and the labels for chromosomes 1 and 2 are switched relative to a previous genome assembly, Devil7.0 (i.e. the chromosome labelled 1 in Devil7.0 is labelled 2 in the current assembly, and vice versa). Two contigs derived from the Y chromosome of a different individual, sequenced with Illumina HiSeq 2000, have been added. The MT sequence from Devil7.0 has also been included.
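Because the labels for chromosomes 1 and 2 are switched relative to Devil7.0, results keyed by Devil7.0 chromosome labels may need relabelling before they are compared with this assembly. The following is a minimal Python sketch of that relabelling, not part of the assembly release. The function name `devil7_to_msarhar1` is hypothetical, and the assumption that chromosomes are identified by plain labels such as "1", "2" or "X", optionally prefixed with "chr", is ours; actual sequence names and accessions in either assembly may differ. Only the labels are converted, not coordinates.

```python
# Hypothetical helper: convert a Devil7.0 chromosome label to the label used
# in mSarHar1.11. Only labels 1 and 2 are swapped; all others are unchanged.
# Assumption: labels look like "1", "2", "X", optionally with a "chr" prefix.
SWAPPED_LABELS = {"1": "2", "2": "1"}


def devil7_to_msarhar1(label: str) -> str:
    """Return the mSarHar1.11 chromosome label for a Devil7.0 label."""
    key = label.strip()
    prefix = ""
    if key.lower().startswith("chr"):
        # Keep the prefix exactly as written and convert only the label part.
        prefix, key = key[:3], key[3:]
    return prefix + SWAPPED_LABELS.get(key, key)


if __name__ == "__main__":
    for old in ["1", "2", "3", "chr1", "X"]:
        print(old, "->", devil7_to_msarhar1(old))
```

Since the change is a plain swap, applying the same function a second time converts labels back from mSarHar1.11 to Devil7.0.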