Here we scaffolded the wild camel genome assembly (GCF_000311805.1) using our improved dromedary camel genome assembly (CamDro3, GCA_000803125.3) as a reference.
We used CamDro3 in a reference-guided assembly strategy, implemented with Ragout v. 2.0 (Kolmogorov et al., 2014), to upgrade the Camelus ferus genome assembly (CamFer1, GCF_000311805.1; Wang et al., 2012) to chromosome-level scale. Briefly, we used Progressive Cactus (GitHub commit c4bed56c0cd48d23411038acb9c19bcae054837e; Paten et al., 2011a; Paten et al., 2011b) with default settings to generate an alignment between CamDro3 and CamFer1 in HAL (hierarchical alignment) format, and then ran Ragout with the refine and small synteny block settings to convert the alignment to FASTA, upgrading the CamFer1 assembly to CamFer2. Before alignment with Progressive Cactus, we repeat-masked CamDro3 with RepeatMasker v. open-4.0.8 (http://www.repeatmasker.org) against the mammal repeats from RepBase RepeatMaskerEdition-20181026 (Jurka et al., 2005). We filled gaps in CamFer2 with GapFiller v. 3.0 (Boetzer & Pirovano, 2012) using default settings and Bowtie (Langmead et al., 2009) as the aligner. The paired-end reads used for gap filling were the Illumina short reads from the original assembly (SRA accession: SRR671683), which we trimmed with BBDuk v. 37.76 (https://sourceforge.net/projects/bbmap/) using the following settings: ktrim=r, k=23, mink=11, hdist=1, tpe, tbo, qtrim=rl, trimq=15.
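The read-trimming step can be sketched as a small Python helper that assembles the BBDuk command line from the settings listed above. This is an illustrative sketch only, not the authors' script: the file names and the adapter reference passed to `ref=` are hypothetical placeholders (the text does not state which adapter file was used), and the function only builds the argument list rather than running BBDuk.

```python
import shlex

# Trimming settings exactly as listed in the text.
BBDUK_SETTINGS = [
    "ktrim=r", "k=23", "mink=11", "hdist=1",
    "tpe", "tbo", "qtrim=rl", "trimq=15",
]


def bbduk_command(reads_1, reads_2, out_1, out_2, adapters):
    """Return the bbduk.sh argument list for one paired-end library.

    All five paths are supplied by the caller; the adapter FASTA is an
    assumption, since the source does not name the reference used for
    k-mer trimming.
    """
    return [
        "bbduk.sh",
        f"in1={reads_1}",
        f"in2={reads_2}",
        f"out1={out_1}",
        f"out2={out_2}",
        f"ref={adapters}",
        *BBDUK_SETTINGS,
    ]


if __name__ == "__main__":
    # Hypothetical file names for the SRR671683 read pairs.
    cmd = bbduk_command(
        "SRR671683_1.fastq.gz",
        "SRR671683_2.fastq.gz",
        "SRR671683_1.trimmed.fastq.gz",
        "SRR671683_2.trimmed.fastq.gz",
        "adapters.fa",
    )
    # Print a shell-safe command string for inspection or logging.
    print(shlex.join(cmd))
```

Keeping the settings in one list makes it easy to confirm that the command matches the parameters reported in the text before passing it to a job scheduler or `subprocess.run`.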