Expression profiling by high throughput sequencing
Non-coding RNA profiling by high throughput sequencing
Methylation profiling by high throughput sequencing
Genome binding/occupancy profiling by high throughput sequencing
Summary
Over the last 30 years, the soil bacterium Agrobacterium tumefaciens has been the workhorse tool for plant genome engineering. Replacement of native tumor-inducing (Ti) plasmid elements with customizable cassettes enabled insertion of a sequence of interest as “Transfer DNA” (T-DNA) into the plant genome of interest. Although these T-DNA transfer mechanisms are well understood, detailed understanding of the structure and epigenomic status of insertion events was limited by current technologies. To fill this gap, we analyzed transgenic Arabidopsis thaliana lines from three widely used collections (SALK, SAIL and WISC) with two single-molecule technologies, optical genome mapping and nanopore sequencing. Optical maps for four randomly selected T-DNA lines revealed between one and seven insertions/rearrangements with unexpectedly large sizes ranging from 27 to 236 kilobases. De novo nanopore-based genome assemblies for two heterozygous lines resolved T-DNA structures up to 36 kb and revealed large-scale T-DNA-associated translocations and exchange of chromosome arm ends. The multiple internally rearranged nature of T-DNA arrays, consisting of identical T-DNA/backbone concatemers, made full assembly impossible even for long nanopore reads. For the current TAIR10 reference genome, nanopore contigs corrected 83% of non-centromeric misassemblies. This unprecedented nucleotide-level definition of T-DNA insertions enabled the mapping of epigenome data. The SALK_059379 T-DNA insertions were enriched for 24 nt siRNAs and contained dense cytosine DNA methylation. Transgene silencing via the RNA-directed DNA methylation pathway was confirmed by in planta assays. In contrast, SAIL_232 T-DNA sequence was predominantly targeted by 21/22 nt siRNAs, and DNA methylation and silencing were limited to the GUS gene and did not extend to the resistance gene.
With the emergence of genome editing technologies that rely on Agrobacterium for gene delivery, this study provides new insights into the structural impact of engineering plant genomes and demonstrates the utility of state-of-the-art long-range sequencing technologies to rapidly identify unanticipated genomic changes.
Overall design
Two biological samples, each profiled by RNA-seq, small RNA-seq and bisulfite sequencing.