Landscape of insertion polymorphisms in the human genome

Genome Biol Evol. 2015 Mar 4;7(4):960-8. doi: 10.1093/gbe/evv043.

Abstract

Nucleotide substitutions, small (<50 bp) insertions or deletions (indels), and large (>50 bp) deletions are well-known causes of genetic variation within the human genome. We recently reported a previously unrecognized form of polymorphic insertion, termed templated sequence insertion polymorphism (TSIP), in which the inserted sequence was templated from a distant genomic region and inserted into the genome through reverse transcription of an RNA intermediate. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions: class 1 TSIPs show target site duplication, polyadenylation, and a preference for insertion at a 5'-TTTT/A-3' sequence, suggesting a LINE-1-based insertion mechanism, whereas class 2 TSIPs show features consistent with repair of a DNA double-strand break by nonhomologous end joining. To gain a more complete picture of TSIPs throughout the human population, we evaluated whole-genome sequences from 52 individuals and identified 171 TSIPs. Most individuals had 25-30 TSIPs. Common TSIPs (present in >20% of individuals) were found in individuals throughout the world, whereas rare TSIPs tended to cluster in specific geographic regions. The number of rare TSIPs was greater than the number of common TSIPs, suggesting that TSIP generation is an ongoing process. Intriguingly, mitochondrial sequences were a frequent template for class 2 insertions, used more commonly than sequences from any nuclear chromosome. We suspect that, like single nucleotide polymorphisms and indels, these TSIPs may be important for the generation of human diversity and genetic diseases, and can be useful in tracking the historical migration of populations.
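The two groupings described in the abstract (junction class and common versus rare) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the record layout, field names, and function names are hypothetical, and treating every insertion that lacks both target site duplication and polyadenylation as class 2 is a simplifying assumption. Only the cohort size of 52 and the >20% threshold come from the abstract.

```python
from dataclasses import dataclass

COHORT_SIZE = 52         # individuals evaluated, per the abstract
COMMON_THRESHOLD = 0.20  # "common" = present in >20% of individuals


@dataclass
class Tsip:
    # Hypothetical record layout; field names are illustrative only.
    has_target_site_duplication: bool
    has_poly_a: bool
    carriers: int  # number of individuals carrying the insertion


def junction_class(t: Tsip) -> int:
    """Return 1 for LINE-1-like junction features, otherwise 2.

    Assumption: anything lacking both hallmarks is treated as class 2;
    real classification would need to inspect the junction sequences.
    """
    if t.has_target_site_duplication and t.has_poly_a:
        return 1
    return 2


def frequency_label(t: Tsip, cohort_size: int = COHORT_SIZE) -> str:
    """Label a TSIP 'common' if carried by >20% of the cohort, else 'rare'."""
    fraction = t.carriers / cohort_size
    return "common" if fraction > COMMON_THRESHOLD else "rare"


if __name__ == "__main__":
    example = Tsip(has_target_site_duplication=True, has_poly_a=True, carriers=30)
    print(junction_class(example), frequency_label(example))
```

With a cohort of 52, the threshold falls between 10 carriers (about 19%, rare) and 11 carriers (about 21%, common).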

Keywords: DNA repair; LINE-1 retrotransposon; human migration; mitochondria; polymorphism; templated sequence insertion polymorphisms (TSIPs).

Publication types

  • Letter
  • Research Support, N.I.H., Intramural

MeSH terms

  • DNA, Mitochondrial / chemistry
  • Genome, Human*
  • Humans
  • Long Interspersed Nucleotide Elements
  • Polymorphism, Genetic*

Substances

  • DNA, Mitochondrial