Closing gaps in the human genome with fosmid resources generated from multiple individuals

Nat Genet. 2008 Jan;40(1):96-101. doi: 10.1038/ng.2007.34. Epub 2007 Dec 23.

Abstract

The human genome sequence has been finished to very high standards; however, more than 340 gaps remained when the finished genome was published by the International Human Genome Sequencing Consortium in 2004. Using fosmid resources generated from multiple individuals, we targeted gaps in the euchromatic part of the human genome. Here we report 2,488,842 bp of previously unknown euchromatic sequence, 363,114 bp of which close 26 of 250 euchromatic gaps, or 10%, including two remaining euchromatic gaps on chromosome 19. Eight (30.7%) of the closed gaps were found to be polymorphic. These sequences allow complete annotation of several human genes as well as the assignment of mRNAs. The gap sequences are 2.3-fold enriched in segmentally duplicated sequences compared to the whole genome. Our analysis confirms that not all gaps within 'finished' genomes are recalcitrant to subcloning and suggests that the paired-end-sequenced fosmid libraries could prove to be a rich resource for completion of the human euchromatic genome.
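The proportions quoted in the abstract can be rechecked from its own counts. The following is a minimal sketch, not part of the published analysis: the variable names and the `percent` helper are illustrative assumptions, and only the figures stated in the abstract are used as inputs.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts taken directly from the abstract (names are hypothetical).
new_sequence_bp = 2_488_842   # previously unknown euchromatic sequence
gap_closing_bp = 363_114      # portion of that sequence that closes gaps
gaps_closed = 26
euchromatic_gaps = 250
polymorphic_closed = 8

closed_pct = percent(gaps_closed, euchromatic_gaps)
polymorphic_pct = percent(polymorphic_closed, gaps_closed)
gap_closing_share = percent(gap_closing_bp, new_sequence_bp)

print(f"Euchromatic gaps closed: {closed_pct:.1f}%")
print(f"Closed gaps that are polymorphic: {polymorphic_pct:.2f}%")
print(f"New sequence that closes gaps: {gap_closing_share:.1f}%")
```

Under these inputs, 26 of 250 gaps works out to 10.4%, which the abstract reports as 10%, and 8 of 26 works out to about 30.77%, which matches the reported 30.7% if the value is truncated rather than rounded.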

Publication types

  • Research Support, N.I.H., Extramural
  • Research Support, Non-U.S. Gov't

MeSH terms

  • Base Sequence
  • Chromosomes, Human, Pair 19*
  • Cloning, Molecular
  • Euchromatin
  • Gene Library
  • Genetic Vectors
  • Genome, Human*
  • Human Genome Project
  • Humans
  • Molecular Sequence Data
  • Polymorphism, Genetic

Substances

  • Euchromatin

Associated data

  • GENBANK/AC154091
  • GENBANK/AC154114
  • GENBANK/AC155072
  • GENBANK/AC156158
  • GENBANK/AC156159
  • GENBANK/AC156789
  • GENBANK/AC157210
  • GENBANK/AC157321
  • GENBANK/AC158211
  • GENBANK/AC160849
  • GENBANK/AC160851
  • GENBANK/AC160854
  • GENBANK/AC160855
  • GENBANK/AC160856
  • GENBANK/AC160857
  • GENBANK/AC160858
  • GENBANK/AC160860
  • GENBANK/AC160862
  • GENBANK/AC161035
  • GENBANK/AC161429
  • GENBANK/AC174049
  • GENBANK/AC174074
  • GENBANK/AC174156
  • GENBANK/AC174157
  • GENBANK/AC174438
  • GENBANK/AC174439
  • GENBANK/AC174441
  • GENBANK/AC186562
  • GENBANK/AC188045
  • GENBANK/AC188046
  • GENBANK/AC195454
  • GENBANK/AC195455
  • GENBANK/AC196364
  • GENBANK/AC196636