Gene discovery using computational and microarray analysis of transcription in the Drosophila melanogaster testis

Genome Res. 2000 Dec;10(12):2030-43. doi: 10.1101/gr.10.12.2030.

Abstract

Identification and annotation of all the genes in the sequenced Drosophila genome is a work in progress. Wild-type testis function requires many genes and is thus of potentially high value for the identification of transcription units. We therefore undertook a survey of the repertoire of genes expressed in the Drosophila testis by computational and microarray analysis. We generated 3141 high-quality testis expressed sequence tags (ESTs). The testis ESTs were computationally collapsed into a set of 1560 cDNAs that was used for further analysis. Of those, 11% correspond to named genes, and 33% provide biological evidence for a predicted gene. A surprising 47% fail to align with existing ESTs, and 16% fail to align with predicted genes in the current genome release. EST frequency and microarray expression profiles indicate that the testis mRNA population is highly complex and shows an extended range of transcript abundance. Furthermore, >80% of the genes expressed in the testis showed onefold overexpression relative to ovaries or gonadectomized flies. Additionally, >3% showed more than threefold overexpression at p < 0.05. Surprisingly, 22% of the genes most highly overexpressed in testis match Drosophila genomic sequence but not predicted genes. These data strongly support the idea that sequencing additional cDNA libraries from defined tissues, such as testis, will be an important tool for refined annotation of the Drosophila genome. Additionally, these data suggest that the number of genes in Drosophila will significantly exceed the conservative estimate of 13,601.
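
To give a sense of scale, the percentages quoted in the abstract can be converted into approximate cDNA counts. The sketch below is illustrative only: it assumes that the 11%, 33%, 47% and 16% figures all refer to the 1560-cDNA set (the abstract says "of those" only for the first two), and the variable and function names are hypothetical. The resulting counts are rounded estimates, not numbers reported by the authors.

```python
# Figures quoted in the abstract.
TOTAL_ESTS = 3141   # high-quality testis ESTs
CDNA_SET = 1560     # cDNAs remaining after computational collapsing

# Percentages quoted in the abstract.
# ASSUMPTION: all four share the 1560-cDNA set as denominator.
PERCENTAGES = {
    "named genes": 11,
    "evidence for a predicted gene": 33,
    "no alignment with existing ESTs": 47,
    "no alignment with predicted genes": 16,
}


def approx_count(percent, total=CDNA_SET):
    """Convert a quoted percentage into a rounded, approximate count."""
    return round(percent * total / 100)


# Approximate number of cDNAs in each category.
counts = {label: approx_count(pct) for label, pct in PERCENTAGES.items()}

# Average number of ESTs that collapsed into each cDNA.
ests_per_cdna = TOTAL_ESTS / CDNA_SET

for label, n in counts.items():
    print(f"{label}: ~{n} of {CDNA_SET} cDNAs")
print(f"average ESTs per cDNA: {ests_per_cdna:.2f}")
```

Under these assumptions, roughly 172 cDNAs correspond to named genes and roughly 733 have no match among existing ESTs, with about two ESTs per cDNA on average. The categories are not mutually exclusive, so the counts are not expected to sum to 1560.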

MeSH terms

  • Animals
  • Chromosome Mapping
  • Cluster Analysis
  • Computational Biology / methods*
  • Computational Biology / statistics & numerical data
  • Databases, Factual
  • Drosophila melanogaster / genetics*
  • Expressed Sequence Tags
  • Female
  • Gene Library
  • Genes, Insect / genetics*
  • Head
  • Male
  • Molecular Sequence Data
  • Oligonucleotide Array Sequence Analysis / methods*
  • Oligonucleotide Array Sequence Analysis / statistics & numerical data
  • Ovary / metabolism
  • Sequence Alignment
  • Testis / metabolism*
  • Transcription, Genetic / genetics*

Associated data

  • GENBANK/AI944400
  • GENBANK/AI944401
  • GENBANK/AI944402
  • GENBANK/AI944403
  • GENBANK/AI944404
  • GENBANK/AI944405
  • GENBANK/AI944406
  • GENBANK/AI944407
  • GENBANK/AI944408
  • GENBANK/AI944409
  • GENBANK/AI944410
  • GENBANK/AI944411
  • GENBANK/AI944412
  • GENBANK/AI944413
  • GENBANK/AI944414
  • GENBANK/AI944415
  • GENBANK/AI944416
  • GENBANK/AI944417
  • GENBANK/AI944418
  • GENBANK/AI944419
  • GENBANK/AI944420
  • GENBANK/AI944421
  • GENBANK/AI944422
  • GENBANK/AI944423
  • GENBANK/AI944424
  • GENBANK/AI944425
  • GENBANK/AI944426
  • GENBANK/AI944427
  • GENBANK/AI944428
  • GENBANK/AI944429