Series GSE11239
Status Public on Jun 20, 2008
Title Variation in homeodomain DNA-binding revealed by high-resolution analysis of sequence preferences
Platform organism synthetic construct
Sample organism Mus musculus
Experiment type Other
Summary Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity, and showing for the first time that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
Keywords: Mouse homeodomain protein binding microarrays
 
Overall design 178 protein binding microarray (PBM) experiments on mouse homeodomains were performed, with 10 proteins done in replicate. Briefly, GST-tagged mouse homeodomains were bound to custom-designed, double-stranded 44K Agilent microarrays to determine their sequence preferences. The method is described in Berger et al., Nature Biotechnology 2006. A key feature is that the microarrays are composed of de Bruijn sequences that contain each 10-base sequence once and only once, providing an evenly balanced sequence distribution. Individual de Bruijn sequences have different properties, including their representation of gapped patterns. The array sequences and the primary array data are available under a EULA at http://the_brain.bwh.harvard.edu/pbms/webworks2/. Here we provide the data transformed into median intensities (after normalization and detrending of the original array data) for all 32,896 distinct 8-base sequences (a sequence and its reverse complement are counted once), Z-scores for these intensities, and E-scores. E-scores are a modified version of the AUC and describe how well each 8-mer ranks the intensities of the spots. In general, the E-scores are slightly more reproducible than the Z-scores but contain less information about relative binding affinity. Additional experimental details are found in Berger et al., Nature Biotechnology 2006, Berger et al., Cell 2008, and the accompanying Supplementary information.
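The figure of 32,896 8-base sequences can be checked by enumeration. The sketch below is not part of the submitters' pipeline; it assumes that a sequence and its reverse complement are treated as the same 8-mer on a double-stranded array, and the function names (reverse_complement, canonical_kmers) are illustrative only.

```python
from itertools import product

# Translation table mapping each base to its complement.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k: int) -> set[str]:
    """Enumerate all k-mers, keeping one representative per
    sequence/reverse-complement pair (the lexicographically smaller one)."""
    kmers = set()
    for bases in product("ACGT", repeat=k):
        kmer = "".join(bases)
        kmers.add(min(kmer, reverse_complement(kmer)))
    return kmers


if __name__ == "__main__":
    # 4**8 = 65,536 sequences in total; 4**4 = 256 of them are their own
    # reverse complement, so (65,536 + 256) / 2 = 32,896 remain.
    print(len(canonical_kmers(8)))  # 32896
```

The count follows from the 256 palindromic 8-mers mapping to themselves while every other 8-mer pairs with a distinct reverse complement.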
Web link http://hugheslab.ccbr.utoronto.ca/supplementary-data/homeodomains1/
 
Contributor(s) Berger MF, Badis G, Gehrke AR, Talukder S, Philippakis AA, Pena-Castillo L, Alleyne TM, Mnaimneh S, Jaeger S, Chan ET, Botvinnik OB, Khalid F, Zhang W, Newburger D, Morris QD, Bulyk ML, Hughes TR
Citation(s) 18585359
Submission date Apr 22, 2008
Last update date Mar 19, 2012
Contact name Lourdes Pena-Castillo
E-mail(s) lourdes.pena@gmail.com
Phone 416 946-7838
Fax 416 978-8528
Organization name University of Toronto
Department Banting and Best Department of Medical Research
Lab Hughes Lab
Street address 160 College St. Room 1350
City Toronto
State/province Ontario
ZIP/Postal code M5S 3E1
Country Canada
 
Platforms (1)
GPL6796 UT/TH_all-8mer-v1
Samples (178)
GSM285364 Alx3_3418.2
GSM285365 Alx4_1744.1
GSM285366 Arx_1738.2
Relations
BioProject PRJNA106743

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary data files not provided
Processed data included within Sample table
