Status |
Public on Jun 20, 2008 |
Title |
Variation in homeodomain DNA-binding revealed by high-resolution analysis of sequence preferences |
Platform organism |
synthetic construct |
Sample organism |
Mus musculus |
Experiment type |
Other
|
Summary |
Most homeodomains are unique within a genome, yet many are highly conserved across vast evolutionary distances, implying strong selection on their precise DNA-binding specificities. We determined the binding preferences of the majority (168) of mouse homeodomains to all possible 8-base sequences, revealing rich and complex patterns of sequence specificity, and showing for the first time that there are at least 65 distinct homeodomain DNA-binding activities. We developed a computational system that successfully predicts binding sites for homeodomain proteins as distant from mouse as Drosophila and C. elegans, and we infer full 8-mer binding profiles for the majority of known animal homeodomains. Our results provide an unprecedented level of resolution in the analysis of this simple domain structure and suggest that variation in sequence recognition may be a factor in its functional diversity and evolutionary success.
Keywords: Mouse homeodomain protein binding microarrays
|
|
|
Overall design |
A total of 178 protein binding microarray (PBM) experiments were performed on mouse homeodomains, with 10 proteins assayed in replicate. Briefly, GST-tagged mouse homeodomains were bound to custom-designed, double-stranded 44K Agilent microarrays to determine their sequence preferences. The method is described in Berger et al., Nature Biotechnology 2006. A key feature is that the microarrays are composed of de Bruijn sequences that contain each 10-base sequence exactly once, providing an evenly balanced sequence distribution. Individual de Bruijn sequences differ in their properties, including their representation of gapped patterns. The array sequences and the primary array data are available under an end-user license agreement (EULA) at http://the_brain.bwh.harvard.edu/pbms/webworks2/. Here we provide the data transformed into median intensities (after normalization and detrending of the original array data) for all 32,896 non-redundant 8-base sequences (each 8-mer counted together with its reverse complement), Z-scores for these intensities, and E-scores. E-scores are a modified version of the area under the ROC curve (AUC) and describe how well each 8-mer ranks the intensities of the spots. In general, E-scores are slightly more reproducible than Z-scores but contain less information about relative binding affinity. Additional experimental details are found in Berger et al., Nature Biotechnology 2006; Berger et al., Cell 2008; and the accompanying Supplementary information.
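The two counting facts in the design description can be checked with a short sketch. It assumes that the figure of 32,896 8-base sequences arises from treating each 8-mer and its reverse complement as one sequence, which is the natural reading for double-stranded probes. The de Bruijn generator is the textbook Lyndon-word construction, shown at order 3 for speed; it is not necessarily the construction used to design the arrays, and all function names are illustrative.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical(seq: str) -> str:
    """Pick one representative for a k-mer and its reverse complement."""
    return min(seq, reverse_complement(seq))


def count_nonredundant_kmers(k: int) -> int:
    """Count k-mers after merging each one with its reverse complement."""
    return len({canonical("".join(p)) for p in product("ACGT", repeat=k)})


def de_bruijn(alphabet: str, n: int) -> str:
    """Cyclic de Bruijn sequence of order n (textbook Lyndon-word method).

    Read cyclically, the result contains every length-n word exactly once.
    """
    k = len(alphabet)
    a = [0] * (k * n)
    out = []

    def db(t: int, p: int) -> None:
        if t > n:
            if n % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


if __name__ == "__main__":
    # (4**8 + 4**4) / 2: the 256 palindromic 8-mers are their own partner.
    print(count_nonredundant_kmers(8))  # 32896

    # Small-scale check of the "once and only once" property (order 3).
    order = 3
    seq = de_bruijn("ACGT", order)
    cyclic = seq + seq[:order - 1]
    words = [cyclic[i:i + order] for i in range(len(seq))]
    print(len(seq), len(set(words)))  # 64 64
```

The same check at order 10 would yield a sequence of 4^10 = 1,048,576 bases, which on a real array has to be split across many short probes; that splitting is outside the scope of this sketch.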
|
Web link |
http://hugheslab.ccbr.utoronto.ca/supplementary-data/homeodomains1/
|
|
|
Contributor(s) |
Berger MF, Badis G, Gehrke AR, Talukder S, Philippakis AA, Pena-Castillo L, Alleyne TM, Mnaimneh S, Jaeger S, Chan ET, Botvinnik OB, Khalid F, Zhang W, Newburger D, Morris QD, Bulyk ML, Hughes TR |
Citation(s) |
PMID 18585359 |
|
Submission date |
Apr 22, 2008 |
Last update date |
Mar 19, 2012 |
Contact name |
Lourdes Pena-Castillo |
E-mail(s) |
lourdes.pena@gmail.com
|
Phone |
416 946-7838
|
Fax |
416 978-8528
|
Organization name |
University of Toronto
|
Department |
Banting and Best Department of Medical Research
|
Lab |
Hughes Lab
|
Street address |
160 College St. Room 1350
|
City |
Toronto |
State/province |
Ontario |
ZIP/Postal code |
M5S 3E1 |
Country |
Canada |
|
|
Platforms (1) |
|
Samples (178)
|
|
Relations |
BioProject |
PRJNA106743 |
Supplementary data files not provided |
Processed data included within Sample table |
|