Series GSE12349
Status Public on Dec 26, 2008
Title A library of yeast transcription factor motifs
Platform organisms Saccharomyces cerevisiae; synthetic construct
Sample organism Saccharomyces cerevisiae
Experiment type Genome binding/occupancy profiling by genome tiling array
Summary The sequence specificity of DNA-binding proteins is the primary mechanism by which the cell recognizes genomic features. Here, we describe systematic determination of yeast transcription factor DNA-binding specificities. We obtained binding specificities for 112 DNA-binding proteins representing 19 distinct structural classes. One-third of the binding specificities have not been previously reported. Several binding sequences have striking genomic distributions relative to transcription start sites, supporting their biological relevance and suggesting a role in promoter architecture. Among these are Rsc3 binding sequences, containing the core CGCG, which are found preferentially ~100 bp upstream of transcription start sites. Mutation of RSC3 results in a dramatic increase in nucleosome occupancy in hundreds of proximal promoters containing a Rsc3 binding element, but has little impact on promoters lacking Rsc3 binding sequences, indicating that Rsc3 plays a broad role in targeting nucleosome exclusion at yeast promoters.

Keywords: Protein binding microarrays, DNA, proteins
Overall design Protein binding microarray (PBM), ChIP-chip and DIP-chip experiments of yeast transcription factor DNA-binding domains were performed. Briefly, the PBMs involved binding GST-tagged DNA-binding proteins to custom-designed, double-stranded 44K Agilent microarrays in order to determine their sequence preferences. The method is described in Berger et al., Nature Biotechnology 2006. A key feature is that the microarrays are composed of de Bruijn sequences that contain each 10-base sequence once and only once, providing an evenly balanced sequence distribution. Individual de Bruijn sequences have different properties, including representation of gapped patterns. Here we provide the data transformed into median intensities for all 32,896 8-base sequences, Z-scores for these intensities, and E-scores. E-scores are a modified version of AUC, and describe how well each 8-mer ranks the intensities of the spots. In general the E-scores are slightly more reproducible than Z-scores, but contain less information about relative binding affinity. Additional experimental details are found in Berger et al., Nature Biotechnology 2006, Berger et al., Cell 2008, and the accompanying Supplementary information. Raw 35-mer array data is available on the web link provided.
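The two counting facts in the design description can be checked with a short sketch. It assumes that the 32,896 figure comes from treating each 8-mer and its reverse complement as one sequence, which fits double-stranded probes but is not stated in the record. The function names (`canonical_kmers`, `de_bruijn`) are illustrative, and the de Bruijn construction is the textbook Lyndon-word algorithm, not necessarily the one used to design the arrays.

```python
from itertools import product

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def canonical_kmers(k: int) -> list[str]:
    """Return one representative per {k-mer, reverse complement} pair.

    Assumption: on double-stranded DNA a k-mer and its reverse complement
    describe the same site, so only the lexicographically smaller is kept.
    """
    seen = set()
    for letters in product("ACGT", repeat=k):
        kmer = "".join(letters)
        seen.add(min(kmer, reverse_complement(kmer)))
    return sorted(seen)


def de_bruijn(alphabet: str, k: int) -> str:
    """Build a cyclic de Bruijn sequence of order k over the alphabet.

    Uses the standard recursive Lyndon-word construction. Read cyclically,
    the result contains every length-k word exactly once.
    """
    n = len(alphabet)
    a = [0] * (n * k)
    out = []

    def db(t: int, p: int) -> None:
        if t > k:
            if k % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, n):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return "".join(alphabet[i] for i in out)


if __name__ == "__main__":
    # (4**8 + 4**4) / 2: all 8-mers plus the self-complementary ones, halved.
    print(len(canonical_kmers(8)))

    # Small order for illustration; the arrays described above use order 10.
    seq = de_bruijn("ACGT", 3)
    linear = seq + seq[:2]  # unroll the cycle so every window is readable
    windows = [linear[i:i + 3] for i in range(len(seq))]
    print(len(seq), len(set(windows)))
```

The order-3 sequence is used only to keep the example small; the same function handles higher orders, at the cost of deeper recursion and longer output.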

For the DIP-chip experiments [GSM345371, GSM345403, GSM345414-GSM345421, GSM345429-GSM345432], genomic DNA isolated from S288C yeast was incubated for 30 minutes with 40 nM of the MBP-tagged DNA-binding domain (DBD) of Cbf1, Pho2, Pho4, Leu3, Rap1, or Swi5 prior to purification of protein-DNA complexes. The bound DNA was then isolated, amplified via Invitrogen's WGA protocol, and hybridized against input DNA on NimbleGen 385k 32bp-tiling whole genome arrays. ChIPOTle was used to identify binding peaks in the data; motifs were identified with BioProspector and MDScan and then scored by GOMER for their ability to predict those peaks. Motifs with the best ROC AUC are reported in the paper.
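As an illustration of ranking motifs by ROC AUC, the sketch below computes a plain rank-based AUC (the Mann-Whitney formulation) from per-region motif scores and peak labels. It is not the GOMER scoring procedure and does not reproduce the modified AUC used for E-scores. The function name `roc_auc` and the toy inputs are hypothetical.

```python
def roc_auc(scores, labels):
    """Rank-based ROC AUC.

    scores: one motif score per genomic region (higher = stronger match).
    labels: truthy if the region overlaps an identified peak, else falsy.
    Returns the probability that a randomly chosen peak region outscores a
    randomly chosen non-peak region, with ties counted as one half.
    """
    pairs = sorted(zip(scores, labels), key=lambda p: p[0])
    ranks = [0.0] * len(pairs)

    # Assign 1-based ranks, averaging over runs of tied scores.
    i = 0
    while i < len(pairs):
        j = i
        while j + 1 < len(pairs) and pairs[j + 1][0] == pairs[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for t in range(i, j + 1):
            ranks[t] = average_rank
        i = j + 1

    n_pos = sum(1 for _, label in pairs if label)
    n_neg = len(pairs) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one peak and one non-peak region")

    rank_sum_pos = sum(r for r, (_, label) in zip(ranks, pairs) if label)
    return (rank_sum_pos - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


if __name__ == "__main__":
    # Toy data: both peak regions score above both non-peak regions.
    print(roc_auc([0.9, 0.8, 0.3, 0.1], [1, 1, 0, 0]))
```

Under this formulation a motif that separates peaks from non-peaks perfectly scores 1.0 and an uninformative motif scores about 0.5, so candidate motifs can be compared on a common scale.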

For the ChIP-chip experiments [GSM346493 and GSM346494], isogenic wildtype and rsc3-1 strains carrying Rsc8-TAP were grown in parallel under rsc3-1 restrictive growth conditions (37°C). Following formaldehyde crosslinking, cells were homogenized and extracts were sonicated to shear the chromatin to an average size of ~500 bp. A single pulldown was then performed with IgG Sepharose beads. After decrosslinking and LM-PCR amplification of the purified IP DNA, samples were labeled and hybridized on NimbleGen 32bp whole genome tiling arrays, comparing the pulled-down DNA to input genomic DNA.
Web link
Contributor(s) Badis G, Chan ET, van Bakel H, Pena-Castillo L, Tillo D, Tsui K, Carlson CD, Gossett AJ, Hasinoff MJ, Warren CL, Gebbia M, Talukder S, Yang A, Mnaimneh S, Terterov D, Coburn D, Yeo AL, Yeo ZX, Clarke ND, Lieb JD, Ansari AZ, Nislow C, Hughes TR
Citation(s) 19111667
Submission date Aug 05, 2008
Last update date Mar 20, 2012
Contact name Lourdes Pena-Castillo
Phone 416 946-7838
Fax 416 978-8528
Organization name University of Toronto
Department Banting and Best Department of Medical Research
Lab Hughes Lab
Street address 160 College St. Room 1350
City Toronto
State/province Ontario
ZIP/Postal code M5S 3E1
Country Canada
Platforms (2)
GPL6796 UT/TH_all-8mer-v1
GPL7699 NimbleGen 385k S. cerevisiae 32bp tiling array
Samples (170)
GSM310122 ABF1_4505.2_ArrayA
GSM310123 ABF2_2116.1_ArrayA.1
GSM310124 ABF2_2116.1_ArrayB
BioProject PRJNA113175

Download family                      Format
SOFT formatted family file(s)        SOFT
MINiML formatted family file(s)      MINiML
Series Matrix File(s)                TXT

Supplementary file                           Size     Download        File type/resource
GSE12349_RAW.tar                             72.6 Mb  (http)(custom)  TAR (of GFF, PAIR, TIFF)
GSE12349_README_mapping_probe_sequences.txt  2.8 Kb   (ftp)(http)     TXT
Processed data included within Sample table
