Sample GSM336336
Status Public on Oct 24, 2008
Title Yeast_Input_Reference_Set_ChIP-Seq
Sample type SRA
 
Source name Yeast_Input_Reference_Set_ChIP-Seq:
Organism Saccharomyces cerevisiae
Characteristics Yeast strain CMY288-1B, mid-log exponential growth, Christopher M. Yellman, unpublished
strain: CMY288-1B (MATα his3delta1 leu2delta0 lys2delta0 ura3delta0) Parent strain: BY4741
strain: CMY288-1B Genotype: MATα his3delta1 leu2delta0 lys2delta0 ura3delta0
Growth protocol CMY288-1B was grown in YPAD rich media to exponential mid-log phase (OD600=1.0, 500 mL culture).
Extracted molecule genomic DNA
Extraction protocol Chromatin immunoprecipitations were performed as described in Aparicio et al. 2004. All ChIP experiments were completed as biological triplicates for Cse4 and Ste12 and as biological quadruplicates for PolII. After formaldehyde crosslinking, cells were lysed using a FastPrep24 (MP Biomedicals) and chromatin was sonicated using a Branson Digital Sonifier 450. For ChIP samples, clarified sonicated lysates were immunoprecipitated in lysis/IP buffer containing protease inhibitors and PMSF. For input DNA, clarified sonicated lysates from CMY288-1B were left without antibody but were otherwise processed as a ChIP sample. For the immunoprecipitations of Ste12 and Cse4, anti-Myc EZview affinity gel (Sigma) and anti-HA EZview affinity gel (Sigma) were added to clarified lysates from Ste12 and Cse4 epitope-tagged strains respectively. For the immunoprecipitation of native RNA polymerase II from strain CMY288-1B, mouse ascites containing RNA polymerase II 8WG16 mouse monoclonal antibody (Covance, Cat. #MMS-126R) was added overnight and a pre-washed Protein G agarose slurry was used to precipitate antibody-PolII-DNA complexes.
For non-barcoded samples, Illumina genomic DNA libraries were prepared according to the manufacturer's instructions, with a few modifications based on experience gained from the ChIP-Sequencing described in Robertson et al. 2007. After end-repair and addition of a single adenosine nucleotide, non-barcoded Illumina genomic DNA adapters were ligated to the ChIP or input DNA sample. The DNA was then PCR-amplified with Illumina genomic DNA primers 1.1 and 2.1, and library fragments between 150 and 350 bp were gel-extracted. The purified library was captured on one Illumina flowcell lane for cluster generation. Libraries were sequenced on the Illumina Genome Analyzer or on the Illumina Genome Analyzer II following the manufacturer's protocols.
For multiplex ChIP-Seq, the library generation protocol was modified as follows; all other steps remained unchanged. We first generated standard Illumina genomic DNA adapter sequences augmented with one of four barcodes (ACGT, CATT, GTAT and TGCT). The first three bases uniquely tag or “barcode” a given sample preparation. Any two of these three-base tags are separated by a Hamming distance of three, which prevents one barcode from being miscalled as another even when one or two bases of the barcode are sequenced incorrectly. The barcodes also display a balanced base composition. The final ‘T’ at the fourth base anneals with the ‘A’ overhang of the end-repaired DNA sample, allowing ligation of ChIP or other DNA fragments. After adapter ligation, the unmodified Illumina PCR primers for genomic DNA are used to generate the barcoded sequencing libraries. Four libraries are pooled in equimolar ratios and sequenced simultaneously on the Illumina Genome Analyzer or Illumina Genome Analyzer II in a single Illumina flowcell lane.
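The barcode properties stated above can be checked directly. The following Python sketch is illustrative only and is not part of the submitters' pipeline; the function and variable names are assumptions. It computes the pairwise Hamming distances between the three-base tags and the base composition at each tag position.

```python
from collections import Counter
from itertools import combinations

# The four adapter barcodes listed in the protocol text.
BARCODES = ["ACGT", "CATT", "GTAT", "TGCT"]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Only the first three bases identify the sample; the fourth base is the
# 'T' that pairs with the 'A' overhang during ligation.
tags = [barcode[:3] for barcode in BARCODES]

pairwise = {(a, b): hamming(a, b) for a, b in combinations(tags, 2)}
min_distance = min(pairwise.values())

# Base composition per tag position: balanced means each of A, C, G, T
# occurs exactly once at every position across the four tags.
composition = [Counter(column) for column in zip(*tags)]
balanced = all(
    len(counts) == 4 and set(counts.values()) == {1} for counts in composition
)

for pair, distance in pairwise.items():
    print(pair, distance)
print("minimum distance:", min_distance)
print("balanced composition:", balanced)
```

With a minimum distance of three, a tag carrying one or two substitution errors can never equal another valid tag, so such reads fail to match any barcode rather than being assigned to the wrong sample.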
 
Library strategy ChIP-Seq
Library source genomic
Library selection ChIP
Instrument model Illumina Genome Analyzer II
 
Description Input reference data set for scoring ChIP samples. For scoring ChIP DNA relative to input DNA, a reference set consisting of two full lanes of non-barcoded input DNA (lanes I and J) and two barcoded input DNA data sets (ACGT from barcoded lane B and CATT from barcoded lane K, same sequencing mix as barcoded lane A) were combined, adding up to 13,198,172 total reads.

fastq raw data files:
Sequence_Input_NB_laneI_FC301WV_20080419_s_1.txt
Sequence_Input_NB_laneJ_FC201WVA_20080307_s_1.txt
Sequence_barcoded_laneB_FC302RW_20080603_s_3.txt
Sequence_barcoded_laneK_FC302RW_20080603_s_2.txt
Data processing Raw data from the Illumina Genome Analyzer and Illumina Genome Analyzer II were analyzed with Illumina’s Firecrest, Bustard and GERALD modules for image analysis, basecalling and run metrics respectively, and a PhiX174 control lane was used for matrix and phasing estimations, as per the manufacturer’s instructions. At this stage, a Perl script was used to partition the reads by barcode and remove the barcodes, i.e. the first four bases of each read. For barcoded samples, the next 26 bases of each read were aligned against the Saccharomyces cerevisiae S288c reference genome using Illumina’s ELAND program in standalone mode. For non-barcoded samples, bases 5 to 30 were aligned against the S288c genome for consistency. For each barcode from each flowcell lane of barcoded libraries, the numbers of total and mapped reads were determined. Reads lacking a fully intact barcode were placed in a fifth bin called unclassified and were not used in the individual barcode mapping analysis, although they were counted in the global lane mapping analysis. Uniquely mapped reads from the same ChIP factor, with a similar barcoding scheme and from the same biological replicate were pooled to produce all the different data sets. A full lane of non-barcoded input DNA comprising 2,455,181 mapped reads was used as a control when scoring barcoded input DNA (Input_Experiment). For scoring ChIP DNA relative to input DNA, a reference set consisting of two full lanes of input DNA and two barcoded input DNA data sets were combined, adding up to 13,198,172 total reads (Input_Reference_Set). Signal files were created using a 200 bp sliding window and scoring was performed with the PeakSeek program (J Rozowsky, G Euskirchen, Z Zhang, et al., Peak-Seek: Scoring ChIP-Seq Experiments Relative to Controls. Nature Biotechnology. Submitted.) using a mappability fraction of 1.0.
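The read-partitioning step can be sketched as follows. The original Perl script is not included in this record, so this Python version is only an illustration under stated assumptions: reads are plain sequence strings, a "fully intact" barcode means an exact match of the first four bases, and the name partition_reads is hypothetical.

```python
from collections import defaultdict

# Barcodes from the library protocol; reads are binned by exact match.
BARCODES = ("ACGT", "CATT", "GTAT", "TGCT")
BARCODE_LEN = 4   # first four bases of each read
INSERT_LEN = 26   # bases 5 to 30 are passed on for alignment


def partition_reads(reads):
    """Bin reads by barcode and strip the barcode from each read.

    Reads whose first four bases do not exactly match a known barcode go
    to the 'unclassified' bin (assumption: no error-tolerant matching).
    Each bin holds the 26-base inserts that would be aligned.
    """
    bins = defaultdict(list)
    for read in reads:
        barcode = read[:BARCODE_LEN]
        insert = read[BARCODE_LEN:BARCODE_LEN + INSERT_LEN]
        key = barcode if barcode in BARCODES else "unclassified"
        bins[key].append(insert)
    return dict(bins)


# Hypothetical reads, made up for illustration only.
example_reads = [
    "ACGT" + "A" * 26 + "GG",   # valid barcode, extra bases are trimmed
    "CATT" + "C" * 26,          # valid barcode
    "ANGT" + "G" * 26,          # damaged barcode, goes to unclassified
]
for name, inserts in partition_reads(example_reads).items():
    print(name, len(inserts))
```

Because the unclassified bin is kept rather than dropped, per-barcode counts and the global lane count can both be derived from the same result.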
Significant “ChIP hits” were determined relative to the corresponding input DNA by PeakSeek and further filtered by requiring a hit length of at least 100 bp, a p-value of < 0.05, a ratio of at least 2.0 between ChIP DNA and input DNA read counts and a difference of at least 10 between the ChIP DNA and input DNA read counts.
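The four post-scoring filters can be expressed as a single predicate. This Python sketch is not PeakSeek itself and does not reproduce its output format; the Hit fields and the passes_filters name are hypothetical, and the read counts are assumed to be already comparable between ChIP and input, since the record does not describe how they are normalized.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Hypothetical container for one scored region (not a PeakSeek format)."""
    start: int          # region start, 0-based
    end: int            # region end, exclusive
    chip_reads: float   # ChIP DNA read count in the region
    input_reads: float  # input DNA read count in the region
    p_value: float


def passes_filters(hit, min_length=100, max_p=0.05, min_ratio=2.0, min_diff=10):
    """Apply the thresholds stated in the data processing text."""
    if hit.end - hit.start < min_length:
        return False
    if hit.p_value >= max_p:          # the text requires p < 0.05
        return False
    if hit.chip_reads - hit.input_reads < min_diff:
        return False
    # Compare by multiplication so an input count of zero does not divide by zero.
    return hit.chip_reads >= min_ratio * hit.input_reads


# Hypothetical hits, made up for illustration only.
hits = [
    Hit(1000, 1150, 40, 10, 0.01),   # passes every filter
    Hit(2000, 2090, 40, 10, 0.01),   # too short
    Hit(3000, 3200, 18, 9, 0.01),    # ratio is 2.0 but difference is only 9
]
print([passes_filters(h) for h in hits])
```

The difference threshold matters mainly for low-count regions, where a ratio of 2.0 can arise from only a handful of reads.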
 
Submission date Oct 23, 2008
Last update date Jun 11, 2013
Contact name Ghia Euskirchen
E-mail(s) ghia.euskirchen@stanford.edu
Organization name Stanford University
Department Genetics
Lab Snyder
Street address 1501 S. California Ave.
City Palo Alto
State/province CA
ZIP/Postal code 94304
Country USA
 
Platform ID GPL9377
Series (1)
GSE13322 Efficient Yeast ChIP-Seq using Multiplex Short-Read DNA Sequencing
Relations
BioSample SAMN02195655

Supplementary file Size Download File type/resource
GSM336336_Input_referenceset_eland_results_laneI_laneJ_laneB_ACGT_laneK_CATT.txt 1.1 Gb (ftp)(http) TXT
Raw data are available on Series record
Processed data provided as supplementary file
