Status |
Public on Oct 24, 2008 |
Title |
Yeast_Input_Reference_Set_ChIP-Seq |
Sample type |
SRA |
|
|
Source name |
Yeast_Input_Reference_Set_ChIP-Seq:
|
Organism |
Saccharomyces cerevisiae |
Characteristics |
Yeast strain CMY288-1B (Christopher M. Yellman, unpublished), mid-log exponential growth. Strain: CMY288-1B. Parent strain: BY4741. Genotype: MATα his3delta1 leu2delta0 lys2delta0 ura3delta0
|
Growth protocol |
CMY288-1B was grown in YPAD rich media to exponential mid-log phase (OD600=1.0, 500 mL culture).
|
Extracted molecule |
genomic DNA |
Extraction protocol |
Chromatin immunoprecipitations were performed as described in Aparicio et al. 2004. All ChIP experiments were completed as biological triplicates for Cse4 and Ste12 and as biological quadruplicates for PolII. After formaldehyde crosslinking, cells were lysed using a FastPrep24 (MP Biomedicals) and chromatin was sonicated using a Branson Digital Sonifier 450. For ChIP samples, clarified sonicated lysates were immunoprecipitated in lysis/IP buffer containing protease inhibitors and PMSF. For input DNA, clarified sonicated lysates from CMY288-1B were left without antibody but were otherwise processed as a ChIP sample. For the immunoprecipitations of Ste12 and Cse4, anti-Myc EZview affinity gel (Sigma) and anti-HA EZview affinity gel (Sigma) were added to clarified lysates from Ste12 and Cse4 epitope-tagged strains respectively. For the immunoprecipitation of native RNA polymerase II from strain CMY288-1B, mouse ascites containing RNA polymerase II 8WG16 mouse monoclonal antibody (Covance, Cat. #MMS-126R) was added overnight and a pre-washed Protein G agarose slurry was used to precipitate antibody-PolII-DNA complexes.
For non-barcoded samples, Illumina genomic DNA libraries were prepared according to the manufacturer's instructions with a few modifications based on experience gained from ChIP-Sequencing described in Robertson et al. 2007. After end-repair and addition of a single adenosine nucleotide, non-barcoded Illumina genomic DNA adapters were ligated to the ChIP or input DNA sample. The DNA was then PCR-amplified with Illumina genomic DNA primers 1.1 and 2.1, and library fragments between 150 and 350 bp were gel-extracted. The purified library was captured on one Illumina flowcell lane for cluster generation. Libraries were sequenced on the Illumina Genome Analyzer or on the Illumina Genome Analyzer II following the manufacturer's protocols.
For multiplex ChIP-Seq, the library generation protocol was modified as follows; other steps remained unchanged. We first generated standard Illumina genomic DNA adapter sequences augmented with one of four barcodes (ACGT, CATT, GTAT and TGCT). The first three bases uniquely tag or “barcode” a given sample preparation and are separated by a pairwise Hamming distance of three, which prevents one barcode from being miscalled as another when allowing for one- or two-base sequencing errors. These barcodes also display a balanced base composition. The final ‘T’ at the fourth base anneals with the ‘A’ overhang from the end-repaired DNA sample for ligation of ChIP or other DNA fragments. Illumina PCR primers for genomic DNA are used without any changes after adapter ligation to generate the barcoded sequencing libraries. Four libraries are pooled together in equimolar ratios and sequenced simultaneously on the Illumina Genome Analyzer or Illumina Genome Analyzer II in a single Illumina flowcell lane.
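The barcode properties stated above (pairwise Hamming distance of three over the first three bases, balanced base composition) can be checked with a short script. This is only an illustrative sketch in Python, not part of the submitters' pipeline; the function and variable names are hypothetical.

```python
from itertools import combinations

# Barcodes as listed in the protocol; the fourth base 'T' is the ligation overhang.
BARCODES = ["ACGT", "CATT", "GTAT", "TGCT"]


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


# Only the first three bases act as the sample tag.
tags = [bc[:3] for bc in BARCODES]

# Distance between every pair of tags (6 pairs for 4 barcodes).
pairwise = {(a, b): hamming(a, b) for a, b in combinations(tags, 2)}
min_distance = min(pairwise.values())

# Balanced composition: each tag position uses A, C, G and T exactly once.
columns = [sorted(tag[i] for tag in tags) for i in range(3)]
balanced = all(col == ["A", "C", "G", "T"] for col in columns)

if __name__ == "__main__":
    for pair, dist in pairwise.items():
        print(pair, dist)
    print("minimum pairwise distance:", min_distance)
    print("balanced composition:", balanced)
```

Because every pair of tags differs at all three positions, a read carrying one or two miscalled tag bases cannot match a different valid tag exactly.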
|
|
|
Library strategy |
ChIP-Seq |
Library source |
genomic |
Library selection |
ChIP |
Instrument model |
Illumina Genome Analyzer II |
|
|
Description |
Input reference data set for scoring ChIP samples. For scoring ChIP DNA relative to input DNA, two full lanes of non-barcoded input DNA (lanes I and J) and two barcoded input DNA data sets (ACGT from barcoded lane B and CATT from barcoded lane K, same sequencing mix as barcoded lane A) were combined into a reference set totaling 13,198,172 reads.
FASTQ raw data files: Sequence_Input_NB_laneI_FC301WV_20080419_s_1.txt, Sequence_Input_NB_laneJ_FC201WVA_20080307_s_1.txt, Sequence_barcoded_laneB_FC302RW_20080603_s_3.txt, Sequence_barcoded_laneK_FC302RW_20080603_s_2.txt
|
Data processing |
Raw data from the Illumina Genome Analyzer and Illumina Genome Analyzer II were analyzed with Illumina’s Firecrest, Bustard and GERALD modules for image analysis, basecalling and run metrics respectively, and a PhiX174 control lane was used for matrix and phasing estimations, as per the manufacturer’s instructions. At this stage, a Perl script was used to partition the reads by barcode and remove the barcodes, i.e. the first four bases of each read. For barcoded samples, the next 26 bases of each read were aligned against the Saccharomyces cerevisiae S288c reference genome using Illumina’s ELAND program in standalone mode. For non-barcoded samples, bases 5 to 30 were aligned against the S288c genome for consistency. For each barcode from each flowcell lane of barcoded libraries, the numbers of total and mapped reads were determined. Reads lacking a fully intact barcode were placed in a fifth bin called unclassified and were not used in individual barcode mapping analysis, although they were counted in the global lane mapping analysis. Uniquely mapped reads from the same ChIP factor, with a similar barcoding scheme and from the same biological replicate were pooled to produce the different data sets.
A full lane of non-barcoded input DNA comprising 2,455,181 mapped reads was used as a control when scoring barcoded input DNA (Input_Experiment). For scoring ChIP DNA relative to input DNA, two full lanes of input DNA and two barcoded input DNA data sets were combined into a reference set totaling 13,198,172 reads (Input_Reference_Set). Signal files were created using a 200 bp sliding window and scoring was performed with the PeakSeek program (J Rozowsky, G Euskirchen, Z Zhang, et al., Peak-Seek: Scoring ChIP-Seq Experiments Relative to Controls. Nature Biotechnology. Submitted.) using a mappability fraction of 1.0.
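The barcode partitioning and trimming step described above can be sketched roughly as follows. The original step used a Perl script that is not reproduced in this record; this Python version is an assumption-laden illustration in which the function name, the bin label and the choice to keep unclassified reads untrimmed are all hypothetical.

```python
from collections import defaultdict

BARCODES = ("ACGT", "CATT", "GTAT", "TGCT")  # from the extraction protocol
BARCODE_LEN = 4    # first four bases of each read
ALIGN_LEN = 26     # bases passed on to the aligner
UNCLASSIFIED = "unclassified"  # hypothetical label for the fifth bin


def partition_reads(reads):
    """Split reads into one bin per barcode plus an unclassified bin.

    Classified reads are stored with the barcode removed and trimmed to
    ALIGN_LEN bases (read positions 5 to 30, 1-based). Reads whose first
    four bases do not exactly match a known barcode are kept whole in the
    unclassified bin (an assumption made for this sketch).
    """
    bins = defaultdict(list)
    for read in reads:
        barcode = read[:BARCODE_LEN]
        if barcode in BARCODES:
            bins[barcode].append(read[BARCODE_LEN:BARCODE_LEN + ALIGN_LEN])
        else:
            bins[UNCLASSIFIED].append(read)
    return bins


if __name__ == "__main__":
    example = [
        "ACGT" + "A" * 30,   # valid barcode, extra bases trimmed away
        "CATT" + "C" * 26,   # valid barcode
        "ACGA" + "G" * 26,   # overhang base miscalled, so unclassified
        "NNNT" + "T" * 26,   # uncalled tag bases, so unclassified
    ]
    for name, seqs in partition_reads(example).items():
        print(name, len(seqs))
```

The same slice, `read[4:30]`, corresponds to bases 5 to 30 of a non-barcoded read, which is why both sample types end up with 26-base sequences for alignment.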
Significant “ChIP hits” were determined relative to the corresponding input DNA by PeakSeek and further filtered by requiring a hit length of at least 100 bp, a p-value < 0.05, a ratio of at least 2.0 between ChIP DNA and input DNA read counts, and a difference of at least 10 between the ChIP DNA and input DNA read counts.
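The four post-scoring filters listed above can be expressed as a simple predicate. This is a minimal sketch, assuming half-open coordinates and plain read counts per hit; the record does not say whether counts were normalized before comparison, and the `Hit` type and its field names are hypothetical rather than PeakSeek's actual output format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """Hypothetical container for one scored region."""
    chrom: str
    start: int        # 0-based, inclusive (assumed)
    end: int          # exclusive (assumed)
    chip_reads: int
    input_reads: int
    p_value: float


def passes_filters(hit, min_length=100, max_p=0.05, min_ratio=2.0, min_diff=10):
    """Apply the thresholds quoted in the data processing description."""
    if hit.end - hit.start < min_length:
        return False
    if not hit.p_value < max_p:
        return False
    if hit.chip_reads - hit.input_reads < min_diff:
        return False
    # Ratio test written as a product to avoid dividing by zero input reads.
    return hit.chip_reads >= min_ratio * hit.input_reads


if __name__ == "__main__":
    hits = [
        Hit("chrI", 100, 300, 40, 10, 0.001),   # meets all four thresholds
        Hit("chrI", 500, 550, 40, 10, 0.001),   # too short
        Hit("chrII", 100, 300, 15, 10, 0.001),  # ratio and difference too low
    ]
    for hit in hits:
        print(hit.chrom, hit.start, hit.end, passes_filters(hit))
```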
|
|
|
Submission date |
Oct 23, 2008 |
Last update date |
Jun 11, 2013 |
Contact name |
Ghia Euskirchen |
E-mail(s) |
ghia.euskirchen@stanford.edu
|
Organization name |
Stanford University
|
Department |
Genetics
|
Lab |
Snyder
|
Street address |
1501 S. California Ave.
|
City |
Palo Alto |
State/province |
CA |
ZIP/Postal code |
94304 |
Country |
USA |
|
|
Platform ID |
GPL9377 |
Series (1) |
GSE13322 |
Efficient Yeast ChIP-Seq using Multiplex Short-Read DNA Sequencing |
|
Relations |
BioSample |
SAMN02195655 |
Supplementary file | Size | Download | File type/resource |
GSM336336_Input_referenceset_eland_results_laneI_laneJ_laneB_ACGT_laneK_CATT.txt | 1.1 Gb | (ftp)(http) | TXT |
Raw data are available on Series record |
Processed data provided as supplementary file |