 |
 |
|
Status |
Public on May 24, 2012 |
Title |
Chromatin State Segmentation by HMM from ENCODE/Broad |
Project |
ENCODE
|
Organism |
Homo sapiens |
Experiment type |
Genome binding/occupancy profiling by high throughput sequencing
|
Summary |
This data was generated by ENCODE. If you have questions about the data, contact the submitting laboratory directly (mailto:jernst@mit.edu). If you have questions about the Genome Browser track associated with this data, contact ENCODE (mailto:genome@soe.ucsc.edu).
This track displays a chromatin state segmentation for each of nine human cell types (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=GM12878,H1-hESC,HepG2,HUVEC,HMEC,HSMM,K562,NHEK,NHLF). A common set of states across the cell types was learned by computationally integrating ChIP-seq data for nine factors plus input (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=CTCF,H3K4me1,H3K4me2,H3K4me3,H3K27ac,H3K9ac,H3K36me3,H4K20me1,H3K27me3,Input) using a Hidden Markov Model (HMM). In total, fifteen states were used to segment the genome, and these states were then grouped and colored to highlight predicted functional elements.
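The segmentation idea summarized above can be illustrated with a toy example. The sketch below assumes that, given a state, each mark is present or absent independently (a Bernoulli emission per mark), and that each bin is assigned the state with the highest posterior probability from a forward-backward pass. The two states, two marks, and every probability here are hypothetical values chosen for illustration; they are not the parameters of the fifteen-state model.

```python
from typing import List, Sequence


def emission_prob(obs: Sequence[int], p_state: Sequence[float]) -> float:
    """P(obs | state) as a product of independent Bernoulli terms, one per mark."""
    prob = 1.0
    for present, p in zip(obs, p_state):
        prob *= p if present else (1.0 - p)
    return prob


def posterior_decode(
    observations: Sequence[Sequence[int]],
    init: Sequence[float],
    trans: Sequence[Sequence[float]],
    emit: Sequence[Sequence[float]],
) -> List[int]:
    """Return, for each bin, the state with the highest posterior probability."""
    n_states = len(init)
    n = len(observations)
    b = [[emission_prob(o, emit[k]) for k in range(n_states)] for o in observations]

    # Forward pass, rescaled at every position to avoid numerical underflow.
    fwd: List[List[float]] = []
    for t in range(n):
        if t == 0:
            row = [init[k] * b[0][k] for k in range(n_states)]
        else:
            prev = fwd[-1]
            row = [
                b[t][k] * sum(prev[j] * trans[j][k] for j in range(n_states))
                for k in range(n_states)
            ]
        total = sum(row)
        fwd.append([x / total for x in row])

    # Backward pass, also rescaled; the scaling does not change the argmax.
    bwd = [[1.0] * n_states for _ in range(n)]
    for t in range(n - 2, -1, -1):
        row = [
            sum(trans[k][j] * b[t + 1][j] * bwd[t + 1][j] for j in range(n_states))
            for k in range(n_states)
        ]
        total = sum(row)
        bwd[t] = [x / total for x in row]

    return [
        max(range(n_states), key=lambda k: fwd[t][k] * bwd[t][k]) for t in range(n)
    ]


# Hypothetical toy model: state 0 is "marked", state 1 is "unmarked".
toy_init = [0.5, 0.5]
toy_trans = [[0.9, 0.1], [0.1, 0.9]]
toy_emit = [[0.9, 0.8], [0.1, 0.05]]  # P(mark present | state) for two marks
toy_obs = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 0)]

segmentation = posterior_decode(toy_obs, toy_init, toy_trans, toy_emit)
```

In this toy input the third bin carries only one of the two marks, yet it stays in the "marked" state because the transition probabilities favor staying in the same state as the neighboring bins.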
For data usage terms and conditions, please refer to http://www.genome.gov/27528022 and http://www.genome.gov/Pages/Research/ENCODE/ENCODEDataReleasePolicyFinal2008.pdf
|
|
|
Overall design |
ChIP-seq data from the Broad Histone (http://hgwdev.cse.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&g=wgEncodeBroadChipSeq) track was used to generate this track. Data for nine factors plus input (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=CTCF,H3K4me1,H3K4me2,H3K4me3,H3K27ac,H3K9ac,H3K36me3,H4K20me1,H3K27me3,Input) and nine cell types (http://hgwdev.cse.ucsc.edu/cgi-bin/hgEncodeVocab?term=GM12878,H1-hESC,HepG2,HUVEC,HMEC,HSMM,K562,NHEK,NHLF) was binarized separately at a 200 base pair resolution based on a Poisson background model. The chromatin states were learned from this binarized data using a multivariate Hidden Markov Model (HMM) that explicitly models the combinatorial patterns of observed modifications (Ernst and Kellis, 2010). To learn a common set of states across the nine cell types, the genomes were first concatenated across the cell types. For each of the nine cell types, each 200 base pair interval was then assigned to its most likely state under the model. Detailed information about the model parameters and state enrichments can be found in Ernst et al. (accepted). This is release 1 (Jun 2011) of this track, and it is based on the NCBI36/hg18 release of the Broad Histone (http://hgwdev.cse.ucsc.edu/cgi-bin/hgTrackUi?db=hg18&g=wgEncodeBroadChipSeq) track. This track has also been lifted over to GRCh37/hg19 (http://hgwdev.cse.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeBroadHmm). It is anticipated that the HMM methods will be run on the newer GRCh37/hg19 Broad Histone (http://hgwdev.cse.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeBroadHistone) data and will replace the lifted version.
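The binarization step described above can be sketched as follows. This is a minimal illustration, not the pipeline used for this track: it assumes the Poisson rate is the mean read count per 200 base pair bin, and it assumes a tail-probability cutoff of 1e-4, which is a hypothetical choice made for the example. The function names are likewise hypothetical.

```python
import math
from typing import List, Sequence


def poisson_sf(k: int, lam: float) -> float:
    """P(X >= k) for X ~ Poisson(lam), computed from the cumulative sum."""
    cdf = sum(math.exp(-lam) * lam**i / math.factorial(i) for i in range(k))
    return max(0.0, 1.0 - cdf)


def count_threshold(lam: float, p_cutoff: float) -> int:
    """Smallest count t such that P(X >= t) falls below p_cutoff."""
    t = 1
    while poisson_sf(t, lam) >= p_cutoff:
        t += 1
    return t


def binarize(counts: Sequence[int], p_cutoff: float = 1e-4) -> List[int]:
    """Mark each bin 1 if its read count is unlikely under the background."""
    lam = sum(counts) / len(counts)  # assumed background: mean count per bin
    t = count_threshold(lam, p_cutoff)
    return [1 if c >= t else 0 for c in counts]


# Hypothetical read counts for ten consecutive 200 bp bins of one mark.
example_counts = [0, 1, 2, 1, 0, 1, 30, 1, 0, 2]
example_calls = binarize(example_counts)
```

Each factor and cell type would be binarized separately in this way, giving one 0/1 vector per mark that the HMM then treats as its observations.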
|
Web link |
http://genome.ucsc.edu/cgi-bin/hgTrackUi?db=hg19&g=wgEncodeBroadHmm
|
|
|
Contributor(s) |
Ernst J, Kellis M, Bernstein B |
Citation missing |
BioProject |
PRJNA63443 |
|
Submission date |
May 23, 2012 |
Last update date |
May 15, 2019 |
Contact name |
ENCODE DCC |
E-mail(s) |
encode-help@lists.stanford.edu
|
Organization name |
ENCODE DCC
|
Street address |
300 Pasteur Dr
|
City |
Stanford |
State/province |
CA |
ZIP/Postal code |
94305-5120 |
Country |
USA |
|
|
Platforms (1) |
GPL9052 |
Illumina Genome Analyzer (Homo sapiens) |
|
Samples (9)
|
|
Supplementary file | Size | Download | File type/resource |
GSE38163_RAW.tar | 50.3 Mb | (http)(custom) | TAR (of BED) |
Raw data provided as supplementary file |
Processed data provided as supplementary file |
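Once the archive has been downloaded, the BED files inside it can be read without unpacking them to disk. The sketch below is written under assumptions: that each archive member is a BED file, possibly gzip-compressed, with the state label in the fourth (name) column. The member layout of GSE38163_RAW.tar is not described on this page, and the state labels in the sample lines are hypothetical.

```python
import gzip
import tarfile
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def parse_bed(lines: Iterable[str]) -> Iterator[Tuple[str, int, int, str]]:
    """Yield (chrom, start, end, name) from BED lines, skipping header lines."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith(("track", "browser", "#")):
            continue
        fields = line.split()
        if len(fields) < 4:
            continue  # no name column, so no state label to report
        yield fields[0], int(fields[1]), int(fields[2]), fields[3]


def bases_per_state(lines: Iterable[str]) -> Counter:
    """Total number of bases covered by each state label."""
    totals: Counter = Counter()
    for _chrom, start, end, name in parse_bed(lines):
        totals[name] += end - start  # BED intervals are half-open
    return totals


def iter_bed_members(tar_path: str) -> Iterator[Tuple[str, List[str]]]:
    """Yield (member name, text lines) for every regular file in the archive."""
    with tarfile.open(tar_path) as tar:
        for member in tar:
            if not member.isfile():
                continue
            raw = tar.extractfile(member).read()
            if member.name.endswith(".gz"):  # assumption: members may be gzipped
                raw = gzip.decompress(raw)
            yield member.name, raw.decode("utf-8", errors="replace").splitlines()


# Hypothetical BED lines with made-up state labels, for illustration only.
sample_lines = [
    'track name="example"',
    "chr1\t0\t200\tstate_A",
    "chr1\t200\t800\tstate_B",
    "chr1\t800\t1000\tstate_A",
]
sample_totals = bases_per_state(sample_lines)
```

With the real archive, a loop such as `for name, lines in iter_bed_members("GSE38163_RAW.tar"): print(name, bases_per_state(lines))` would summarize how much of the genome each state covers in each file.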
|
|
|
|
 |