GEO Accession viewer

Sample GSM2508012

Status

Public on Feb 27, 2017

Title

D2liver5

Sample type

SRA

Source name

Liver

Organism

Mus musculus

Characteristics

strain: DBA/2J
Sex: M
age in days: 345

Extracted molecule

genomic DNA

Extraction protocol

DNA was purified using the Qiagen AllPrep kit (http://www.qiagen.com) on the QIAcube system. We performed affinity-based enrichment using the MethylCap kit from Diagenode (https://www.diagenode.com). DNA in 110 µl TE buffer was ultrasonicated on a Covaris S2 machine (http://covarisinc.com) to an average fragment size of 300–400 bp. Enrichment was performed on 1 µg of fragmented DNA according to the standard MethylCap protocol, and captured DNA was eluted in a single step using the High Elution Buffer.
Sequencing was done on the Life Technologies Ion Proton platform at the UTHSC Molecular Resource Center. The DNA fragments were used to prepare barcoded libraries with the Ion Xpress™ Plus Fragment Library Kit and the Ion Xpress™ Barcode Kit from Life Technologies (http://www.thermofisher.com) according to the protocol supplied by the manufacturer. After library preparation, the barcoded libraries were screened on an Agilent High Sensitivity DNA chip for size distribution. As an initial step, 1 µl of each barcoded library was pooled and sequenced on an Ion Torrent PGM 314 chip. The read counts from the 314 chip were then used to prepare a final equalized pool. The final library pool was quantified by real-time polymerase chain reaction (PCR), used to prepare beads, and finally sequenced using the P1 chip on the Ion Torrent Proton sequencer. To avoid batch artifacts related to fragment library-processing steps and chip runs, we multiplexed the 11 samples and ran multiple chips to reach a depth of ~30 million raw reads per sample. In total, seven P1 chips were used.

Library strategy

MBD-Seq

Library source

genomic

Library selection

MBD2 protein methyl-CpG binding domain

Instrument model

Ion Torrent Proton

Description

MBD-protein captured

Data processing

FASTQ files were processed with FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to generate chip-level reports with statistics on quality score distributions, complexity, length distributions, and PCR duplication level. Checking these reports early helps catch technical problems that would be difficult to correct at a later stage.
Low-quality ends of reads were trimmed with Trimmomatic v0.30 (Bolger et al., 2014) using a Phred score threshold of 15 (Q15), a 10 bp sliding window and a minimum read length of 40 bp. Trimmed reads were aligned against the mouse mm10 reference genome (Ensembl GRCm38) using the TMAP (https://github.com/iontorrent/TMAP) aligner with default settings.
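The sliding-window trimming described above can be illustrated with a short Python sketch. This is a simplified illustration of the idea, not Trimmomatic's implementation: the function names are hypothetical, reads shorter than the window are assumed to pass untrimmed, and only the parameters stated above (Q15, 10 bp window, 40 bp minimum length) come from the record.

```python
def sliding_window_keep_length(quals, window=10, min_avg_q=15):
    """Return how many leading bases to keep.

    Scans from the 5' end and cuts at the start of the first window
    whose mean Phred quality falls below min_avg_q.
    Assumption: reads shorter than the window are left untrimmed.
    """
    for start in range(0, len(quals) - window + 1):
        window_quals = quals[start:start + window]
        if sum(window_quals) / window < min_avg_q:
            return start
    return len(quals)


def trim_read(seq, quals, window=10, min_avg_q=15, min_len=40):
    """Trim a read and drop it (return None) if it ends up shorter than min_len."""
    keep = sliding_window_keep_length(quals, window, min_avg_q)
    if keep < min_len:
        return None
    return seq[:keep], quals[:keep]


if __name__ == "__main__":
    # Toy read: 50 good bases followed by 20 poor ones.
    seq = "A" * 70
    quals = [30] * 50 + [5] * 20
    trimmed = trim_read(seq, quals)
    print(len(trimmed[0]) if trimmed else "read discarded")
```

In this toy read the cut falls a few bases into the low-quality tail, because a window is only rejected once its mean drops below the threshold.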
Samtools was used to sort and index the aligned (BAM) files (Li et al., 2009). Reads with a mapping quality of less than 10 were filtered from the BAM files using the ‘samtools view -q 10’ command. For each sample, there were seven high-quality BAM files (one from each of the seven separate runs), and these were merged and indexed using samtools.
The resulting BAM files were then analyzed using the MEDIPS R package (Chavez et al., 2010). Sequenced reads were counted in non-overlapping 500 bp windows, with normalization to the local CpG density (i.e., the coupling factor, CF). The following MEDIPS parameters were used: uniq=1, extend=300, shift=0, ws=500. This divided the mouse genome into 5451088 bins, each with a coverage count. The read counts generated by MEDIPS were loaded into the edgeR R package for differential methylation analysis (Robinson et al., 2010). Prior to statistical tests, we excluded the sex chromosomes. We also removed all bins with low coverage, as these provide no reliable statistics and only add to the multiple-testing penalty. We filtered out bins with a coupling factor (CF) below the genome-wide median of 3, and bins that had fewer than one count per million in 2 or more samples. This resulted in 466039 bins with an average CpG count of 9.5 (minimum of 3 and maximum of 91) and a mean count of 16.7 (minimum of 4 and maximum of 1167). We used this set of 466039 bins for statistical analysis.
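The bin-filtering rule above can be sketched in Python as follows. This is an illustrative reading of the description, not the authors' R code: the function names are hypothetical, counts per million are assumed to be computed against each sample's total read count, and a bin is assumed to be dropped when two or more samples fall below one count per million.

```python
def counts_per_million(counts, library_sizes):
    """Scale raw per-sample counts for one bin to counts per million reads."""
    return [1e6 * c / n for c, n in zip(counts, library_sizes)]


def keep_bin(coupling_factor, counts, library_sizes,
             min_cf=3, min_cpm=1.0, max_low_samples=1):
    """Decide whether a bin passes both filters.

    - The coupling factor (CpG density) must be at least min_cf.
    - At most max_low_samples samples may have CPM below min_cpm
      (assumed reading of "less than one count per million in
      2 or more samples" being grounds for removal).
    """
    if coupling_factor < min_cf:
        return False
    cpm = counts_per_million(counts, library_sizes)
    n_low = sum(1 for value in cpm if value < min_cpm)
    return n_low <= max_low_samples


if __name__ == "__main__":
    # Toy example with three samples of 10 million reads each.
    sizes = [10_000_000] * 3
    print(keep_bin(4, [20, 15, 5], sizes))  # one low sample
    print(keep_bin(4, [20, 5, 5], sizes))   # two low samples
    print(keep_bin(2, [20, 15, 20], sizes)) # CF below threshold
```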
Genome_build: mm10
Supplementary_files_format_and_content: csv file; contains columns for bin name, chromosome (chr), start, end, CouplingFactor (CpG density), followed by 11 columns with read counts for the 11 samples, and 11 columns with normalized relative methylation scores (rms) for the 11 samples; after header rows, there are 466039 rows with methylation data
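A minimal Python sketch of how one data row of the supplementary csv file might be split into its parts is given here. It assumes five annotation columns followed by one read-count column and one rms column per sample (11 samples); the field names are hypothetical, and because the number of header rows is not stated, skipping them is left to the caller.

```python
N_SAMPLES = 11
# Hypothetical field names; the actual header names are not given in this record.
ANNOTATION_FIELDS = ("bin", "chr", "start", "end", "coupling_factor")


def split_row(row, n_samples=N_SAMPLES):
    """Split one data row into annotation, read counts, and rms scores."""
    n_annot = len(ANNOTATION_FIELDS)
    expected = n_annot + 2 * n_samples
    if len(row) != expected:
        raise ValueError(f"expected {expected} columns, got {len(row)}")
    annotation = dict(zip(ANNOTATION_FIELDS, row[:n_annot]))
    annotation["start"] = int(annotation["start"])
    annotation["end"] = int(annotation["end"])
    annotation["coupling_factor"] = float(annotation["coupling_factor"])
    counts = [float(x) for x in row[n_annot:n_annot + n_samples]]
    rms = [float(x) for x in row[n_annot + n_samples:]]
    return annotation, counts, rms


if __name__ == "__main__":
    # Made-up row purely to exercise the function.
    row = (["bin1", "chr1", "3000001", "3000500", "4"]
           + [str(i) for i in range(11)]
           + [str(i / 10) for i in range(11)])
    annotation, counts, rms = split_row(row)
    print(annotation["chr"], len(counts), len(rms))
```

Rows can be fed to `split_row` from Python's standard `csv.reader` once the header rows have been skipped.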

Submission date

Feb 24, 2017

Last update date

May 15, 2019

Contact name

Khyobeni Mozhui

E-mail(s)

kmozhui@uthsc.edu

Organization name

University of Tennessee Health Science Center

Department

Preventive Medicine