Sample GSM6840555
Status Public on Jul 15, 2024
Title Adipocytes_Y_2
Sample type SRA
 
Source name Adipocytes
Organism Mus musculus
Characteristics cell type: Adipocytes
genotype: C57/Bl6 background
time: 2 months
Extracted molecule genomic DNA
Extraction protocol The tissues of 5 healthy animals were pooled per replicate for cell isolation, with 3-5 replicates (totalling 15-25 animals) collected per cell type/condition. Multi-omic bulk profiling (OMNI-ATAC-seq assay and RNA-seq) was performed on purified populations collected via fluorescence-activated cell sorting (FACS).
OMNI-ATAC-seq libraries were generated from FACS-purified cells according to the standard protocol (Corces et al., 2017). Afterward, libraries were purified using a QIAGEN MinElute PCR purification kit (QIAGEN, Cat#28004) followed by Agencourt AMPure XP beads (Beckman Coulter, Cat#A63880) according to the manufacturer’s recommendations. Library fragments ranging from 150 to 700 bp were enriched and the final elution volume was 21 µl.
 
Library strategy ATAC-seq
Library source genomic
Library selection other
Instrument model DNBSEQ-G400
 
Data processing ATAC-seq data processing and alignment were completed using the Harvard pipeline (https://informatics.fas.harvard.edu/atac-seq-guidelines.html). The GRCm38.p6/mm10 primary genome assembly used for alignment was obtained from the Ensembl database (http://asia.ensembl.org/Mus_musculus/Info/Index; ftp://ftp.ensembl.org/pub/release-101/fasta/mus_musculus/dna/Mus_musculus.GRCm38.dna.primary_assembly.fa).
All FASTQ files were trimmed to remove the Illumina Nextera Transposase adapter sequence using Cutadapt v2.4 (Martin, 2011) with the “-m 20” parameter. After trimming, FastQC v0.11.8 (https://www.bioinformatics.babraham.ac.uk/projects/fastqc/) was used to check overall sequence quality and confirm proper adapter trimming. Bowtie2 (Langmead and Salzberg, 2012) was used to align reads to the GRCm38.p6/mm10 mouse reference genome using the “-p 8 --very-sensitive” options. Picard tools (http://broadinstitute.github.io/picard/) were used to mark and remove duplicates using the MarkDuplicates tool with default options. All subsequent analyses were performed on deduplicated reads. Samtools (Li et al., 2009) was used to sort and obtain uniquely mapped reads using the “-b -q 10” options. Samtools was also used to remove reads from the mitochondrial chromosome.
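The trimming, alignment and mapping-quality filtering steps above can be sketched as command lines. This is a minimal illustration, not the submitters' actual script: only “-m 20”, “-p 8 --very-sensitive” and “-b -q 10” come from the record. Paired-end input, the Nextera adapter sequence, all file names and the function name `build_commands` are assumptions made for the example, and the FastQC, Picard and mitochondrial-read steps are omitted.

```python
import shlex

# Assumption: the commonly used Nextera transposase adapter sequence;
# the record does not state the sequence that was trimmed.
NEXTERA_ADAPTER = "CTGTCTCTTATACACATCT"


def build_commands(r1, r2, index_prefix, prefix):
    """Return argument lists for trimming, alignment and MAPQ filtering.

    Paired-end input and every file name here are hypothetical.
    """
    trimmed1 = f"{prefix}_R1.trimmed.fastq.gz"
    trimmed2 = f"{prefix}_R2.trimmed.fastq.gz"
    sam = f"{prefix}.sam"

    # Cutadapt: trim the adapter from both mates, drop reads shorter than 20 nt.
    cutadapt = ["cutadapt", "-a", NEXTERA_ADAPTER, "-A", NEXTERA_ADAPTER,
                "-m", "20", "-o", trimmed1, "-p", trimmed2, r1, r2]

    # Bowtie2: options quoted in the record are "-p 8 --very-sensitive".
    bowtie2 = ["bowtie2", "-p", "8", "--very-sensitive", "-x", index_prefix,
               "-1", trimmed1, "-2", trimmed2, "-S", sam]

    # Samtools: keep alignments with MAPQ >= 10, written as BAM ("-b -q 10").
    samtools_filter = ["samtools", "view", "-b", "-q", "10",
                       "-o", f"{prefix}.q10.bam", sam]

    return [cutadapt, bowtie2, samtools_filter]


if __name__ == "__main__":
    for cmd in build_commands("sample_R1.fastq.gz", "sample_R2.fastq.gz",
                              "GRCm38_index", "sample"):
        print(shlex.join(cmd))
```

Building argument lists rather than shell strings keeps each option inspectable before anything is executed; the lists can be passed to `subprocess.run` once the paths are real.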
Finally, for visualization purposes, BAM files were converted to the indexed binary bigWig format using bedtools (Quinlan, 2014).
Peak calling was performed in accordance with Corces et al. 2018 to ensure reproducible, high-quality, fixed-width peaks with minimal bias due to differences in read depth (range: 12,870,021 to 121,543,847) and data quality (FROT range: 5.1 to 39.7).
First, peak calling per sample was performed using the MACS2 callpeak command with the following parameters: “-g mm --shift -100 --extsize 200 --nomodel --call-summits --nolambda --keep-dup all -p 0.01” (Zhang et al., 2008; Feng et al., 2012). Next, peak summits were extended by 250 bp on both sides to a final width of 501 bp. Peaks were filtered for mm10 blacklisted regions (Amemiya et al., 2019) (https://www.encodeproject.org/annotations/ENCSR636HFF/). To avoid overlapping peaks within a sample caused by the 250 bp summit extension, we performed an iterative removal approach as in Corces et al. 2018. During this process, non-overlapping peaks are kept, as is the most significant peak within an overlap, whereas peaks directly overlapping it are discarded. The process is performed iteratively to avoid removing peaks that overlap only indirectly. This yielded a set of fixed-width peaks per sample. Next, we normalized the MACS2 “−log10(p-value)” peak significance scores by converting them to score per million (scpm) values. In brief, each individual peak score was divided by the sum of all peak scores in the given sample divided by 1 million. This scpm value corrects the original peak calling scores for sample sequencing depth and quality, as higher-quality samples yield a higher number of peaks and higher significance scores using MACS2. Thus, scores per million allow the direct comparison of peaks across biological replicates. Next, we sought to generate a cell-type-specific peak set containing all reproducible peaks within a cell type and ageing stage. For that, we generated a cumulative peak set by combining all previous fixed-width peaks from a given cell type and ageing stage and applied the same iterative approach as used per sample. Only peaks with scpm >=1 in at least two samples (min. overlap 50%) were considered further, with the exception of Adipocytes and Neurons, where scpm >=3 was selected to increase the signal-to-noise ratio because some of their samples were of lower quality (i.e. FROT <=10). Next, we wanted to identify reproducible peaks within a cell type and across ageing status. Therefore, we re-normalized the score per million per cell type and ageing status and generated a cumulative peak set, which was again trimmed for overlap following the iterative approach mentioned above. This resulted in a set of reproducible, fixed-width peaks per cell type, which we refer to as cell-type specific. Lastly, we wanted to obtain a universal peak set across cell types and ageing status to perform a pan-somatic ageing analysis. For that, we re-normalized the scpm values per cell-type-specific peak set. Then, the 23 peak sets were combined and, again, we performed the iterative approach to remove overlapping peaks across conditions. This resulted in a universal peak set of 460K peaks, each 501 bp in width.
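The summit extension, iterative overlap removal and score-per-million normalization described above can be illustrated with a small sketch. It is written from the prose description only and is not the submitters' code: the `Peak` tuple, the function names and the toy summits are hypothetical, coordinates are assumed to be 0-based half-open, and blacklist filtering and chromosome-end handling are left out.

```python
from collections import namedtuple

# Hypothetical peak record; coordinates are 0-based, half-open.
Peak = namedtuple("Peak", "chrom start end score")


def extend_summit(chrom, summit, score, flank=250):
    """Fixed-width peak of 2*flank + 1 bp (501 bp) centred on the summit.

    Summits closer than `flank` to a chromosome end are not handled here.
    """
    return Peak(chrom, summit - flank, summit + flank + 1, score)


def overlaps(a, b):
    """True if two peaks share at least one base on the same chromosome."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def iterative_overlap_removal(peaks):
    """Keep the most significant peak, drop peaks directly overlapping it,
    then repeat on what is left."""
    remaining = sorted(peaks, key=lambda p: p.score, reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)
        kept.append(best)
        remaining = [p for p in remaining if not overlaps(best, p)]
    return sorted(kept, key=lambda p: (p.chrom, p.start))


def to_scpm(peaks):
    """Divide each score by (sum of all scores / 1 million)."""
    scale = sum(p.score for p in peaks) / 1e6
    return [p._replace(score=p.score / scale) for p in peaks]


# Toy summits (chrom, position, -log10 p-value); values are made up.
summits = [("chr1", 1000, 50.0), ("chr1", 1200, 30.0), ("chr1", 1600, 10.0)]
peaks = [extend_summit(*s) for s in summits]
kept = iterative_overlap_removal(peaks)
normalized = to_scpm(kept)
```

In the toy input the peak at 1200 overlaps the stronger peak at 1000 and is dropped, while the peak at 1600 overlaps only the dropped peak and therefore survives. This is the indirect-overlap case that the iterative procedure is meant to preserve.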
Assembly: mm10
Supplementary files format and content: bigWig (.bw), narrowPeak, summits.bed, BED format with score per million (scpm.bed), BED and GTF formats, as well as the counts within the universal peak set.
 
Submission date Dec 15, 2022
Last update date Jul 15, 2024
Contact name Christian Nefzger
Organization name The University of Queensland
Department Institute for Molecular Biology
Lab Cellular Reprogramming and Ageing
Street address 306 Carmody Road
City Brisbane
State/province Queensland
ZIP/Postal code 4067
Country Australia
 
Platform ID GPL28457
Series (2)
GSE221034 Bulk Omni-ATAC-seq in young (2 months) and aged (22-24 months) mice across 23 cell-types
GSE223050 The activity of early-life gene regulatory elements is hijacked in aging through pervasive AP-1–linked chromatin opening
Relations
BioSample SAMN32298256
SRA SRX18756632

Supplementary file                         Size      File type/resource
GSM6840555_Adipocytes_Y_2.bw               344.8 Mb  BW
GSM6840555_Adipocytes_Y_2.narrowPeak.gz    2.7 Mb    NARROWPEAK
GSM6840555_Adipocytes_Y_2.scpm.bed.gz      2.1 Mb    BED
GSM6840555_Adipocytes_Y_2.summits.bed.gz   2.0 Mb    BED
Raw data are available in SRA
Processed data provided as supplementary file
