Sample GSM4974126
Status Public on Dec 15, 2020
Title RM01_GZ_CTCF_Input
Sample type SRA
Source name Brain
Organism Macaca mulatta
Characteristics tissue: Brain
developmental stage: E84
genotype: wild type
Extracted molecule genomic DNA
Extraction protocol Cells were fixed with formaldehyde (Sigma-Aldrich, F1635) at a final concentration of 1% and mixed for 10 min at room temperature. Fixation was quenched with 2.5 M glycine at room temperature for 5 min, and the cells were immediately centrifuged at 800 g for 10 min. The supernatant was removed, and the cell pellet was frozen in liquid nitrogen and stored at −80°C until further use.
Five million fixed cells were lysed for 10 min and then sonicated for 20 cycles using a Bioruptor (Diagenode). After sonication, chromatin was precleared using Protein A Dynabeads. CTCF primary antibody-bound beads were added to the precleared chromatin and incubated overnight at 4°C with rotation. The beads were then washed and the DNA was extracted. End repair, A-tailing, adaptor ligation, PCR amplification, and size selection of the DNA products were then performed step by step.
Library strategy ChIP-Seq
Library source genomic
Library selection ChIP
Instrument model Illumina HiSeq 2000
Data processing Hi-C: Paired-end reads of the Hi-C libraries were pre-processed using HiC-Pro. In brief, read pairs were aligned to the reference genome (rheMac8) in two steps. The first step aligned reads using the bowtie2 end-to-end algorithm. The second step detected the ligation sites, and the 5' fractions of the reads were aligned back to the genome. Aligned read pairs were assigned to DpnII restriction fragments. Invalid pairs, such as dangling ends, were filtered out. The genome was divided into bins of fixed size, and valid pairs were counted per bin. We used bin sizes of 500 kb, 100 kb, and 40 kb to generate raw and ICE- or KR-normalized matrices.
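The binning step described above can be illustrated with a minimal sketch. This is not HiC-Pro code; the function name `bin_valid_pairs`, the tuple layout of a valid pair, and the example coordinates are all assumptions made for illustration, and the ICE/KR normalization is not shown.

```python
from collections import Counter

def bin_valid_pairs(pairs, bin_size):
    """Count valid pairs per pair of fixed-size genomic bins.

    `pairs` is an iterable of (chrom1, pos1, chrom2, pos2) tuples
    (hypothetical layout). Returns a Counter keyed by an ordered pair of
    (chrom, bin_index) tuples, i.e. a sparse raw contact matrix.
    """
    counts = Counter()
    for chrom1, pos1, chrom2, pos2 in pairs:
        a = (chrom1, pos1 // bin_size)
        b = (chrom2, pos2 // bin_size)
        # Store each contact once, regardless of mate order.
        key = (a, b) if a <= b else (b, a)
        counts[key] += 1
    return counts

# Made-up valid pairs, binned at 500 kb (one of the resolutions listed above).
example_pairs = [
    ("chr1", 120_000, "chr1", 980_000),
    ("chr1", 450_000, "chr1", 130_000),
    ("chr1", 40_000, "chr2", 510_000),
]
raw_500kb = bin_valid_pairs(example_pairs, 500_000)
```

Calling the same function with 100_000 or 40_000 would give the finer-resolution raw matrices under the same assumptions.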
RNA-seq: The intra-species RNA-seq FASTQ reads were aligned to the Mmul_8.0.1 reference genome using STAR. ENSEMBL gene annotation was used. Duplicated alignments were removed using PicardTools. The number of alignments mapped to exons was counted using the summarizeOverlaps function from the GenomicAlignments package. Conditional quantile normalization was performed using the CQN package to correct for read depth, exon length, and GC content. Log2-normalized FPKM was used to quantify expression levels.
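For orientation, the standard FPKM definition behind the final quantity can be sketched as follows. This is a plain log2 FPKM calculation, not the conditional quantile normalization performed by CQN; the function name `log2_fpkm` and the pseudocount of 1 are assumptions, since the record does not state how zero counts were handled.

```python
import math

def log2_fpkm(count, exon_length_bp, total_mapped, pseudocount=1.0):
    """Return log2(FPKM + pseudocount) for one gene.

    FPKM = count * 1e9 / (exon_length_bp * total_mapped).
    The pseudocount is an assumption to keep log2 defined for zero counts.
    """
    fpkm = count * 1e9 / (exon_length_bp * total_mapped)
    return math.log2(fpkm + pseudocount)

# Made-up numbers: 300 alignments on 2 kb of exons, 10 million mapped alignments.
example_value = log2_fpkm(300, 2_000, 10_000_000)
```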
ChIP-seq: The FASTQ reads were aligned to the reference genome (rheMac8) using bowtie2 with settings “bowtie2 --very-sensitive”. Bam files were converted to sam files using Samtools. Duplicated alignments were removed using PicardTools. To avoid double counting of paired-end alignments, only one mate was counted. Only uniquely mapped alignments (MAPQ > 30) were kept. Average reads per million mapped reads (RPM) in 10 bp bins were calculated with deepTools and converted to coverage tracks. Data from technical replicates were merged. Peaks were identified using MACS2 (for CTCF ChIP-seq) with settings “macs2 callpeak -B -p 1e-5 --nomodel --extsize 200” or HOMER (for DNase-seq) with settings "-size 147 -gsize 2.7e09 -norm 1000000 -fragLength 147 -fdr 0.01 -style factor".
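The MAPQ filter and the RPM coverage in 10 bp bins can be illustrated with a simplified sketch. It does not reproduce deepTools; the function name `rpm_coverage`, the (chrom, start, end, mapq) tuple layout with half-open coordinates, and the example alignments are assumptions made for illustration.

```python
from collections import defaultdict

def rpm_coverage(alignments, bin_size=10, min_mapq=30):
    """Reads-per-million coverage in fixed-size bins.

    `alignments` is a list of (chrom, start, end, mapq) tuples with
    half-open [start, end) coordinates (hypothetical layout). Alignments
    with MAPQ <= min_mapq are dropped; each kept alignment adds
    1e6 / (number of kept alignments) to every bin it overlaps.
    """
    kept = [a for a in alignments if a[3] > min_mapq]
    scale = 1_000_000 / len(kept) if kept else 0.0
    bins = defaultdict(float)
    for chrom, start, end, _mapq in kept:
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            bins[(chrom, b * bin_size)] += scale
    return dict(bins)

# Made-up alignments; the third fails the MAPQ > 30 filter.
example_alignments = [
    ("chr1", 0, 25, 42),
    ("chr1", 15, 40, 60),
    ("chr1", 100, 125, 10),
]
track = rpm_coverage(example_alignments)
```

The resulting dictionary maps each bin start to its RPM value; writing it out as a bigWig (.bw) track would require an additional library and is not shown.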
Genome_build: Mmul_8.0.1
Supplementary_files_format_and_content: Hi-C: (1) .hic files represent all the valid pairs per sample and can be used for further analysis
Supplementary_files_format_and_content: (2) .bed files represent all the boundaries per sample generated by insulation score method
Supplementary_files_format_and_content: (3) .bedpe files represent all the loops per sample generated by HICCUPS
Supplementary_files_format_and_content: RNA-seq: matrix table text file represents the log2 FPKM value per sample
Supplementary_files_format_and_content: ChIP-seq: (1) .bw files represent the read distribution
Supplementary_files_format_and_content: (2) .txt files represent detected peaks using MACS2 or HOMER
Submission date Dec 14, 2020
Last update date Dec 19, 2020
Contact name Yuting Liu
Organization name Peking University
Department School of Life Science
Lab Cheng Li
Street address Haidian District, Beijing Summer Palace Road No. 5
City Beijing
State/province Beijing
ZIP/Postal code 100871
Country China
Platform ID GPL14954
Series (1)
GSE163177 3D Genome of macaque fetal brain reveals evolutionary innovations during primate corticogenesis
BioSample SAMN17078341
SRA SRX9684052

Supplementary file Size Download File type/resource 179.6 Mb (ftp)(http) BW
Raw data are available in SRA
Processed data provided as supplementary file

