Sample GSM7767711
Status Public on Jul 11, 2024
Title mRNA_mESC_WT/D0_Ctrl_Rep1
Sample type SRA
 
Source name mESC
Organism Mus musculus
Characteristics cell line: mESC
cell type: Mouse embryonic stem cells
genotype: WT
treatment: control
fraction: polyA RNA
Growth protocol The mESC cell line was cultured in a complete ESC culture medium, comprising Dulbecco's modified Eagle's medium (DMEM) supplemented with 15% heat-inactivated fetal bovine serum (FBS), 0.1 mM 2-mercaptoethanol, 2 mM L-glutamine, 0.1 mM non-essential amino acids, 1000 U/mL recombinant leukemia inhibitory factor (LIF), and 100 U/mL penicillin/streptomycin. For differentiation assays, mESC cells were cultured without LIF for six days. HeLa and HEK 293T cell lines were cultured in DMEM supplemented with 10% FBS and 100 U/mL penicillin/streptomycin.
Extracted molecule polyA RNA
Extraction protocol Total RNA was extracted from the cell lines using the RNA isolater total RNA extraction reagent (Vazyme, R401-01). Prior to size selection, RNA was treated with DNase I at 37 °C for 15 min and then purified by RNA isolater extraction and ethanol precipitation (Vazyme, EN401-01). The small RNA fraction (< 200 nt) was enriched from total RNA using the RNA Clean & Concentrator-25 Kit (Zymo, R1018). For mRNA purification, two successive rounds of poly(A)+ selection were performed using oligo(dT)25 Dynabeads (Life Technologies, 61005). Additionally, chromatin-associated RNA was isolated by subcellular fractionation followed by DNase I treatment and rRNA depletion (Vazyme, N406-01). Modification-free control RNAs were prepared from HeLa or mESC mRNA following previously published protocols (PMIDs: 34594034, 36593412).
The RNA samples were first dephosphorylated using T4 polynucleotide kinase (Vazyme, N102-01) at 37 °C for 1 hour, followed by ethanol precipitation. They were then ligated to the 3′ RNA linker using truncated T4 RNA ligase 2 at 25 °C for 2 hours, followed by an overnight incubation at 16 °C (Beyotime, R0635L). To remove excess adaptors, the samples were treated with 5′ Deadenylase (NEB, M0331S), RecJf (NEB, M0264S), and Proteinase K (NEB, P8102). RNA samples intended for group comparisons were pooled and then purified by Tris-phenol extraction followed by ethanol precipitation. The barcoded and pooled samples were labeled with 5-(2-azidoethyl)-1,3-indandione (AI) and then subjected to a click reaction. The m5C-modified RNA was enriched by RNA pull-down and then reverse transcribed using the HiScript III RT SuperMix (Vazyme, R323-01). The excess reverse transcription primer was digested with exonuclease I (NEB, M0293S). The cDNA was purified with silane beads and then subjected to on-bead 5′ adapter ligation with T4 RNA ligase 1, high concentration (NEB, M0437M). The cDNA was purified again with silane beads (Invitrogen, 37002D) and amplified using the KAPA HiFi HotStart ReadyMix (Roche, KK2602). Finally, PCR products were purified using 1.8X VAHTS DNA Clean Beads (Vazyme, N411-02) and a 7% TBE gel.
 
Library strategy OTHER
Library source transcriptomic
Library selection other
Instrument model DNBSEQ-T7
 
Data processing For batched paired-end reads of m5C-seq libraries, Read1 carries a sample barcode (positions 1-6) followed by four random bases (positions 7-10). A 10 nt unique molecular identifier (UMI) is included in the adaptor ligated to the 3′ end of the cDNA, so it is located in Read2 (positions 1-10). First, the batched m5C-seq libraries were demultiplexed into individual samples using fastq-multx (with the -m 1 parameter), which distributes reads into per-sample FASTQ files (Read1.fq and Read2.fq) according to the sample barcode in Read1 (positions 1-6). For each sample, only the Read2 data containing the UMI were used in subsequent analyses. Sequencing reads were first processed with trim_galore (version 0.6.4) for adapter removal and quality trimming, using the command: trim_galore --fastqc --phred33 --length 30. Seqkit (version 0.10.0) was then used to deduplicate reads based on the 10 bp UMI at the 5′ end of Read2, with the key parameters: seqkit rmdup -s. Finally, fastp (version 0.23.2) was used to remove the 10 bp UMI from the deduplicated reads; 10 bases at the 3′ end of the insert sequences were also removed, with the key parameters: fastp -f 10 -t 10 -l 20.
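The deduplication and end-trimming steps above can be sketched in plain Python. This is only an illustration of the logic, not the submitters' pipeline: it assumes that `seqkit rmdup -s` keeps the first read of each identical sequence (UMI included) and that `fastp -f 10 -t 10 -l 20` trims 10 bases from each end and discards reads shorter than 20 nt afterwards. The function names and the in-memory read tuples are hypothetical.

```python
from typing import Iterable, Iterator, Tuple

# A read is represented as (name, sequence, quality); this layout is an
# illustrative assumption, not a format used by the tools named above.
Read = Tuple[str, str, str]


def dedup_by_sequence(reads: Iterable[Read]) -> Iterator[Read]:
    """Keep the first read of each identical sequence.

    Because the 10 nt UMI is still at the 5' end of Read2 at this stage,
    identical sequences share both UMI and insert.
    """
    seen = set()
    for name, seq, qual in reads:
        if seq in seen:
            continue
        seen.add(seq)
        yield name, seq, qual


def trim_ends(reads: Iterable[Read], front: int = 10, tail: int = 10,
              min_len: int = 20) -> Iterator[Read]:
    """Remove `front` bases (the UMI) and `tail` bases, then drop short reads."""
    for name, seq, qual in reads:
        end = len(seq) - tail
        trimmed_seq, trimmed_qual = seq[front:end], qual[front:end]
        if len(trimmed_seq) >= min_len:
            yield name, trimmed_seq, trimmed_qual


if __name__ == "__main__":
    long_seq = "A" * 10 + "C" * 25 + "G" * 10
    example = [
        ("r1", long_seq, "I" * 45),
        ("r2", long_seq, "I" * 45),  # duplicate of r1 (same UMI and insert)
        ("r3", "T" * 10 + "C" * 5 + "G" * 10, "I" * 25),  # too short after trimming
    ]
    for read in trim_ends(dedup_by_sequence(example)):
        print(read[0], read[1])
```

Deduplicating before the UMI is removed matters here: once the UMI is trimmed, PCR duplicates and independent molecules with the same insert can no longer be told apart.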
Processed reads were mapped to the reference genome (human hg19 (GRCh37) or mouse mm10) using HISAT2-3N (version 2.2.1-3n-0.0.3). The key parameters were: hisat3n --base-change C,T --directional-mapping --repeat -p 20 -k 1. To retain only credible m5C signal, the alignments were further filtered with in-house Python scripts: alignments with >= 3 C-to-T mismatches, >= 3 mismatches of other types, or an aligned length < 20 nt were discarded. The BAM files were then parsed with pysamstats (version 1.1.2), using the command: pysamstats -D 100000 -S "nofilter" -t variation.
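The in-house filtering scripts are not part of this record. A minimal sketch of the stated rule, assuming ungapped alignments given as two equal-length strings (reference and read) in the orientation where the conversion appears as C-to-T, might look like the following. The function names are hypothetical.

```python
from typing import Tuple


def count_mismatches(ref: str, read: str) -> Tuple[int, int]:
    """Return (C-to-T mismatches, all other mismatches) for an ungapped alignment."""
    c_to_t = 0
    other = 0
    for ref_base, read_base in zip(ref.upper(), read.upper()):
        if ref_base == read_base:
            continue
        if ref_base == "C" and read_base == "T":
            c_to_t += 1
        else:
            other += 1
    return c_to_t, other


def keep_alignment(ref: str, read: str, max_c_to_t: int = 3,
                   max_other: int = 3, min_len: int = 20) -> bool:
    """Discard alignments with >= 3 C-to-T mismatches, >= 3 other
    mismatches, or an aligned length below 20 nt."""
    if len(read) < min_len:
        return False
    c_to_t, other = count_mismatches(ref, read)
    return c_to_t < max_c_to_t and other < max_other


if __name__ == "__main__":
    reference = "ACGTACGTACGTACGTACGTACGT"
    print(keep_alignment(reference, "ATGTATGTACGTACGTACGTACGT"))  # two C-to-T
    print(keep_alignment(reference, "ATGTATGTATGTACGTACGTACGT"))  # three C-to-T
```

A real implementation would read mismatches from the BAM records (for example from the CIGAR and MD information) and would also handle reads mapped to the opposite strand, where the same conversion appears as G-to-A.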
To call m5C signal, the C-to-T rate of each C nucleotide in the reference sequences was calculated for both the treated (TET2+) and control (TET2-) samples. Two parameters were defined to evaluate the change in C-to-T rate between treated and control samples: the difference in C-to-T rate (Diff) and the fold change in C-to-T rate (FC). Diff was calculated by subtracting the C-to-T rate in the control sample from that in the corresponding treated sample, while FC was calculated by dividing the C-to-T rate in the treated sample by that in the control sample. A C position was identified as an m5C site when the following criteria were met: (1) coverage in the treated (labeled) sample >= 20; (2) coverage in the control sample >= 5; (3) number of C-to-T reads >= 5 in both treated replicates and >= 10 in at least one treated replicate; (4) Diff (treated sample - control sample) >= 5% in both replicates; (5) FC (treated sample / control sample) >= 3 in both replicates; (6) mismatch ratio <= 5% in both control samples; (7) number of mismatches in the region around the C position (±50 nt) <= 6 in the treated sample. To produce the final m5C list, the site lists from the two samples were intersected, and only sites present in both were retained.
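The site-calling criteria above can be expressed as a small Python check. This is a sketch under several assumptions that the record does not spell out: the coverage thresholds are applied to each replicate, the control "mismatch ratio" is taken to be the control C-to-T rate, and FC is treated as infinite when the control rate is zero. The `ReplicateCounts` class and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ReplicateCounts:
    """Per-site counts for one treated/control replicate pair (illustrative)."""
    treat_cov: int          # coverage in the treated (labeled) sample
    treat_c2t: int          # C-to-T reads in the treated sample
    ctrl_cov: int           # coverage in the control sample
    ctrl_c2t: int           # C-to-T reads in the control sample
    nearby_mismatches: int  # mismatches within +/-50 nt in the treated sample


def rate(c2t: int, cov: int) -> float:
    return c2t / cov if cov > 0 else 0.0


def passes_replicate(rep: ReplicateCounts) -> bool:
    """Criteria (1), (2), the >= 5 part of (3), and (4)-(7) for one replicate."""
    if rep.treat_cov < 20 or rep.ctrl_cov < 5:
        return False
    if rep.treat_c2t < 5:
        return False
    treat_rate = rate(rep.treat_c2t, rep.treat_cov)
    ctrl_rate = rate(rep.ctrl_c2t, rep.ctrl_cov)
    if treat_rate - ctrl_rate < 0.05:          # Diff = treated - control
        return False
    # Assumption: a zero control rate counts as an infinite fold change.
    fold_change = treat_rate / ctrl_rate if ctrl_rate > 0 else float("inf")
    if fold_change < 3:                        # FC = treated / control
        return False
    if ctrl_rate > 0.05:
        return False
    return rep.nearby_mismatches <= 6


def is_m5c_site(rep1: ReplicateCounts, rep2: ReplicateCounts) -> bool:
    """A site must pass in both replicates, with >= 10 C-to-T reads in one."""
    if max(rep1.treat_c2t, rep2.treat_c2t) < 10:
        return False
    return passes_replicate(rep1) and passes_replicate(rep2)


if __name__ == "__main__":
    strong = ReplicateCounts(100, 30, 50, 1, 2)
    weak = ReplicateCounts(40, 6, 10, 0, 1)
    print(is_m5c_site(strong, weak))
```

Requiring both replicates to pass is equivalent to intersecting the two per-replicate site lists, as described for the final m5C list.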
To call m5C signal in KO samples, the final m5C list was used as a reference to look up the C-to-T rate at each site in the KO samples. The fold change and difference in C-to-T rate between KO and WT samples were then calculated. A site was considered enzyme-dependent when its C-to-T rate in the KO sample decreased to less than 60% of that in the WT sample and the C-to-T rate difference (treat sample - control sample) was > 1.5%. To produce the final list of KO-dependent sites, the lists from the two KO replicates were intersected, and only sites present in both were retained.
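The KO comparison can be sketched as follows. The record does not fully specify which two rates enter the 1.5% difference, so this example assumes it is the WT rate minus the KO rate at the same site; that choice, the site keys of the form (chromosome, position, strand), and the function names are all illustrative assumptions. Sites absent from a KO replicate are simply skipped.

```python
from typing import Dict, Set, Tuple

Site = Tuple[str, int, str]  # (chromosome, position, strand); hypothetical key


def is_ko_dependent(wt_rate: float, ko_rate: float,
                    max_ratio: float = 0.6, min_diff: float = 0.015) -> bool:
    """KO rate below 60% of the WT rate and a rate drop above 1.5 points.

    Assumption: the difference is taken as WT rate minus KO rate.
    """
    if wt_rate <= 0:
        return False
    return (ko_rate / wt_rate) < max_ratio and (wt_rate - ko_rate) > min_diff


def ko_dependent_sites(wt_rates: Dict[Site, float],
                       ko_rep1: Dict[Site, float],
                       ko_rep2: Dict[Site, float]) -> Set[Site]:
    """Call sites per KO replicate, then keep only those found in both."""
    per_replicate = []
    for ko_rates in (ko_rep1, ko_rep2):
        called = {
            site for site, wt_rate in wt_rates.items()
            if site in ko_rates and is_ko_dependent(wt_rate, ko_rates[site])
        }
        per_replicate.append(called)
    return per_replicate[0] & per_replicate[1]


if __name__ == "__main__":
    wt = {("chr1", 100, "+"): 0.40, ("chr1", 200, "+"): 0.30,
          ("chr2", 50, "-"): 0.02}
    ko1 = {("chr1", 100, "+"): 0.10, ("chr1", 200, "+"): 0.25,
           ("chr2", 50, "-"): 0.01}
    ko2 = {("chr1", 100, "+"): 0.12, ("chr1", 200, "+"): 0.10,
           ("chr2", 50, "-"): 0.01}
    print(ko_dependent_sites(wt, ko1, ko2))
```

In this toy input the second site passes in only one KO replicate and the third has too small an absolute drop, so only the first site survives the intersection.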
Assembly: hg19(UCSC)/mm10(UCSC)/tRNA-reference(tRNAdb)
Supplementary files format and content: tab-delimited text file containing the coverage and C-to-T signal of every C base
Supplementary files format and content: Excel file in XLS format
Library strategy: m5C-seq
 
Submission date Sep 08, 2023
Last update date Jul 11, 2024
Contact name Xiaoyu Li
E-mail(s) xiaoyu_li@zju.edu.cn
Organization name Zhejiang University
Department SCHOOL OF BASIC MEDICAL SCIENCES
Lab Li Lab
Street address 866 Yuhangtang Road
City Hangzhou
State/province Zhejiang
ZIP/Postal code 310058
Country China
 
Platform ID GPL28330
Series (1)
GSE242724 Base-resolution m5C profiling across the mammalian transcriptome by bisulfite-free enzyme-assisted chemical labeling approach
Relations
BioSample SAMN37327683
SRA SRX21677413

Supplementary file Size Download File type/resource
GSM7767711_D0_Ctrl_rep1.merge.mpmat.simple.tidy.txt.gz 217.7 Mb (ftp)(http) TXT
Raw data are available in SRA
