GEO Accession viewer

Sample GSM7767711

Status

Public on Jul 11, 2024

Title

mRNA_mESC_WT/D0_Ctrl_Rep1

Sample type

SRA

Source name

mESC

Organism

Mus musculus

Characteristics

cell line: mESC
cell type: Mouse embryonic stem cells
genotype: WT
treatment: control
fraction: polyA RNA

Growth protocol

The mESC line was cultured in complete ESC culture medium: Dulbecco's modified Eagle's medium (DMEM) supplemented with 15% heat-inactivated fetal bovine serum (FBS), 0.1 mM 2-mercaptoethanol, 2 mM L-glutamine, 0.1 mM non-essential amino acids, 1000 U/mL recombinant leukemia inhibitory factor (LIF), and 100 U/mL penicillin/streptomycin. For differentiation assays, mESCs were cultured without LIF for six days. HeLa and HEK 293T cell lines were cultured in DMEM supplemented with 10% FBS and 100 U/mL penicillin/streptomycin.

Extracted molecule

polyA RNA

Extraction protocol

Total RNA was extracted from the cell lines using the RNA isolater total RNA extraction reagent (Vazyme, R401-01). Before size selection, RNA was treated with DNase I at 37 °C for 15 min and then purified by RNA isolater extraction and ethanol precipitation (Vazyme, EN401-01). The small RNA fraction (< 200 nt) was enriched from total RNA using the RNA Clean & Concentrator-25 Kit (Zymo, R1018). For mRNA purification, two successive rounds of poly(A)+ selection were performed with oligo(dT)25 Dynabeads (Life Technologies, 61005). Chromatin-associated RNA was isolated after subcellular fractionation, DNase I treatment, and rRNA depletion (Vazyme, N406-01). Modification-free control RNAs were prepared from HeLa or mESC mRNA following previously published protocols (PMID: 34594034, 36593412).
The RNA samples were first dephosphorylated using T4 polynucleotide kinase (Vazyme, N102-01) at 37 °C for 1 hour, followed by ethanol precipitation. They were then ligated to the 3′ RNA linker using T4 RNA ligase 2, truncated (Beyotime, R0635L), at 25 °C for 2 hours followed by overnight incubation at 16 °C. To remove excess adaptors, the samples were treated with 5′ Deadenylase (NEB, M0331S), RecJf (NEB, M0264S), and Proteinase K (NEB, P8102). RNA samples intended for group comparisons were pooled and then purified by Tris-phenol extraction followed by ethanol precipitation. The barcoded, pooled samples were labeled with 5-(2-azidoethyl)-1,3-indandione (AI) and then subjected to a click reaction. The m5C-modified RNA was enriched by RNA pull-down and reverse transcribed using the HiScript III RT SuperMix (Vazyme, R323-01). Excess reverse transcription primer was digested with exonuclease I (NEB, M0293S). The cDNA was purified with silane beads and then subjected to on-bead 5′ adapter ligation with T4 RNA ligase 1, high concentration (NEB, M0437M). The cDNA was purified again using silane beads (Invitrogen, 37002D) and amplified using the KAPA HiFi HotStart ReadyMix (Roche, KK2602). Finally, PCR products were purified using 1.8X VAHTS DNA Clean Beads (Vazyme, N411-02) and a 7% TBE gel.

Library strategy

OTHER

Library source

transcriptomic

Library selection

other

Instrument model

DNBSEQ-T7

Data processing

In the batch paired-end m5C-seq libraries, Read1 carries a sample barcode (positions 1-6) followed by four random bases (positions 7-10), and Read2 begins with a 10 nt unique molecular identifier (UMI, positions 1-10), which is included in the adaptor ligated to the 3′ end of the cDNA. First, the batch libraries were demultiplexed into individual samples using fastq-multx (with -m 1), which distributes reads into per-sample FASTQ files (Read1.fq and Read2.fq) according to the sample barcode in Read1 (positions 1-6). For each sample, only the Read2 data containing the UMI were used in subsequent analyses. Illumina sequencing reads were first processed with trim_galore (version 0.6.4) for adapter removal and quality trimming, using the command: trim_galore --fastqc --phred33 --length 30. Reads were then deduplicated with SeqKit (version 0.10.0) based on the 10 bp UMI at the 5′ end of Read2, using the key parameters: seqkit rmdup -s. Finally, fastp (version 0.23.2) was used to remove the 10 bp UMI from the deduplicated reads; 10 bases at the 3′ end of the insert were also removed, using the key parameters: fastp -f 10 -t 10 -l 20.
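The deduplication and trimming steps above can be illustrated with a minimal Python sketch. It is not the submitters' pipeline: it only mimics the effect of `seqkit rmdup -s` (dropping reads whose full sequence, UMI included, has already been seen) and `fastp -f 10 -t 10 -l 20` (trimming 10 bases from each end and discarding reads shorter than 20 nt). The function name `dedup_and_trim` and the in-memory `(name, sequence)` input are assumptions made for illustration.

```python
def dedup_and_trim(reads, umi_len=10, tail_len=10, min_len=20):
    """Collapse exact-duplicate Read2 sequences, then strip the UMI and 3' tail.

    reads: iterable of (name, sequence) tuples (hypothetical in-memory input).
    Returns a list of (name, trimmed_sequence) for the reads that are kept.
    """
    seen = set()
    kept = []
    for name, seq in reads:
        # Identical sequence means identical UMI + insert: treat as a duplicate.
        if seq in seen:
            continue
        seen.add(seq)
        # Remove the 5' UMI and the same number of bases from the 3' end.
        insert = seq[umi_len:len(seq) - tail_len]
        # Discard reads that are too short after trimming.
        if len(insert) >= min_len:
            kept.append((name, insert))
    return kept


if __name__ == "__main__":
    example = [
        ("r1", "A" * 10 + "C" * 25 + "G" * 10),
        ("r2", "A" * 10 + "C" * 25 + "G" * 10),  # exact duplicate of r1
        ("r3", "T" * 10 + "C" * 5 + "G" * 10),   # too short after trimming
    ]
    print(dedup_and_trim(example))
```

Because the UMI sits at the start of Read2, comparing whole sequences before trimming is what lets identical inserts with different UMIs survive as separate molecules.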
Processed reads were mapped to the reference genome (human hg19/GRCh37 or mouse mm10) using HISAT2-3N (version 2.2.1-3n-0.0.3) with the key parameters: hisat3n --base-change C,T --directional-mapping --repeat -p 20 -k 1. To retain credible m5C signals, alignments were further filtered with in-house Python scripts: alignments with >= 3 C-to-T mismatches, >= 3 mismatches of other types, or an aligned length < 20 nt were discarded. Next, the BAM files were parsed with pysamstats (version 1.1.2) using the command: pysamstats -D 100000 -S "nofilter" -t variation.
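The in-house filtering scripts are not part of this record. As a rough sketch of the stated rule only, the function below takes a read and its reference segment as two equal-length, gap-free strings (an assumption; real alignments would need CIGAR/MD parsing and strand handling) and rejects the alignment when it has 3 or more C-to-T mismatches, 3 or more other mismatches, or is shorter than 20 nt. The name `passes_mismatch_filter` and its parameters are hypothetical.

```python
def passes_mismatch_filter(read_seq, ref_seq, c2t_limit=3, other_limit=3, min_len=20):
    """Return True if the alignment survives the mismatch and length filter.

    Assumes read_seq and ref_seq are already aligned base-to-base without gaps
    and are oriented so that converted cytosines read as reference C, read T.
    """
    if len(read_seq) != len(ref_seq):
        raise ValueError("read and reference segments must have equal length")
    if len(read_seq) < min_len:
        return False
    # Count reference C positions that read as T.
    c2t = sum(1 for ref, base in zip(ref_seq, read_seq) if ref == "C" and base == "T")
    # All remaining mismatches are counted as 'other'.
    total = sum(1 for ref, base in zip(ref_seq, read_seq) if ref != base)
    other = total - c2t
    return c2t < c2t_limit and other < other_limit


if __name__ == "__main__":
    ref = "ACGT" * 6
    print(passes_mismatch_filter(ref, ref))                       # perfect match
    print(passes_mismatch_filter(ref.replace("C", "T", 3), ref))  # three C-to-T mismatches
```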
To call m5C signals, the C-to-T rate of each C nucleotide in the reference sequences was calculated for both the treat (TET2+) and control (TET2-) samples. Two parameters were defined to evaluate the change in C-to-T rate between the treat and control samples: the difference in C-to-T rate (Diff) and the fold change of C-to-T rate (FC). Diff was calculated by subtracting the C-to-T rate in the control sample from that in the corresponding treat sample, and FC was calculated by dividing the C-to-T rate in the treat sample by that in the control sample. A C position was identified as an m5C site when the following criteria were met: (1) coverage >= 20 in the labeled (treat) sample; (2) coverage >= 5 in the control sample; (3) number of C-to-T reads >= 5 in both treat replicates and >= 10 in at least one treat replicate; (4) Diff (treat sample - control sample) >= 5% in both replicates; (5) FC (treat sample/control sample) >= 3 in both replicates; (6) mismatch ratio <= 5% in both control samples; (7) number of mismatches within ±50 nt of the C position <= 6 in the treat sample. To produce the final m5C list, the site lists from the two samples were intersected, and only sites present in both were retained.
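The seven criteria can be written as a single predicate. The sketch below is illustrative rather than the submitters' code, and several details are assumptions: treat and control replicates are paired by index, the coverage thresholds apply to every replicate, a control rate of zero is treated as passing the fold-change test, and the count of mismatches within ±50 nt is supplied by the caller. The names `SiteCounts` and `is_m5c_site` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class SiteCounts:
    coverage: int  # reads covering the C position
    c2t: int       # reads showing C-to-T at that position

    @property
    def rate(self):
        return self.c2t / self.coverage if self.coverage else 0.0


def is_m5c_site(treat, ctrl, nearby_mismatches):
    """treat, ctrl: lists of SiteCounts, one per replicate, paired by index.
    nearby_mismatches: mismatch counts within +/-50 nt, one per treat replicate.
    """
    if any(t.coverage < 20 for t in treat):            # criterion (1)
        return False
    if any(c.coverage < 5 for c in ctrl):              # criterion (2)
        return False
    if any(t.c2t < 5 for t in treat):                  # criterion (3), every replicate
        return False
    if max(t.c2t for t in treat) < 10:                 # criterion (3), at least one replicate
        return False
    for t, c in zip(treat, ctrl):
        if t.rate - c.rate < 0.05:                     # criterion (4): Diff >= 5%
            return False
        if c.rate > 0 and t.rate / c.rate < 3:         # criterion (5): FC >= 3
            return False                               # (zero control rate passes: assumption)
        if c.rate > 0.05:                              # criterion (6)
            return False
    return all(m <= 6 for m in nearby_mismatches)      # criterion (7)


if __name__ == "__main__":
    treat = [SiteCounts(100, 30), SiteCounts(80, 20)]
    ctrl = [SiteCounts(50, 1), SiteCounts(40, 0)]
    print(is_m5c_site(treat, ctrl, nearby_mismatches=[2, 3]))
```

Applying the predicate per site and keeping only sites that pass in both samples corresponds to the final intersection step described above.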
To call m5C signals in KO samples, the final m5C list was used as a reference to look up the C-to-T rate at each site in the KO samples. The fold change and difference in C-to-T rate between KO and WT samples were then calculated. A site was considered enzyme-dependent when its C-to-T rate in the KO sample decreased to less than 60% of that in the WT sample and the C-to-T rate difference (treat sample - control sample) was > 1.5%. To produce the final list of KO-dependent sites, the lists from the two KO replicates were intersected, and only sites present in both were retained.
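A hedged sketch of the KO-dependence rule follows. The record does not fully specify which two rates enter the 1.5% difference; the code assumes it is the WT rate minus the KO rate at the same site, which is an interpretation and not a confirmed detail. The function names and the dictionary inputs (site identifier mapped to C-to-T rate as a fraction) are hypothetical.

```python
def is_ko_dependent(wt_rate, ko_rate, max_ratio=0.6, min_diff=0.015):
    """True if the KO rate falls below 60% of the WT rate and the drop exceeds 1.5%.

    Rates are fractions between 0 and 1. The difference is taken as
    wt_rate - ko_rate, which is an assumption about the record's wording.
    """
    if wt_rate <= 0:
        return False
    return ko_rate < max_ratio * wt_rate and (wt_rate - ko_rate) > min_diff


def ko_dependent_sites(wt_rates, ko_replicates):
    """wt_rates: dict mapping site -> WT rate for sites in the final m5C list.
    ko_replicates: list of dicts mapping site -> KO rate, one per KO replicate.
    Returns the set of sites called KO-dependent in every replicate.
    """
    per_replicate = [
        {
            site
            for site, wt in wt_rates.items()
            if site in ko and is_ko_dependent(wt, ko[site])
        }
        for ko in ko_replicates
    ]
    return set.intersection(*per_replicate) if per_replicate else set()


if __name__ == "__main__":
    wt = {"chr1:100": 0.30, "chr1:200": 0.20, "chr2:50": 0.10}
    ko1 = {"chr1:100": 0.05, "chr1:200": 0.15, "chr2:50": 0.02}
    ko2 = {"chr1:100": 0.10, "chr1:200": 0.05, "chr2:50": 0.09}
    print(ko_dependent_sites(wt, [ko1, ko2]))
```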
Assembly: hg19(UCSC)/mm10(UCSC)/tRNA-reference(tRNAdb)
Supplementary files format and content: tab-delimited text file listing the coverage and C-to-T signal for every C base
Supplementary files format and content: Excel file in XLS format
Library strategy: m5C-seq

Submission date

Sep 08, 2023

Last update date

Jul 11, 2024

Contact name

Xiaoyu Li

E-mail(s)

xiaoyu_li@zju.edu.cn

Organization name

Zhejiang University

Department

SCHOOL OF BASIC MEDICAL SCIENCES

Lab

Li Lab

Street address

866 Yuhangtang Road

City

Hangzhou

State/province

Zhejiang

ZIP/Postal code

310058

Country

China

Platform ID

GPL28330

Series (1)

GSE242724

Base-resolution m5C profiling across the mammalian transcriptome by bisulfite-free enzyme-assisted chemical labeling approach

Relations

BioSample

SAMN37327683

SRA

SRX21677413

Supplementary file	Size	Download	File type/resource
GSM7767711_D0_Ctrl_rep1.merge.mpmat.simple.tidy.txt.gz	217.7 Mb	(ftp)(http)	TXT
Raw data are available in SRA