GEO Accession viewer

Sample GSM4282755

Status

Public on Jan 12, 2021

Title

MEL_LRS_HBB_IVS110_3

Sample type

SRA

Source name

MEL cells

Organism

Mus musculus

Characteristics

cell type: Murine Erythroleukemia cells (MEL)
erythroid induction: 2% DMSO 5 days
purification: chromatin fractionation
genotype: HBB IVS-110 (G>A) integrated

Treatment protocol

To induce erythroid differentiation, MEL cells were diluted to 50,000 cells/ml in 10 ml fresh culture medium and incubated in growth conditions for 16 hours. DMSO was then added directly to the culture medium to a final concentration of 2% and incubated in growth conditions for 5 days.

Growth protocol

Murine Erythroleukemia cells (MEL) were maintained at 37°C and 5% CO2 in DMEM + Glutamax medium (GIBCO) containing 100 U/ml penicillin, 100 μg/ml streptomycin (GIBCO), and 10% fetal bovine serum (GIBCO).

Extracted molecule

total RNA

Extraction protocol

Cells were fractionated according to the protocol published by Pandya-Jones and Black (2009), with modifications to centrifugation speeds in order to retain intact nuclei. All steps were performed on ice, and all buffers contained 25 μM α-amanitin, 40 U/ml SUPERase.IN, and 1x Roche cOmplete protease inhibitor mix. Briefly, 20 million cells were rinsed once with PBS/1 mM EDTA, then lysed in 250 μl cytoplasmic lysis buffer (10 mM Tris-HCl pH 7.5, 0.05% NP40, 150 mM NaCl) by gently resuspending then incubating on ice for 5 minutes. Lysate was then layered on top of a 500 μl cushion of 24% sucrose in cytoplasmic lysis buffer and spun at 2000 rpm for 10 min at 4°C. The supernatant (cytoplasm fraction) was removed, and the pellet (nuclei) was rinsed once with 500 μl PBS/1 mM EDTA. Nuclei were resuspended in 100 μl nuclear resuspension buffer (20 mM Tris-HCl pH 8.0, 75 mM NaCl, 0.5 mM EDTA, 0.85 mM DTT, 50% glycerol) by gentle flicking, then lysed by the addition of 100 μl nuclear lysis buffer (20 mM HEPES pH 7.5, 1 mM DTT, 7.5 mM MgCl2, 0.2 mM EDTA, 0.3 M NaCl, 1 M Urea, 1% NP-40), vortexed for 2 x 2 seconds, then incubated on ice for 3 min. Chromatin was pelleted by spinning at 14,000 rpm for 2 min at 4°C. The supernatant (nucleoplasm fraction) was removed, and the chromatin was rinsed once with PBS/1 mM EDTA. Chromatin was immediately dissolved in 100 μl PBS and 300 μl TRIzol Reagent (ThermoFisher). RNA was purified from chromatin pellets in TRIzol Reagent (ThermoFisher) using the RNeasy Mini kit (Qiagen) according to the manufacturer’s protocol, including the on-column DNase I digestion.
A DNA adapter (/5rApp/NNNNNCTGTAGGCACCATCAAT/3ddC/) was ligated to 3′ ends of nascent RNA using the T4 RNA ligase kit (NEB) by mixing 50 pmol adapter with 300-600 ng nascent RNA. Custom RT primers were used to add barcodes during reverse transcription with SuperScript III (SSIII) reverse transcriptase (ThermoFisher; Table S2). cDNA was amplified by 26 cycles of PCR with the Advantage 2 PCR Kit (Clontech), using custom gene-specific forward primers complementary to a unique region in the 5′UTR of the human HBB gene in combination with the kit IIA primer. PCR amplicons were cleaned up with a 2X volume of AMPure beads (Agencourt), and PacBio library preparation was performed at the Icahn School of Medicine at Mt. Sinai Genomics Core Facility using the SMRTbell Template Prep Kit 1.0 (Pacific Biosciences).

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Sequel

Description

nascent RNA

Data processing

For genome-wide LRS: Circular consensus sequence (CCS) reads were generated in FASTQ format, and Porechop was used to separate chimeric reads and trim external adapters with the SMARTer IIA sequences AAGCAGTGGTATCAACGCAGAGTAC and GTACTCTGCGTTGATACCACTGCTT with settings --extra_end_trim 0 --extra_middle_trim_good_side 0 --extra_middle_trim_bad_side 0 --min_split_read_size 100. Cutadapt was used to remove the unique 3′ end adapter on all reads in two rounds of filtering. First, reads with the adapter at the 3′ end were trimmed with settings -a CTGTAGGCACCATCAAT -e 0.1 -m 15 --untrimmed-output=untrimmed.fastq, and reads that did not contain the full adapter were retained and their reverse complement was generated. Then, a second round of filtering with cutadapt using the settings -a CTGTAGGCACCATCAAT -e 0.1 -m 15 --discard-untrimmed was used to remove adapters from the reverse complement reads, and reads without the 3′ adapter were discarded. This ensures that each read contains a successfully ligated 3′ adapter, which marks the RNAPII position, and, since sequencing occurs in forward and reverse orientations at random, it places all reads in the correct 5′ to 3′ orientation. Reads from the two adapter trimming steps were combined into a single file, then Prinseq-lite was used to remove PCR duplicates with settings -derep 1. Prinseq-lite was used again to trim 6 non-templated nucleotides added at the 5′ end by the strand-switching reverse transcriptase and the 5 nucleotides of the 3′ end adapter UMI with settings -trim_left 6 -trim_right 5. Reads were then mapped to the mm10 genome using minimap2 with settings -ax splice -uf -C5 --secondary=no, and the resulting SAM files were converted to BAM and BED files using samtools and bedtools. Reads overlapping the 7SK genomic region (chr9:78175302-78175633 in the mm10 genome) were filtered using samtools before all downstream analyses.
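The two-round adapter logic above can be illustrated with a minimal Python sketch. This is not the cutadapt implementation: it uses exact string matching instead of cutadapt's error-tolerant alignment (-e 0.1), and the function names are hypothetical. It only shows how requiring the 3′ adapter both filters reads and fixes their orientation.

```python
ADAPTER = "CTGTAGGCACCATCAAT"  # 3' end adapter sequence from the protocol
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def orient_read(seq, min_len=15):
    """Trim the 3' adapter and return the read in 5'-to-3' orientation.

    Round 1: look for the adapter in the read as sequenced.
    Round 2: if absent, look for it in the reverse complement.
    Returns None when no adapter is found in either orientation, or when
    the trimmed read is shorter than min_len (mirroring -m 15).
    Simplification: exact matching only, unlike cutadapt.
    """
    idx = seq.find(ADAPTER)
    if idx == -1:
        seq = revcomp(seq)
        idx = seq.find(ADAPTER)
        if idx == -1:
            return None  # no ligated adapter: read is discarded
    trimmed = seq[:idx]  # drop the adapter and everything after it
    return trimmed if len(trimmed) >= min_len else None


if __name__ == "__main__":
    insert = "ACGTTGCAACGTTGCAGGCC" + "ACGTA"  # RNA-derived sequence + 5-nt UMI
    read = insert + ADAPTER + "GG"
    print(orient_read(read) == insert)
    print(orient_read(revcomp(read)) == insert)
```

Reads returned by this sketch still carry the 5-nt UMI at their 3′ end, which the protocol removes later with Prinseq-lite.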
Mapped reads in SAM format were filtered to remove reads that contained a polyA tail using a custom script (available at https://github.com/kirstenreimer/MEL_LRS). Briefly, mapped reads that had soft-clipped bases at the 3′ end were discarded if the soft-clipped region of the read contained 4 or more A’s and the fraction of A’s was greater than 0.9. Similarly, reads with soft-clipped bases at the 5′ end (resulting from minus strand reads) containing at least 4 T’s and having a fraction of T’s greater than 0.9 were discarded.
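As a rough illustration of the polyA criteria just described, the following Python sketch applies them to a CIGAR string and read sequence taken from a SAM record. It is not the authors' script (that is in the linked repository); the function names are hypothetical, and it assumes that the "3′ end" clip corresponds to the trailing soft clip of the SAM sequence and the "5′ end" clip to the leading one.

```python
import re

_CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def soft_clips(cigar, seq):
    """Return (leading, trailing) soft-clipped sequence from a SAM record."""
    ops = [(int(n), op) for n, op in _CIGAR_OP.findall(cigar)]
    ops = [o for o in ops if o[1] != "H"]  # hard clips are absent from SEQ
    leading = trailing = ""
    if ops and ops[0][1] == "S":
        leading = seq[:ops[0][0]]
    if len(ops) > 1 and ops[-1][1] == "S":
        trailing = seq[len(seq) - ops[-1][0]:]
    return leading, trailing


def is_homopolymer_clip(clip, base, min_count=4, min_fraction=0.9):
    """True if clip has at least min_count of base and fraction > min_fraction."""
    if not clip:
        return False
    count = clip.count(base)
    return count >= min_count and count / len(clip) > min_fraction


def has_polya_tail(cigar, seq):
    """Apply the A-rich trailing clip / T-rich leading clip criteria."""
    leading, trailing = soft_clips(cigar, seq)
    return is_homopolymer_clip(trailing, "A") or is_homopolymer_clip(leading, "T")


if __name__ == "__main__":
    print(has_polya_tail("10M6S", "ACGTACGTAC" + "AAAAAA"))
    print(has_polya_tail("10M6S", "ACGTACGTAC" + "ACGTAC"))
```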
For HBB-targeted LRS: Porechop was used on raw FASTQ reads to remove external adapters and separate chimeric reads with the common forward sequence and the SMARTer IIA reverse sequence GACGTGTGCTCTTCCGATCT and GTACTCTGCGTTGATACCACTGCTT (as well as the reverse complement sequences) with settings --extra_end_trim 0 --extra_middle_trim_good_side 0 --extra_middle_trim_bad_side 0 --min_split_read_size 100 --middle_threshold 75. Reads were filtered and trimmed if they contained the 3′ end adapter as described above using the 3′ end adapter sequence plus the barcode sequence. Prinseq was used to demultiplex and trim reads as above, then cleaned FASTQ files were mapped to a custom annotation of the integrated HBB locus (available at https://github.com/kirstenreimer/MEL_LRS), which is based on the GLOBE vector (Miccio et al., 2008). Based on empirical observation, additional parameters were added to the above criteria for removing polyA-containing reads from targeted data mapped to the HBB locus. Since the HBB locus is integrated randomly in the MEL genome, long readthrough transcripts that extend past the annotated HBB locus read into random genomic regions and produce long stretches of mismatched soft-clipped bases. A custom script was used to filter polyA-containing reads but retain readthrough transcripts (available at https://github.com/kirstenreimer/MEL_LRS). Briefly, reads were discarded if: (1) they contained a fraction of A’s or T’s greater than 0.7 in a soft-clipped region that starts past the end of the HBB locus annotation; (2) they contained a fraction of A’s or T’s greater than 0.7 and 4 or more A’s or T’s in a soft-clipped region starting within 50 nucleotides of the annotated polyA site; or (3) they contained a stretch of soft-clipped bases longer than 20 nucleotides that starts within the annotated HBB gene.
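The discard rules for targeted reads can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' script: the LocusAnnotation fields and the coordinates in the usage example are hypothetical, "fraction of A’s or T’s" is interpreted as the A fraction or the T fraction taken separately, and only a single soft-clipped region per read is considered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LocusAnnotation:
    """Hypothetical coordinates on the custom HBB locus reference."""
    gene_start: int
    gene_end: int
    polya_site: int
    locus_end: int


def top_base_stats(clip):
    """Return (count, fraction) for whichever of A or T is more frequent."""
    if not clip:
        return 0, 0.0
    count = max(clip.count("A"), clip.count("T"))
    return count, count / len(clip)


def discard_read(clip_seq, clip_start, ann):
    """Return True if a read with this soft clip should be discarded.

    clip_seq is the soft-clipped sequence and clip_start is the reference
    position at which the soft clip begins (both assumed to be precomputed).
    """
    count, fraction = top_base_stats(clip_seq)
    # Rule 1: A/T-rich clip that starts past the end of the locus annotation.
    if clip_start > ann.locus_end and fraction > 0.7:
        return True
    # Rule 2: A/T-rich clip with >= 4 A's or T's near the annotated polyA site.
    if abs(clip_start - ann.polya_site) <= 50 and fraction > 0.7 and count >= 4:
        return True
    # Rule 3: long soft clip that starts inside the annotated gene.
    if ann.gene_start <= clip_start <= ann.gene_end and len(clip_seq) > 20:
        return True
    return False


if __name__ == "__main__":
    ann = LocusAnnotation(gene_start=100, gene_end=1700, polya_site=1700, locus_end=2000)
    # Readthrough transcript clipped past the locus with mixed sequence: kept.
    print(discard_read("ACGTACGTAC", 2050, ann))
    # A-rich clip near the polyA site: discarded.
    print(discard_read("AAAAA", 1720, ann))
```

Because rule 1 only tests base composition, a readthrough read whose clipped bases are mismatched genomic sequence rather than a polyA tail passes the filter, which matches the stated goal of retaining readthrough transcripts.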
For PRO-seq: Cutadapt was used to trim paired-end reads to 40 nt, removing adapter sequence and low quality 3′ ends, and discarding reads that were shorter than 20 nucleotides (-m20 -q 1). Additionally, in order to align reads using Bowtie, one nucleotide was removed from the 3′ end of all trimmed reads. Trimmed paired-end reads were first mapped to the dm3 reference genome using Bowtie, and reads that mapped uniquely to the dm3 genome were used to determine percent spike-in return across all samples. Paired-end reads that failed to align to the dm3 genome were mapped to the mm10 reference genome. Read alignment to the dm3 and mm10 genomes was performed with settings -k1 -v2 --best -X1000 --un. SAM files were sorted using samtools. Read pairs uniquely aligned to the mm10 genome were separated, and strand-specific single nucleotide bedGraphs of the 3′ end mapping positions, corresponding to the biotinylated RNA 3′ end, were generated. Due to the “forward/reverse” orientation of Illumina paired-end sequencing, “+” and “-” stranded bedGraph files were switched at the end of the pipeline (Mahat et al., 2016). bedGraph files across replicates in each cell treatment were merged by summing the read counts per nucleotide position.
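The final replicate-merging step, summing read counts per nucleotide position, might look like the following minimal Python sketch. The function name and the in-memory input format are assumptions for illustration; the sketch expects four-column bedGraph lines (chrom, start, end, value) and is not the pipeline used for this submission.

```python
from collections import defaultdict


def merge_bedgraphs(replicates):
    """Sum bedGraph values per nucleotide position across replicates.

    replicates: iterable of iterables of bedGraph lines, one per replicate.
    Returns a sorted list of (chrom, start, end, value) single-nucleotide
    intervals. Intervals wider than 1 nt are expanded position by position.
    """
    totals = defaultdict(float)
    for lines in replicates:
        for line in lines:
            line = line.strip()
            # Skip blank lines and header/comment lines.
            if not line or line.startswith(("track", "browser", "#")):
                continue
            chrom, start, end, value = line.split()[:4]
            for pos in range(int(start), int(end)):
                totals[(chrom, pos)] += float(value)
    return [(chrom, pos, pos + 1, totals[(chrom, pos)])
            for chrom, pos in sorted(totals)]


if __name__ == "__main__":
    rep1 = ["chr1\t10\t11\t2", "chr1\t12\t13\t1"]
    rep2 = ["chr1\t10\t11\t3", "chr2\t5\t6\t4"]
    for row in merge_bedgraphs([rep1, rep2]):
        print(*row, sep="\t")
```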
Genome_build: mm10 or custom HBB locus
Supplementary_files_format_and_content: [bedGraph files]: PRO-seq reads mapped to the mm10 genome containing combined counts of end 1 3' mapping locations for all replicates in a given induction condition.
Supplementary_files_format_and_content: [coSE_values_per_intron.txt]: coSE values in uninduced and induced conditions for introns in the mm10 genome which have coverage of at least 10 reads (spliced + unspliced).
Supplementary_files_format_and_content: [NIC_and_SSscore_per_intron.txt]: NIC values and intron length, GC content, and splice site sequences for introns in the mm10 genome which have coverage of at least 10 reads (spliced + intermediates).

Submission date

Jan 24, 2020

Last update date

Jan 12, 2021

Contact name

Karla M. Neugebauer

E-mail(s)

karla.neugebauer@yale.edu

Organization name

Yale University

Department

Molecular Biophysics & Biochemistry