GEO Accession viewer

Sample GSM5942231

Status

Public on Jul 29, 2022

Title

Expt8_293TmiR124oe_input_total_rep2

Sample type

SRA

Source name

HEK293T cells

Organism

Homo sapiens

Characteristics

cell line: HEK293T

Treatment protocol

None

Extracted molecule

total RNA

Extraction protocol

For each 10 cm plate (~15 million cells), cells were washed once with cold phosphate buffered saline, and then UV crosslinked (254 nm, 400 mJ/cm²) on ice. Cells were then spun down, supernatant removed, and washed with cold phosphate buffered saline. Cell pellets were flash frozen on dry ice and stored at -80°C.
Chimeric eCLIP was based on the previously described seCLIP protocol (Van Nostrand et al., 2016 & 2017) with modifications to enhance chimera formation described below. As in eCLIP, lysis was performed in eCLIP lysis buffer, followed by sonication and digestion with RNase I (Ambion). Immunoprecipitation of AGO2-RNA complexes was achieved with a primary mouse monoclonal Ago2 antibody (eIF2C2 (4F9) Santa Cruz, 4°C overnight) using magnetic beads pre-coupled to the secondary antibody (M-280 Sheep Anti-Mouse IgG Dynabeads, Thermo Fisher 11202D). Initial experiments used standard eCLIP conditions (10 µg of antibody and 125 µl of Dynabeads for 20×10^6 cells), but most experiments used decreased antibody and increased bead amounts based on the trend of decreased cross-species chimeras in those conditions (see Sup. Fig. 2X and Sup. Table X). Where indicated, 2% of each immunoprecipitated (IP) sample was saved as input control. For human/rat mixing experiments, cell pellets were lysed, sonicated, and RNase digested separately, and then mixed during addition of antibody and beads prior to overnight incubation. To phosphorylate the cleaved mRNA 5'-ends, beads were washed and treated with T4 polynucleotide kinase (PNK, 3'-phosphatase minus, NEB) and 1 mM ATP. Chimeric ligation was then performed on-bead at room temperature for one hour with T4 RNA Ligase I (NEB) and 1 mM ATP in a 150 µl total volume. As in seCLIP, samples were then dephosphorylated with alkaline phosphatase (FastAP, Thermo Fisher) and T4 PNK (NEB), and an RNA adapter was ligated to the 3'-ends of the mRNA fragments (T4 RNA Ligase, NEB). With-gel chimeric-eCLIP IP and input samples were then denatured with 1X NuPage buffer (Life Technologies) and DTT, run on 4%–12% Bis-Tris protein gels and transferred to nitrocellulose membranes. The region corresponding to bands at the appropriate Ago2 protein size plus 75 kDa was excised and treated with Proteinase K (NEB) to isolate RNA, which was column purified (Zymo).
No-gel chimeric eCLIP samples were treated directly with Proteinase K (NEB) to isolate RNA and column purified (Zymo). For both methods, RNA was then reverse transcribed with SuperScript IV Reverse Transcriptase (Invitrogen), 3 mM manganese chloride (to encourage read-through of crosslink sites), and 0.1 M DTT. Following reverse transcription, samples were treated as in seCLIP, including treatment with ExoSAP-IT (Affymetrix) to remove excess oligonucleotides, hydrolysis with sodium hydroxide (to degrade RNA) and addition of hydrochloric acid (to balance pH). A 5' Illumina DNA adapter (/5Phos/NNNNNNNNNNAGATCGGAAGAGCGTCGTGT/3SpC3) was then ligated to the 3'-end of cDNA fragments with T4 RNA Ligase (NEB), and after bead purification (Dynabeads MyOne Silane, Thermo Fisher), qPCR was performed on an aliquot of each sample to identify the proper number of PCR cycles. The remainder of the sample was PCR amplified with barcoded Illumina-compatible primers (Q5, NEB) based on qPCR quantification and size selected using AMPure XP beads (Beckman). Libraries were quantified using Agilent TapeStation and sequenced on the Illumina HiSeq or NovaSeq platform.

Library strategy

OTHER

Library source

transcriptomic

Library selection

other

Instrument model

Illumina NovaSeq 6000

Description

At26
chimeric eCLIP

Data processing

Library strategy: eCLIPseq
Demultiplexing (IPs and inputs): Remove 5' UMIs from the sequencing reads and append them to the read name for PCR deduplication further downstream. umi_tools extract --stdin=R_Chi_2_1g.fastq.gz --bc-pattern=NNNNNNNNNN --log=R_Chi_2_1g.processed.log --stdout R_Chi_2_1g.umi.r1.fq.gz
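
To make the UMI step concrete, the following minimal Python sketch shows what moving a 10-nt 5' UMI into the read name amounts to. It is an illustration only, not the umi_tools implementation: the function name `extract_umi` and the example record are hypothetical, and the underscore-joined UMI suffix is modeled on how umi_tools labels read identifiers.

```python
def extract_umi(name, seq, qual, umi_len=10):
    """Move the first umi_len bases of a read into its name.

    Hypothetical helper for illustration; the real pipeline uses
    `umi_tools extract --bc-pattern=NNNNNNNNNN` for this step.
    Returns (new_name, trimmed_seq, trimmed_qual).
    """
    if len(seq) < umi_len:
        raise ValueError("read is shorter than the UMI pattern")
    umi = seq[:umi_len]
    # Attach the UMI to the read identifier (the first whitespace-delimited
    # token) and leave any trailing comment untouched.
    read_id, _, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}" + (f" {comment}" if comment else "")
    return new_name, seq[umi_len:], qual[umi_len:]


if __name__ == "__main__":
    # Made-up record: 10-nt UMI followed by a 12-nt insert.
    print(extract_umi("read1 1:N:0", "ACGTACGTACTTTTGGGGCCCC", "I" * 22))
```

Keeping the UMI in the read name is what lets the later deduplication step compare UMIs after the sequence itself has been trimmed and aligned.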
Fastqc round 1: Run and examine by eye to make sure libraries look alright. fastqc R_Chi_2_1g.umi.r1.fq.gz
Cutadapt: Takes output from demultiplexed files. Run to trim off 3' adapters. cutadapt --match-read-wildcards --times 3 -e 0.1 -O 1 --quality-cutoff 6 -m 18 -a AGATCGGAAG -a GATCGGAAGA -a ATCGGAAGAG -a TCGGAAGAGC -a CGGAAGAGCA -a GGAAGAGCAC -a GAAGAGCACA -a AAGAGCACAC -a AGAGCACACG -a GAGCACACGT -a AGCACACGTC -a GCACACGTCT -a CACACGTCTG -a ACACGTCTGA -a CACGTCTGAA -a ACGTCTGAAC -a CGTCTGAACT -a GTCTGAACTC -a TCTGAACTCC -a CTGAACTCCA -a TGAACTCCAG -a GAACTCCAGT -a AACTCCAGTC -a ACTCCAGTCA -o R_Chi_2_1g.umi.r1.fqTr.gz R_Chi_2_1g.umi.r1.fq.gz > R_Chi_2_1g.umi.r1.fqTr.metrics
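
The 24 `-a` arguments in the cutadapt command are consecutive 10-mers that slide one base at a time along a single adapter. The sketch below regenerates that list in Python. The full 33-nt sequence in `ADAPTER` is an assumption reconstructed by overlapping the tiles listed in the command, and `tile_adapter` is a hypothetical helper name.

```python
def tile_adapter(adapter, k=10):
    """Return every k-mer of the adapter, sliding one base at a time."""
    return [adapter[i:i + k] for i in range(len(adapter) - k + 1)]


# Assumption: full adapter inferred by overlapping the 24 tiles in the command.
ADAPTER = "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA"

if __name__ == "__main__":
    tiles = tile_adapter(ADAPTER)
    # Print the tiles in the form used on the cutadapt command line.
    print(" ".join(f"-a {tile}" for tile in tiles))
```

Tiling the adapter this way lets cutadapt recognize adapter fragments that begin partway into the adapter sequence.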
Trim 3' UMI: Trim the 9 or 10 nucleotide UMI from the 3' end of the read. cutadapt -u -9 -o R_Chi_2_1g.umi.r1.fqTrTr.fq.gz R_Chi_2_1g.umi.r1.fq.gz > R_Chi_2_1g.umi.r1.fqTrTr.metrics
Sort fastq: fastq-sort --id R_Chi_2_1g.umi.r1.fqTrTr.fq > R_Chi_2_1g.umi.r1.fqTrTr.sorted.fq
Convert to fasta. seqtk seq -A R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.fq > R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.fa
Collapse: Collapse unique sequences for reverse Bowtie mapping. fasta2collapse.pl R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.fa R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.fa
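
As a rough illustration of the collapse step, the sketch below reduces a set of reads to unique sequences with copy counts, so that each distinct sequence needs to be indexed and searched only once. It does not reproduce fasta2collapse.pl or its output naming; `collapse_sequences` and the example reads are hypothetical.

```python
from collections import Counter


def collapse_sequences(records):
    """Collapse (name, sequence) pairs into unique sequences with counts.

    Returns a list of (sequence, count) tuples, most abundant first,
    with ties broken alphabetically so the order is deterministic.
    """
    counts = Counter(seq for _, seq in records)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Made-up reads: two copies of one sequence and one singleton.
    reads = [("a", "ACGT"), ("b", "ACGT"), ("c", "TTTT")]
    for seq, count in collapse_sequences(reads):
        print(seq, count)
```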
Map miRNA to collapsed (unique) sequences. bowtie -a -e 35 -f -l 8 -n 1 -p 8 R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.bowtie_index mature.hsa.fa R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.tsv 2> R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.tsv.log
Select the "best match" positive strand alignments with the least number of mismatches. collapse_bowtie_results.py --bowtie_align R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.tsv --out_file R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.filtered.tsv
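
One way to read "best match" is sketched below: from bowtie's default tab-separated output, keep for each reference sequence the plus-strand hit with the fewest mismatches. This is an assumption about the selection rule, not the code of collapse_bowtie_results.py; the function name, the tie-breaking order, and the example rows are hypothetical. The column positions follow bowtie's default output (read name, strand, reference name, offset, sequence, qualities, other-alignment count, mismatch descriptors).

```python
def best_plus_strand_hits(bowtie_tsv):
    """Pick, per reference, the '+' strand hit with the fewest mismatches.

    Returns {reference: (mismatches, query, offset)}. Ties are broken by
    query name and then offset, an arbitrary choice made for this sketch.
    """
    best = {}
    for line in bowtie_tsv.splitlines():
        if not line:
            continue
        fields = line.split("\t")
        query, strand, reference, offset = fields[0], fields[1], fields[2], int(fields[3])
        if strand != "+":
            continue
        # Column 8 lists mismatches as comma-separated descriptors; it is
        # empty (or absent) for a perfect match.
        descriptors = fields[7] if len(fields) > 7 else ""
        mismatches = len(descriptors.split(",")) if descriptors else 0
        hit = (mismatches, query, offset)
        if reference not in best or hit < best[reference]:
            best[reference] = hit
    return best


if __name__ == "__main__":
    example = "\n".join([
        "mirA\t+\tseq1\t0\tACGT\tIIII\t0\t1:A>G,3:C>T",
        "mirB\t+\tseq1\t2\tACGT\tIIII\t0\t",
        "mirC\t-\tseq1\t0\tACGT\tIIII\t0\t",
    ])
    print(best_plus_strand_hits(example))
```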
Uncollapse: Takes output from "best match" alignments. For each unique sequence that also maps to a miRNA, recover reads from uncollapsed fasta and identify the mappable mRNA component of these chimeric candidates. find_candidate_chimeric_seqs_from_mir_alignments.py --bowtie_align R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.filtered.tsv --fa_file R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.fa --metrics_file R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.filtered.tsv.metrics --out_file R_Chi_2_1g.umi.r1.fqTrTr.fq.sorted.collapsed.filtered.tsv.chimeric_candidates.fa
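
The idea of recovering the mRNA component can be sketched as follows: given where the miRNA aligned within a read, take the portion downstream of the match as the candidate target. This is a simplified illustration, not the logic of find_candidate_chimeric_seqs_from_mir_alignments.py. It assumes the miRNA sits upstream of the target in the read, and the 18-nt minimum is borrowed from the `-m 18` cutadapt setting purely as a placeholder threshold; `split_chimera` is a hypothetical name.

```python
def split_chimera(read_seq, mir_start, mir_len, min_target_len=18):
    """Return the read portion downstream of the miRNA match, or None.

    mir_start is the 0-based offset of the miRNA alignment within the read.
    Candidates shorter than min_target_len (an assumed cutoff) are dropped.
    """
    target = read_seq[mir_start + mir_len:]
    return target if len(target) >= min_target_len else None


if __name__ == "__main__":
    # Made-up 22-nt "miRNA" followed by a made-up 20-nt "target".
    read = "ACGTACGTACGTACGTACGTAC" + "TTGGCCAATTGGCCAATTGG"
    print(split_chimera(read, 0, 22))
```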
Trim miRNA: For PCR-enriched samples only, trim the miRNA primer sequence from the 5' end of the read. cutadapt -g file:0096_miRNA_W_S25.miRNA_seq.fa --match-read-wildcards -O 10 -m 18 -e 0.05 -o 0096_miRNA_W_S25.miRNATrim.fastq.gz 0096_miRNA_W_S25.umi.fqTrTr.fq.gz > 0096_miRNA_W_S25.miRNATrim.metrics
STAR rmRep: Takes output from trimming 3' UMIs (non-chimeric eCLIP), from chimeric candidate sequences (chimeric eCLIP), or from trimming the miRNA primer for miR-486-enriched samples (PCR-enriched chimeric eCLIP). Maps to a repetitive element database, which is used to remove repetitive elements and helps control for spurious artifacts from rRNA (and other) repetitive reads. STAR --runMode alignReads --runThreadN 8 --genomeDir /path/to/repetitive/elements/genome/ --readFilesIn R_Chi_2_1g.adapterTrim.round2.fastq.gz --outSAMunmapped Within --outFilterMultimapNmax 30 --outFilterMultimapScoreRange 1 --outFileNamePrefix R_Chi_2_1g.adapterTrim.round2.rep.bam --outSAMattributes All --readFilesCommand gunzip -c --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > R_Chi_2_1g.adapterTrim.round2.rep.bam
Samtools view and count_aligned_from_sam: Takes output from STAR rmRep. Counts the number of reads mapping to each repetitive element. samtools view R_Chi_2_1g.adapterTrim.round2.rep.bam | count_aligned_from_sam.py > R_Chi_2_1g.adapterTrim.round2.rmRep.metrics
Fastqc round 2: Takes output from STAR rmRep. Runs a second round of fastqc to verify that the data are still usable after read grooming. fastqc R_Chi_2_1g.adapterTrim.round2.rep.bamUnmapped.out.mate1
STAR genome mapping: Takes output from STAR rmRep. Maps unique reads to the appropriate genome. STAR --runMode alignReads --runThreadN 8 --genomeDir /path/to/genome/ --readFilesIn R_Chi_2_1g.adapterTrim.round2.rep.bamUnmapped.out.mate1 --outSAMunmapped Within --outFilterMultimapNmax 1 --outFilterMultimapScoreRange 1 --outFileNamePrefix R_Chi_2_1g.adapterTrim.round2.rmRep.bam --outSAMattributes All --outStd BAM_Unsorted --outSAMtype BAM Unsorted --outFilterType BySJout --outReadsUnmapped Fastx --outFilterScoreMin 10 --outSAMattrRGline ID:foo --alignEndsType EndToEnd > R_Chi_2_1g.adapterTrim.round2.rmRep.bam
Sort bam file: Takes the uniquely mapped genome data and sorts it. samtools sort R_Chi_2_1g.adapterTrim.round2.rmRep.bam -o R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.bam
Samtools index: Takes output from sort bam. Makes a bam index for use downstream. samtools index R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.bam
Umi_tools dedup: Takes the sorted and indexed output from STAR genome mapping. Performs PCR duplicate removal. umi_tools dedup --method=unique -I R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.bam --log=R_Chi_2_1g.dedup.log -S R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.bam
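
With `--method=unique`, reads count as PCR duplicates only when they share both a mapping position and an identical UMI. The sketch below illustrates that grouping in plain Python. It is a simplification and not the umi_tools code: alignments are represented as hypothetical dictionaries, the UMI is read from the last underscore-separated part of the read name, and the first read seen in each group is kept, whereas the real tool applies its own rules for choosing the representative read.

```python
def dedup_unique(alignments):
    """Keep one alignment per (chrom, strand, pos, UMI) group.

    Each alignment is a dict with 'name', 'chrom', 'strand' and 'pos' keys
    (a made-up representation for this sketch). The UMI is taken from the
    text after the last underscore in the read name.
    """
    seen = set()
    kept = []
    for aln in alignments:
        umi = aln["name"].rsplit("_", 1)[-1]
        key = (aln["chrom"], aln["strand"], aln["pos"], umi)
        if key not in seen:
            seen.add(key)
            kept.append(aln)
    return kept


if __name__ == "__main__":
    reads = [
        {"name": "r1_AAAA", "chrom": "chr1", "strand": "+", "pos": 100},
        {"name": "r2_AAAA", "chrom": "chr1", "strand": "+", "pos": 100},
        {"name": "r3_CCCC", "chrom": "chr1", "strand": "+", "pos": 100},
    ]
    print([aln["name"] for aln in dedup_unique(reads)])
```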
Sort bam file: Takes the deduplicated, uniquely mapped genome data and sorts it. samtools sort R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.bam -o R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam
Samtools index: Takes output from sort bam. Makes a bam index for use downstream. samtools index R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam
Make_bigwig_files.py: Takes input from samtools view. Makes bw files to be uploaded to the genome browser or for other visualization. python make_bigwig_files.py --bam R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam --genome /path/to/chrom/sizes/file --bw_pos R_Chi_2_1g.CombinedID.merged.r2.norm.pos.bw --bw_neg R_Chi_2_1g.CombinedID.merged.r2.norm.neg.bw
Clipper: Takes results from sort bam. Calls peaks on those files. clipper -b R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam -s {species} -o R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.peaks.bed
Fix_scores.py: Takes input from clipper. Fixes p-values to be bed compatible. python fix_scores.py --bed R_Chi_2_1g.adapterTrim.round2.rmRep.sorted.rmDup.sorted.peaks.bed --out_file R_Chi_2_1g.CombinedID.merged.r2.peaks.fixed.bed
bedToBigBed: Converts bed file to bigBed file for uploading to the genome browser. bedToBigBed R_Chi_2_1g.CombinedID.merged.r2.peaks.fixed.bed /path/to/chrom/sizes/file R_Chi_2_1g.CombinedID.merged.r2.peaks.fixed.bb -type=bed6+4
Peak normalization vs paired SMInput datasets is run as a second processing pipeline (Peak_input_normalization_wrapper.py). Input files for the normalization pipeline include .bam and .peak.bed files (generated through the pipeline above), as well as a tab-delimited manifest file pairing eCLIP datasets with their paired SMInput datasets as follows:
uID \t RBP \t Cell line \t CLIP_rep1 \t CLIP_rep2 \t INPUT
001 \t RBP1 \t HepG2 \t /full/path/to/{sample_rep1}.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam \t /full/path/to/{sample_rep2}.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam \t /full/path/to/{sample_input}.adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam
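
A manifest in the layout described above could be produced with a short script such as the sketch below. The six column names come from the description; whether the wrapper expects the header row to be present is an assumption, and `write_manifest` and the example paths are hypothetical placeholders.

```python
import csv
import io

COLUMNS = ["uID", "RBP", "Cell line", "CLIP_rep1", "CLIP_rep2", "INPUT"]
BAM_SUFFIX = ".adapterTrim.round2.rmRep.sorted.rmDup.sorted.bam"


def write_manifest(rows, handle):
    """Write a tab-delimited manifest: a header line, then one row per dataset."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(COLUMNS)  # assumption: the header row is part of the file
    for row in rows:
        if len(row) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(row)}")
        writer.writerow(row)


if __name__ == "__main__":
    # Placeholder paths; substitute the real locations of the bam files.
    row = ["001", "RBP1", "HepG2"] + [
        f"/full/path/to/{sample}{BAM_SUFFIX}"
        for sample in ("sample_rep1", "sample_rep2", "sample_input")
    ]
    buffer = io.StringIO()
    write_manifest([row], buffer)
    print(buffer.getvalue(), end="")
```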
Assembly: UCSC version GRCh38/hg38 (human) and GRCm38/mm10 (mouse)
Supplementary files format and content: *.umi.r1.fqTrTr.fq.sorted.eclip.genome-mappedSoSo.rmDupSo.norm.pos.bw contains normalized RPM densities for non-chimeric eCLIP reads.
Supplementary files format and content: *.umi.r1.fqTrTr.fq.sorted.genome-mappedSoSo.rmDupSo.norm.*.bw correspond to normalized RPM densities for chimeric reads.
Supplementary files format and content: .normed.compressed.sorted.exclude-regions.bed files contain CLIPper peaks from all eCLIP reads normalized to size-matched input.
Supplementary files format and content: pcr_enriched.*.rmDupSo.peakClusters.bed files contain CLIPper peaks from PCR-enriched chimeric reads.
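
For readers unfamiliar with the RPM (reads per million) scaling mentioned for the bigWig files, the arithmetic is sketched below. Which read total the pipeline divides by (for example, deduplicated uniquely mapped reads) is not stated in this record, so `total_reads` is left as a caller-supplied assumption and `rpm` is a hypothetical helper.

```python
def rpm(count, total_reads):
    """Scale a raw read count to reads per million."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return count * 1_000_000 / total_reads


if __name__ == "__main__":
    # Made-up numbers: 5 reads at a position in a 2-million-read library.
    print(rpm(5, 2_000_000))
```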

Submission date

Mar 09, 2022

Last update date

Jul 30, 2022

Contact name

Gene Yeo

E-mail(s)

geneyeo@ucsd.edu

Organization name

UCSD

Street address

2880 Torrey Pines Scenic Dr. Room 3805/Yeo Lab

City

La Jolla

State/province

ZIP/Postal code

92037

Country

USA

Platform ID

GPL24676

Series (2)

GSE198250	High depth profiling of miRNA targets with chimeric eCLIP [eCLIPseq]
GSE198251	High depth profiling of miRNA targets with chimeric eCLIP

Relations

BioSample

SAMN26542896

SRA

SRX14418864

Supplementary file	Size	Download	File type/resource
GSM5942231_Expt8_293TmiR124oe_input_total_rep2.sorted.eclip.genome-mappedSoSo.rmDupSo.norm.neg.bw	20.8 Mb	(ftp)(http)	BW
GSM5942231_Expt8_293TmiR124oe_input_total_rep2.sorted.eclip.genome-mappedSoSo.rmDupSo.norm.pos.bw	21.7 Mb	(ftp)(http)	BW
GSM5942231_Expt8_293TmiR124oe_input_total_rep2.sorted.genome-mappedSoSo.rmDupSo.norm.neg.bw	70.2 Kb	(ftp)(http)	BW
GSM5942231_Expt8_293TmiR124oe_input_total_rep2.sorted.genome-mappedSoSo.rmDupSo.norm.pos.bw	73.7 Kb	(ftp)(http)	BW
Raw data are available in SRA
Processed data provided as supplementary file