Status |
Public on Dec 31, 2020 |
Title |
meCLIP_MDAMB231_rep2_IP |
Sample type |
SRA |
|
|
Source name |
breast cancer cell
|
Organism |
Homo sapiens |
Characteristics |
cell line: MDA-MB-231
cell type: breast cancer cell
genotype: overexpressing lncRNA HOTAIR
rip antibody: anti-m6A antibody (Abcam Ab151230)
|
Extracted molecule |
total RNA |
Extraction protocol |
Cells were cultured in appropriate media and total RNA was isolated using TRIzol (15596018, Invitrogen). Poly(A) selection was performed using the Magnosphere® Ultrapure mRNA Purification Kit (9186, Takara). RNA was fragmented at 70 °C for ~3 min using RNA Fragmentation Reagents (AM8740, Ambion). A small amount (~500 ng) of fragmented RNA was saved for use as the input sample. The library was prepared using the eCLIP-seq protocol with minor modifications. Briefly, the fragmented poly(A) RNA was incubated with anti-m6A antibody (Abcam Ab151230) and then crosslinked twice at 150 mJ/cm2 (254 nm wavelength) using a Stratalinker UV Crosslinker. The RNA:antibody sample was incubated with Protein A/G Magnetic Beads (88803, Pierce) overnight at 4 °C, and an RNA adapter was then ligated to the 3' end of the sample. The RNA:antibody complex was then run on an SDS-PAGE gel and transferred to a nitrocellulose membrane to remove any non-crosslinked RNA. Following treatment with Proteinase K to remove nearly all of the antibody except the crosslinked amino acid, the RNA was isolated, one adapter was ligated, and the RNA was converted into cDNA for full library amplification. Sequencing of the cDNA libraries was primarily performed using an Illumina NovaSeq 6000 to generate 2x150 bp paired-end runs consisting of 40 million raw reads per sample.
|
|
|
Library strategy |
RIP-Seq |
Library source |
transcriptomic |
Library selection |
other |
Instrument model |
Illumina NovaSeq 6000 |
|
|
Data processing |
The resulting reads are analyzed via a modified computational pipeline based on the original eCLIP strategy that has been converted into a Snakemake workflow. Briefly, the reads are initially inspected for appropriate quality using FastQC (v. 0.11.7). The in-line unique molecular identifier (UMI) located within the ssDNA adapter (rand3Tr3) at the beginning of read 2 is extracted using UMI-tools (v. 1.0.0) to prepare the reads for downstream de-duplication. The remaining non-random ssDNA adapter and indexed RNA adapters are then removed using Cutadapt (v. 2.4), with any reads shorter than 18 bp being discarded. The trimmed reads are then inspected once more with FastQC to ensure all adapters were successfully removed. Two mapping steps are then performed using the splicing-aware RNA aligner STAR (v. 2.7.1a). First, the reads are mapped to the species-appropriate version of RepBase (v18.05), with any successfully mapped reads being removed from further analysis. The remaining reads are then mapped to the full species-appropriate genome, with only uniquely mapping reads being included in the final alignment. Subsequent removal of PCR duplicates is performed with UMI-tools using the previously extracted UMIs. The final alignment file is sorted and indexed using samtools (v. 1.9), and variations from the reference genome are identified using the 'mpileup' command of samtools. An internally developed Java package is then employed to identify C-to-T mutations occurring within the m6A consensus motif at a set frequency threshold of greater than or equal to 2.5% and less than or equal to 50% of the total reads at a given position (with a minimum of 3 C-to-T mutations at a single site). The resulting m6A sites are then automatically compared to those identified in the corresponding input sample, and any sites occurring in both are removed from the final list of m6As.
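The frequency filter and input subtraction described above can be illustrated with a minimal Python sketch. This is not the internally developed Java package: the `Site` record, its field names, and the functions `passes_thresholds` and `call_sites` are hypothetical, and the sketch assumes that candidate positions have already been restricted to the m6A consensus motif and that per-position read depth and C-to-T counts have been parsed from the pileup beforehand. Applying the same thresholds to the input sample is also an assumption.

```python
from dataclasses import dataclass

MIN_CONVERSIONS = 3  # minimum number of C-to-T mutations at a single site


@dataclass(frozen=True)
class Site:
    """Hypothetical per-position summary parsed from a pileup."""
    chrom: str
    pos: int     # coordinate convention is an assumption
    strand: str
    depth: int   # total reads covering the position
    c_to_t: int  # reads carrying a C-to-T mutation


def passes_thresholds(site):
    """True if the C-to-T frequency is >= 2.5% and <= 50% with >= 3 mutations."""
    if site.depth <= 0 or site.c_to_t < MIN_CONVERSIONS:
        return False
    # Integer arithmetic avoids floating-point rounding at the boundaries:
    # c_to_t / depth >= 0.025  is equivalent to  40 * c_to_t >= depth
    # c_to_t / depth <= 0.50   is equivalent to   2 * c_to_t <= depth
    return 40 * site.c_to_t >= site.depth and 2 * site.c_to_t <= site.depth


def call_sites(ip_sites, input_sites):
    """Keep IP sites passing the thresholds that are not also called in the input."""
    input_keys = {
        (s.chrom, s.pos, s.strand) for s in input_sites if passes_thresholds(s)
    }
    return [
        s for s in ip_sites
        if passes_thresholds(s) and (s.chrom, s.pos, s.strand) not in input_keys
    ]
```

For example, a position covered by 100 reads with 10 C-to-T mutations (10%) would be retained, whereas one with 2 mutations (below the minimum count) or 60 mutations (above 50%) would be rejected.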
Genome_build: hg19
Supplementary_files_format_and_content: XLS files consisting of a list of annotated m6As and corresponding BED files (categorized by confidence) were generated using a custom Java package. Indexed BAM files consisting of reads (read2) that were successfully mapped to the genome are supplied for visualization of overall library quality and C-to-T conversion frequencies relative to identified m6A sites.
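As a starting point for working with the supplied BED files, the following Python sketch tallies sites per chromosome. It relies only on the three standard leading BED fields (chrom, start, end); the content of any additional columns in these files is not documented here, so they are ignored. The function names `read_bed_sites` and `sites_per_chromosome` are hypothetical.

```python
import gzip
from collections import Counter


def read_bed_sites(path):
    """Yield (chrom, start, end) from a BED file, gzip-compressed or plain text."""
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            # Skip blank lines and optional BED header lines.
            if not line or line.startswith(("#", "track", "browser")):
                continue
            fields = line.split()
            yield fields[0], int(fields[1]), int(fields[2])


def sites_per_chromosome(path):
    """Count the number of BED records on each chromosome."""
    return Counter(chrom for chrom, _, _ in read_bed_sites(path))
```

After downloading a file, a call such as `sites_per_chromosome("GSM4970397_meCLIP_MDAMB231_rep2.m6aList_highConfidence_9832.bed.gz")` would return a `Counter` keyed by chromosome name.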
|
|
|
Submission date |
Dec 10, 2020 |
Last update date |
Jan 01, 2021 |
Contact name |
Justin Roberts |
E-mail(s) |
jtroberts@southalabama.edu
|
Organization name |
University of South Alabama
|
Department |
Pharmacology
|
Street address |
5851 USA Drive North, MSB 3370
|
City |
Mobile |
State/province |
AL |
ZIP/Postal code |
36688 |
Country |
USA |
|
|
Platform ID |
GPL24676 |
Series (1) |
GSE147440 |
Identification of m6A residues at single nucleotide resolution using eCLIP and an accessible custom analysis pipeline |
|
Relations |
BioSample |
SAMN17054668 |
SRA |
SRX9671710 |
Supplementary file | Size | Download | File type/resource |
GSM4970397_meCLIP_MDAMB231_rep2.bam | 486.6 Mb | (ftp)(http) | BAM |
GSM4970397_meCLIP_MDAMB231_rep2.bam.bai.gz | 417.1 Kb | (ftp)(http) | BAI |
GSM4970397_meCLIP_MDAMB231_rep2.m6aList.annotated.xls.gz | 190.5 Kb | (ftp)(http) | XLS |
GSM4970397_meCLIP_MDAMB231_rep2.m6aList_highConfidence_9832.bed.gz | 138.9 Kb | (ftp)(http) | BED |
GSM4970397_meCLIP_MDAMB231_rep2.m6aList_lowConfidence_5053.bed.gz | 74.8 Kb | (ftp)(http) | BED |
SRA Run Selector |
Raw data are available in SRA |
Processed data provided as supplementary file |