GEO Accession viewer

Sample GSM4315209

Status

Public on Nov 16, 2020

Title

IP1

Sample type

SRA

Source name

S2R+ cells

Organism

Drosophila melanogaster

Characteristics

condition: IP
miclip antibody: m6A antibody from Synaptic Systems (lot # 202003/2-82)

Treatment protocol

no specific treatment

Growth protocol

S2R+ cells were grown at room temperature in Schneider's medium (Gibco) supplemented with 10% FBS (Sigma) and 1% penicillin-streptomycin (Sigma).

Extracted molecule

polyA RNA

Extraction protocol

RNA was harvested using TRIzol reagent, and mRNA was isolated by two rounds of binding to oligo d(T)25 magnetic beads (NEB). mRNA was fragmented with RNA fragmentation solution (Ambion). Samples IP1, IP2, IP3, IP4 and ctr were immunoprecipitated using anti-m6A antibody. Sample frag-mRNA was purified with RNAClean XP beads (Beckman Coulter).
Library preparation for the IP1, IP2, IP3, IP4 and ctr samples was performed as previously described (Sutandy et al., 2016). Libraries were sequenced on an Illumina NextSeq 500. Library preparation for frag_mRNA was performed using the NEBNext Ultra Directional RNA Library Prep Kit.

Library strategy

RIP-Seq

Library source

transcriptomic

Library selection

other

Instrument model

Illumina NextSeq 500

Description

m6A CLIP rep1

Data processing

Individual samples were processed using the CLIP Tool Kit (CTK) v1.0.9 (Shah et al. 2017). We largely followed the recommended guidelines for CTK iCLIP data analysis described here (https://zhanglab.c2b2.columbia.edu/index.php/ICLIP_data_analysis_using_CTK).
Briefly, 3' adapter sequences [AGATCGGAAGAGCGGTTCAG] were trimmed using cutadapt v1.8 [--overlap=5 -m 29].
PCR duplicates were removed using a custom Perl script, followed by extraction of the 9-nucleotide miCLIP barcode and its addition to the read name.
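The duplicate removal and barcode handling above can be illustrated with a minimal Python sketch. It is not the custom Perl script used for this sample. It assumes that duplicates are reads with identical full sequences (barcode included), that the barcode occupies the first 9 nucleotides of each read, and that it is appended to the read name after a `#` separator; the function names are hypothetical.

```python
def collapse_exact_duplicates(records):
    """Keep the first read for each distinct full sequence (barcode included).

    `records` is an iterable of (name, sequence, quality) tuples.
    Assumption: PCR duplicates are reads with identical sequences.
    """
    seen = set()
    unique = []
    for name, seq, qual in records:
        if seq in seen:
            continue
        seen.add(seq)
        unique.append((name, seq, qual))
    return unique


def move_barcode_to_name(name, seq, qual, barcode_len=9):
    """Strip the leading barcode and append it to the read name.

    Assumption: the barcode is the first `barcode_len` bases of the read
    and is attached to the name with a '#' separator.
    """
    if len(seq) < barcode_len:
        raise ValueError("read is shorter than the barcode length")
    barcode = seq[:barcode_len]
    return f"{name}#{barcode}", seq[barcode_len:], qual[barcode_len:]
```

Collapsing before the barcode is stripped keeps reads that share an insert but carry different barcodes, which is the point of including the barcode in the comparison.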
All cDNA libraries were filtered for common Drosophila virus sequences using bowtie v1.1.2 [-p 4 -q (-X 1000) --fr --best].
Next, to avoid biases specific to any single read alignment software, we mapped sequencing reads to the Drosophila melanogaster dm6 genome assembly (Ensembl v81) using three aligners: novoalign, bwa and STAR v2.4.2a. For STAR alignments, we used a custom Python script to transform SAM alignment files into the format expected for downstream CITS identification. We also excluded from the STAR alignments spliced reads, soft-clipped reads, reads with mismatches or indels near the read start or read end, and reads with more than one indel or mismatch.
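Part of the STAR read filter described above can be sketched from the SAM CIGAR string and the NM (edit distance) tag. This is a hypothetical illustration, not the custom Python script used for this sample. It covers only the spliced-read, soft-clipped-read and edit-count criteria, and it assumes that "more than one indel or mismatch" means more than one edit in total; the check for edits near the read ends would additionally require the MD tag and is left out.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def keep_alignment(cigar, nm, max_edits=1):
    """Return True if an alignment passes the simplified filter.

    cigar: SAM CIGAR string, e.g. "30M" or "10M1I19M".
    nm:    value of the SAM NM tag (mismatches plus inserted/deleted bases).
    Assumption: at most `max_edits` indel events plus mismatches are allowed.
    """
    ops = [(int(length), op) for length, op in CIGAR_OP.findall(cigar)]
    # 'N' marks a spliced alignment, 'S' marks soft-clipped bases.
    if any(op in "NS" for _, op in ops):
        return False
    indel_events = sum(1 for _, op in ops if op in "ID")
    indel_bases = sum(length for length, op in ops if op in "ID")
    mismatches = nm - indel_bases
    return indel_events + mismatches <= max_edits
```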
Then, unique tags were extracted using parseAlignment.pl [-v --map-qual 1 --min-len 18 --indel-to-end 2], followed by read collapsing using tag2collapse.pl [-v -big --random-barcode -EM 30 --seq-error-model alignment -weight --weight-in-name --keep-max-score --keep-tag-name].
Crosslinking-induced mutation sites (CIMS) indicative of the antibody-m6A interaction were identified by running CIMS.pl [-big -n 10], and CIMS with FDR < 0.001 were retained.
Crosslinking-induced truncation sites (CITS) were identified using CITS.pl [-big -p 0.001 --gap 25]. Sites spanning more than 1 nucleotide were removed.
We further filtered the identified CIMS and CITS, separately for each aligner, to retain sites that were reproducible in at least 2 of the 4 replicate m6A-immunoprecipitation samples and absent from the CITS identified in the input control sample. Moreover, CIMS were required to have a minimum of 6 unique tags [k > 5] and at least three unique substitutions [m > 2], and the substitutions had to account for less than 20% of the coverage [m/k < 0.2] to avoid calling homozygous and heterozygous single nucleotide variants.
CIMS were almost exclusively C-to-T conversions (n = 6225, 88% ± 5.8%), independent of the alignment software used (3 aligners: n = 2677; 2 aligners: n = 2411; 1 aligner: n = 1137). For a stringent CITS set, we started from all CITS (n = 22917), most of which truncated at A residues (n = 11897, 52%), and retained those A-truncation sites that were followed by a C residue (CITS-AC, n = 6799, 57%). A total of 2302 (37%) C-to-T CIMS overlapped CITS-AC sites within a 1 nt window, suggesting that in many cases the same nucleotide was identified. Together, we considered a set of 13024 C-to-T CIMS and AC-truncation CITS across 2464 genes for our final S2 cell miCLIP data set.
CITS sites were annotated as described before (Wessels et al. 2019). For representation purposes, we simplified the annotation categories: all CITS not annotated to 5'UTR, CDS, 3'UTR or intron were summarized in the category 'other'. Enrichments were calculated relative to median feature proportions (5'UTR = 0.08 (131 nt), CDS = 0.78 (1309.5 nt), 3'UTR = 0.14 (234 nt)) determined previously for S2 cells (Wessels et al. 2019). Enrichments for sites annotated as intronic or other were set to 1.
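The CIMS thresholds and the replicate filter described above can be written as a short Python sketch. The thresholds (k > 5, m > 2, m/k < 0.2, at least 2 of 4 replicates, absent from the input control) come from the text; the function names and the representation of a site as a (chromosome, position, strand) tuple are assumptions made for illustration only.

```python
from collections import Counter


def passes_cims_thresholds(k, m, min_tags=6, min_subs=3, max_ratio=0.2):
    """Apply the tag-count, substitution-count and ratio thresholds.

    k: number of unique tags covering the site.
    m: number of unique tags carrying the substitution.
    """
    return k >= min_tags and m >= min_subs and (m / k) < max_ratio


def reproducible_sites(replicate_sites, control_sites, min_replicates=2):
    """Keep sites seen in enough replicates and absent from the control.

    replicate_sites: one collection of sites per IP replicate.
    control_sites:   collection of sites called in the input control.
    Sites are hashable keys, assumed here to be (chrom, pos, strand).
    """
    control = set(control_sites)
    counts = Counter()
    for sites in replicate_sites:
        counts.update(set(sites))  # count each site once per replicate
    return {
        site
        for site, n in counts.items()
        if n >= min_replicates and site not in control
    }
```

Note that m > 2 combined with m/k < 0.2 already implies k > 15, so the ratio is the binding constraint at low coverage. In this sketch the filter would be applied once per aligner, matching the statement that each aligner was handled separately.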
Alignment parameters:
Novoalign: [-t 85 -l 20 -s 1 -o Native -r None]
BWA: [-t 4 -n 0.06 -q 20]
STAR: [--alignEndsType EndToEnd --runThreadN 4 --outFilterMultimapNmax 10 --outSAMattributes All --outFilterIntronMotifs RemoveNoncanonical --outReadsUnmapped Fastx --alignSJoverhangMin 12 --outFilterMatchNmin 15 --outFilterMismatchNmax 1 --outFilterMismatchNoverLmax 0.05 --outFilterMultimapScoreRange 3 --alignIntronMax 20000 --seedMultimapNmax 200000 --seedPerReadNmax 30000 --outSAMtype BAM SortedByCoordinate]
Genome_build: dm6
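The enrichment calculation mentioned in the data processing notes can be sketched as follows. The expected proportions are the median feature proportions quoted above; everything else is an assumption, in particular that the observed proportion of a feature is its share of the sites falling in 5'UTR, CDS or 3'UTR, and that enrichment is the ratio of observed to expected proportion. The function name is hypothetical.

```python
# Median feature proportions for S2 cells, as quoted in the record.
EXPECTED_PROPORTIONS = {"5'UTR": 0.08, "CDS": 0.78, "3'UTR": 0.14}


def feature_enrichment(site_counts):
    """Return the enrichment per annotation category.

    site_counts: dict mapping category name to number of sites.
    Assumption: observed proportions are computed over the three mRNA
    features only; any other category (e.g. intron, other) is set to 1.
    """
    total = sum(site_counts.get(f, 0) for f in EXPECTED_PROPORTIONS)
    if total == 0:
        raise ValueError("no sites annotated to 5'UTR, CDS or 3'UTR")
    enrichment = {
        feature: (site_counts.get(feature, 0) / total) / expected
        for feature, expected in EXPECTED_PROPORTIONS.items()
    }
    for category in site_counts:
        if category not in EXPECTED_PROPORTIONS:
            enrichment[category] = 1.0
    return enrichment
```

A value above 1 for a feature would indicate more sites than expected from its median length share, and a value below 1 fewer.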

Submission date

Feb 14, 2020

Last update date

Nov 16, 2020

Contact name

Jean-Yves Roignant

Organization name

Institute of Molecular Biology

Lab

Roignant

Street address

Ackermannweg 4