Status |
Public on Sep 01, 2016 |
Title |
Animal11_Female_WholeBlood_Sequencing batch 2 |
Sample type |
SRA |
|
|
Source name |
female_WholeBlood
|
Organism |
Rattus norvegicus |
Characteristics |
strain: Sprague Dawley
animal id: Animal11
age: 12-13 weeks
gender: female
tissue: WholeBlood
rin: 7.6
mir_counts: 5016803
|
Growth protocol |
Naïve Sprague Dawley rats, 12-13 weeks old, were used for organ isolation.
|
Extracted molecule |
total RNA |
Extraction protocol |
RNA was purified from tissues using the miRNeasy protocol from Qiagen (Kit cat# 217004). Libraries were constructed according to Illumina’s TruSeq small RNA sample prep protocol (Catalog # RS-200-0048), which entails ligation of adapters to the 3' and 5' ends of the RNA molecules. The libraries were quantitated using RiboGreen and size selected by gel electrophoresis; libraries consistent with the size of miRNAs were gel purified. Sequencing was performed on a HiSeq 2000, with each sample receiving 4-5 million 30 bp single-end reads.
|
|
|
Library strategy |
miRNA-Seq |
Library source |
transcriptomic |
Library selection |
size fractionation |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
Sequencing batch 2 from sample Animal11_Female_WholeBlood US-1459951_BC2BUMACXX_US-1459951_CTCAGA_L005_R1_001
|
Data processing |
FastQ files were processed to remove adaptor sequence and discard reads < 17 bp in length with cutadapt v. 1.4.1 [23] with options -a TGGAATTCTCGGGTGCCAAGG --quality-base 33 -q 20 --match-read-wildcards -m 17. All trimmed reads that contained an ‘N’ were discarded. Identical sequences from the same sample were combined into a single sequence in the form expected by miRDeep2.
The quantifier.pl script from the miRDeep2 package (v. 2.0.0.5) was used to generate miR alignment files (.mrd files) against known miRs from miRBase 20 with options -P -p <org>_hairpin.fa -m <org>_mature.fa -r trimmed_reads.fa -d. This was done separately for rat, mouse, human, and C. elegans known miRs.
The .mrd files were then parsed with a custom Perl script as follows. Each isomiR sequence in an alignment was associated with the corresponding mature miR identifier. If a sequence aligned to a miRNA precursor, but not with an expected mature miR sequence, it was identified as <miR>-pre to indicate this. Usually these correspond to reverse-strand miRs that have not yet been annotated as alternate mature forms. Sometimes they appear to be microRNA offset RNAs [25], and sometimes they appear to represent incompletely processed sequence. If a given sequence aligned with more than one precursor, it was associated with all potential names as a composite name. For example, a sequence that aligned to both let-7a-5p and let-7f-5p was assigned to the composite mature miR let-7a-5p;let-7f-5p to indicate for subsequent analyses that this identification was ambiguous.
All sequences that had not been identified against known rat miRs were then searched against known mouse miRs, then human miRs, and finally C. elegans miRs. This process allows for identification of conserved rodent miRs that have not yet been annotated in rat, and of conserved mammalian miRs that have not yet been annotated in rat or mouse. The comparison with C. elegans was to identify spiked-in C. elegans miRs (not used in this data set).
In parallel to the above identification of known miRs, we also used miRDeep2’s novel miR identification process. We aligned all the unique trimmed fasta reads to the Rn5 reference rat genome with miRDeep2’s mapper.pl script with default parameters and then used the miRDeep2.pl script to generate novel miR predictions and .mrd files. The .mrd files were parsed as above to assign miR names to each unique read.
We then attempted to identify and merge novel predictions that were identical to known miRs as follows. If the same read sequence identified a miR in both the novel analysis and the known analysis, and the known miR appeared in miRDeep2’s .mrd file, we assumed that this was a duplicate identification and changed the miR name from that reported by miRDeep2 (chr# followed by a unique ID) to that of the known miR. Otherwise, we assumed that the novel prediction might be different from the known miR identification and added it to the list of potential identifications for the sequence. For example, a sequence that was aligned by miRDeep2 to a predicted miR on chrX and also identified as miR-450a is named “chrX_48156-5p;miR-450a-5p” because miRDeep2 does not show miR-450a as aligned to this predicted miR, indicating that these are two potentially distinct sources of this sequence.
Finally, the counts associated with each unique sequence were summed for each named miR (including complex, multi-named miRs) for each sample, and these are reported as the miR-level counts.
Genome_build: Rn5 for novel miR identification, miRBase 20 for known miR identification
Supplementary_files_format_and_content: Tab-delimited text files with raw count information at either the mature miR level or the isomiR level. Both files also contain the source of the prediction that associates the indicated counts with the indicated miR or isomiR.
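The collapsing, naming, and counting steps described above can be illustrated with a minimal Python sketch. This is not the submitters' custom Perl script: all function names, the `seq` header prefix, the `<prefix>_<index>_x<count>` header layout, the species keys, and the alphabetical ordering of composite names are assumptions made for illustration only.

```python
from collections import Counter, defaultdict

MIN_LEN = 17  # reads shorter than this are discarded, as in the trimming step
SPECIES_ORDER = ("rat", "mouse", "human", "c_elegans")  # hypothetical keys


def collapse_reads(reads, prefix="seq"):
    """Collapse identical trimmed reads into (header, sequence) pairs.

    Reads containing 'N' or shorter than MIN_LEN are dropped. The header
    layout '<prefix>_<index>_x<count>' is an assumed convention.
    """
    counts = Counter(
        r.upper() for r in reads if len(r) >= MIN_LEN and "N" not in r.upper()
    )
    # Most abundant first; ties broken alphabetically for reproducibility.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(f"{prefix}_{i}_x{n}", seq) for i, (seq, n) in enumerate(ordered)]


def composite_name(names):
    """Join every candidate miR name for a sequence with ';'.

    Sorting is an assumption here, used only to make the name deterministic.
    """
    return ";".join(sorted(set(names)))


def identify(seq, hits_by_species):
    """Return (species, name) from the first species in SPECIES_ORDER with a hit."""
    for species in SPECIES_ORDER:
        names = hits_by_species.get(species, {}).get(seq)
        if names:
            return species, composite_name(names)
    return None, None


def mir_level_counts(seq_counts, seq_to_names):
    """Sum per-sequence counts under each (possibly composite) miR name."""
    totals = defaultdict(int)
    for seq, n in seq_counts.items():
        names = seq_to_names.get(seq)
        if names:  # sequences with no identification are skipped
            totals[composite_name(names)] += n
    return dict(totals)
```

In this sketch, a sequence matching both let-7a-5p and let-7f-5p contributes its count to the single composite entry `let-7a-5p;let-7f-5p` rather than to either miR individually, which mirrors the ambiguity handling described in the text.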
|
|
|
Submission date |
Feb 17, 2016 |
Last update date |
May 15, 2019 |
Contact name |
Aaron Thomas Smith |
E-mail(s) |
smithat@lilly.com
|
Phone |
3172774712
|
Organization name |
Eli Lilly
|
Department |
Investigative Toxicology
|
Street address |
Lilly Corporate Center
|
City |
Indianapolis |
State/province |
IN |
ZIP/Postal code |
46285 |
Country |
USA |
|
|
Platform ID |
GPL14844 |
Series (1) |
GSE78031 |
microRNA profiling of Sprague Dawley organs |
|
Relations |
SRA |
SRX1590027 |
BioSample |
SAMN04504524 |
Supplementary data files not provided |
SRA Run Selector |
Raw data are available in SRA |
Processed data are available on Series record |