Sample GSM2064721
Status Public on Sep 01, 2016
Title Animal6_Male_WholeBlood_Sequencing batch 2
Sample type SRA
 
Source name male_WholeBlood
Organism Rattus norvegicus
Characteristics strain: Sprague Dawley
animal id: Animal6
age: 12-13 weeks
gender: male
tissue: WholeBlood
rin: 8
mir_counts: 5530051
Growth protocol Naïve Sprague Dawley rats, 12-13 weeks old, were used for organ isolation.
Extracted molecule total RNA
Extraction protocol RNA was purified using the miRNeasy protocol from Qiagen (Kit cat# 217004).
Library construction was performed according to Illumina’s TruSeq small RNA sample prep protocol (Catalog # RS-200-0048), and sequencing was performed on a HiSeq 2000. Briefly, RNA was extracted from tissues using the miRNeasy protocol. Libraries were constructed following the Illumina TruSeq small RNA preparation guide, which entails ligation of adapters to the 3' and 5' ends of the RNA molecules. The libraries were quantitated using RiboGreen and size selected by gel electrophoresis; libraries consistent with the size of miRNAs were gel purified. Sequencing was performed with each sample receiving 4-5 million 30 bp single-end reads.
 
Library strategy miRNA-Seq
Library source transcriptomic
Library selection size fractionation
Instrument model Illumina HiSeq 2000
 
Description Sequencing batch 2 from sample Animal6_Male_WholeBlood
US-1459946_BC2BUMACXX_US-1459946_CAAAAG_L007_R1_001
Data processing FastQ files were processed with cutadapt v. 1.4.1 [23] to remove adaptor sequence and discard reads < 17 bp in length, using the options -a TGGAATTCTCGGGTGCCAAGG --quality-base 33 -q 20 --match-read-wildcards -m 17. All trimmed reads that contained an ‘N’ were discarded. Identical sequences from the same sample were combined into a single sequence in the form expected by miRDeep2.
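The N-filtering and collapsing steps above can be sketched roughly as follows. This is a minimal illustration, not the submitters' script: the function name `collapse_reads`, the `rno` prefix, and the `>prefix_index_xCOUNT` header layout are assumptions modeled on miRDeep2's collapsed-read convention, and the length check simply repeats the 17 bp cutoff already applied by cutadapt.

```python
from collections import Counter


def collapse_reads(reads, prefix="rno", min_len=17):
    """Drop reads containing N (or shorter than min_len), then collapse duplicates.

    Returns FASTA lines whose headers carry the read count as an _x<count>
    suffix (assumed header layout, hypothetical three-letter prefix).
    """
    counts = Counter(
        r.upper()
        for r in reads
        if len(r) >= min_len and "N" not in r.upper()
    )
    # Most abundant sequence first; ties broken alphabetically so the
    # output is reproducible.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    lines = []
    for index, (seq, n) in enumerate(ordered):
        lines.append(f">{prefix}_{index}_x{n}")
        lines.append(seq)
    return lines
```

Each unique sequence then appears once, with its multiplicity kept in the header so that downstream counting can recover the number of raw reads.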
The quantifier.pl script from the miRDeep2 package (v. 2.0.0.5) was used to generate miR alignment files (.mrd files) against known miRs from miRBase 20 with options -P -p <org>_hairpin.fa -m <org>_mature.fa -r trimmed_reads.fa -d. This was done separately for rat, mouse, human, and C. elegans known miRs.
The .mrd files were then parsed with a custom Perl script as follows. Each isomiR sequence in an alignment was associated with the corresponding mature miR identifier. If it aligned to a miRNA precursor, but not with an expected mature miR sequence, it was labeled <miR>-pre to indicate this. Usually these correspond to reverse strand miRs that have not yet been annotated as alternate mature forms. Sometimes these appear to be microRNA offset RNAs [25] and sometimes they appear to represent incompletely processed sequence. If a given sequence aligned with more than one precursor, it was associated with all potential names as a composite name. That is, a sequence that aligned to both let-7a-5p and let-7f-5p was assigned to the composite mature miR let-7a-5p;let-7f-5p to indicate for subsequent analyses that this identification was ambiguous. All sequences that had not been identified against known rat miRs were then searched for in the analysis against known mouse miRs, then against human miRs, and finally against C. elegans miRs. This process allows for identification of conserved rodent miRs that have not yet been annotated in rat, and of conserved mammalian miRs that have not yet been annotated in rat or mouse. The comparison with C. elegans was to identify spiked-in C. elegans miRs (not used in this data set).
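The naming rules described above might look something like the sketch below. It is written in Python for illustration only (the original used a custom Perl script that is not shown), and every name in it is hypothetical: `label_hit`, `composite_name`, `assign_names`, the species keys, and the dictionary layout of the parsed hits. Sorting the names inside a composite is also an assumption made here for reproducibility.

```python
def label_hit(mir, mature_name=None):
    """Return the mature name, or '<miR>-pre' for precursor-only alignments."""
    return mature_name if mature_name else f"{mir}-pre"


def composite_name(names):
    """Join all candidate names for one sequence with ';' (sorted, de-duplicated)."""
    return ";".join(sorted(set(names)))


def assign_names(sequences, hits_by_species,
                 order=("rat", "mouse", "human", "celegans")):
    """Name each sequence from the first species in `order` that has a hit.

    hits_by_species maps species -> {sequence: [candidate names]}.
    Sequences with no hit in any species are left out of the result.
    """
    assigned = {}
    for seq in sequences:
        for species in order:
            names = hits_by_species.get(species, {}).get(seq)
            if names:
                assigned[seq] = composite_name(names)
                break  # later species are consulted only for unnamed reads
    return assigned
```

With this layout, a read matching both let-7a-5p and let-7f-5p in rat receives the composite name `let-7a-5p;let-7f-5p`, and a read with no rat hit falls through to the mouse, human, and C. elegans tables in turn.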
In parallel to the above identification of known miRs, we also used miRDeep2’s novel miR identification process. We aligned all the unique trimmed fasta reads to the Rn5 reference rat genome with miRDeep2’s mapper.pl script with default parameters and then used the miRDeep2.pl script to generate novel miR predictions and .mrd files. The .mrd files were parsed as above to assign miR names to each unique read. We then attempted to identify and merge novel predictions that were identical to known miRs as follows. If the same read sequence identified a miR in the novel analysis and in the known analysis and the known miR appeared in miRDeep2’s .mrd file, we assumed that this was a duplicate identification and changed the miR name from that reported by miRDeep2 (chr# followed by a unique ID) to that of the known miR. Otherwise, we assumed that the novel prediction might be different from the known miR identification and added it to the list of potential identifications for the sequence. So, for example, a sequence which was aligned by miRDeep2 to a predicted miR on chrX and also identified as miR-450a is identified as “chrX_48156-5p;miR-450a-5p” because miRDeep2 does not show miR-450a as aligned to this predicted miR, indicating that these are two potentially distinct sources of this sequence. Finally, the counts associated with each unique sequence are summed for each named miR (including complex, multi-named miRs) for each sample and these are reported as the miR level counts.
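The merge rule and the final count summation described above could be sketched as follows. This is an illustrative reading of the text, not the submitters' code: `merge_name`, `sum_counts`, and the flag `known_in_novel_mrd` (whether the known miR appears in the novel prediction's .mrd entry) are hypothetical names, and placing the novel ID before the known name simply follows the “chrX_48156-5p;miR-450a-5p” example.

```python
from collections import defaultdict


def merge_name(novel_name, known_name, known_in_novel_mrd):
    """Combine the novel and known identifications for one read sequence."""
    if novel_name is None:
        return known_name
    if known_name is None:
        return novel_name
    if known_in_novel_mrd:
        # Treated as a duplicate identification: keep only the known name.
        return known_name
    # Possibly distinct sources: keep both as a composite name.
    return f"{novel_name};{known_name}"


def sum_counts(read_counts, read_names):
    """Sum per-sequence counts into per-miR counts for one sample.

    read_counts maps sequence -> count; read_names maps sequence -> miR name
    (composite names are counted as their own entries).
    """
    totals = defaultdict(int)
    for seq, n in read_counts.items():
        name = read_names.get(seq)
        if name is not None:
            totals[name] += n
    return dict(totals)
```

For example, `merge_name("chrX_48156-5p", "miR-450a-5p", False)` gives the composite name quoted in the text, while passing `True` for the flag collapses it to `miR-450a-5p`.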
Genome_build: Rn5 for novel miR identification, miRBase 20 for known miR identification
Supplementary_files_format_and_content: Tab-delimited text files with raw count information at either the mature miR level or isomiR level. Both files also give the source of the prediction that associates the indicated counts with the indicated miR or isomiR.
 
Submission date Feb 17, 2016
Last update date May 15, 2019
Contact name Aaron Thomas Smith
E-mail(s) smithat@lilly.com
Phone 3172774712
Organization name Eli Lilly
Department Investigative Toxicology
Street address Lilly Corporate Center
City Indianapolis
State/province IN
ZIP/Postal code 46285
Country USA
 
Platform ID GPL14844
Series (1)
GSE78031 microRNA profiling of Sprague Dawley organs
Relations
SRA SRX1590029
BioSample SAMN04504526

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
