Series GSE125279
Status Public on Aug 27, 2019
Title miRWoods: enhanced precursor detection and stacked random forests for the sensitive detection of microRNAs.
Organisms Felis catus; Bos taurus
Experiment type Non-coding RNA profiling by high throughput sequencing
Summary MicroRNAs are conserved, endogenous small RNAs with critical post-transcriptional regulatory functions throughout eukaryota, including prominent roles in development and disease. Despite much effort, microRNA annotations still contain errors and are incomplete due especially to challenges related to identifying valid miRs that have small numbers of reads, to properly locating hairpin precursors and to balancing precision and recall. Here, we present miRWoods, which solves these challenges using a duplex-focused precursor detection method and stacked random forests with specialized layers to detect mature and precursor microRNAs, and has been tuned to optimize the harmonic mean of precision and recall. We trained and tuned our discovery pipeline on data sets from the well-annotated human genome, and evaluated its performance on data from mouse. Compared to existing approaches, miRWoods better identifies precursor spans, and can balance sensitivity and specificity for an overall greater prediction accuracy, recalling an average of 10% more annotated microRNAs, and correctly predicts substantially more microRNAs with only one read. We apply this method to the under-annotated genomes of Felis catus (domestic cat) and Bos taurus (cow). We identified hundreds of novel microRNAs in small RNA sequencing data sets from muscle and skin from cat, from 10 tissues from cow and also from human and mouse cells. Our novel predictions include a microRNA in an intron of tyrosine kinase 2 (TYK2) that is present in both cat and cow, as well as a family of mirtrons with two instances in the human genome. Our predictions support a more expanded miR-2284 family in the bovine genome, a larger mir-548 family in the human genome, and a larger let-7 family in the feline genome.
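The summary states that miRWoods was tuned to optimize the harmonic mean of precision and recall, a quantity commonly called the F1 score. As a minimal sketch of that metric only (not the authors' tuning code), the function below computes it from prediction counts. The function name and its parameters are hypothetical and chosen for illustration.

```python
def f1_score(true_positives: int, false_positives: int, false_negatives: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts.

    Illustrative helper only; the name and signature are assumptions,
    not part of miRWoods.
    """
    predicted = true_positives + false_positives   # everything called a microRNA
    annotated = true_positives + false_negatives   # everything that truly is one
    if predicted == 0 or annotated == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / annotated
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a predictor cannot score well by maximizing recall at the expense of precision, or the reverse.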
 
Overall design Small RNA sequencing from different tissues of Bos taurus and Felis catus
 
Contributor(s) Bell J, Hendrix D, Löhr C, Bionaz M
Citation(s) 31596843
Submission date Jan 17, 2019
Last update date Nov 26, 2019
Contact name David Anthony Hendrix
E-mail(s) david.hendrix@oregonstate.edu
Phone (541) 737-6224
Organization name Oregon State University
Department Biochemistry and Biophysics/EECS
Street address 2011 Ag & Life Sciences Bldg
City Corvallis
State/province OR
ZIP/Postal code 97331
Country USA
 
Platforms (2)
GPL21659 Illumina HiSeq 3000 (Bos taurus)
GPL26066 Illumina HiSeq 3000 (Felis catus)
Samples (14)
GSM3567566 steer1 boneMarrow
GSM3567567 steer1 dentalPulp
GSM3567568 steer1 eyeIris
Relations
BioProject PRJNA515735
SRA SRP180005

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE125279_bta_miRWoodsPredictions.gff.gz 37.6 Kb (ftp)(http) GFF
GSE125279_fca_miRWoodsPredictions.gff.gz 22.2 Kb (ftp)(http) GFF
SRA Run Selector
Raw data are available in SRA
Processed data are available on Series record
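The processed predictions are distributed as gzip-compressed GFF files. The sketch below shows one way such a file might be read in Python, assuming the files follow the conventional nine-column, tab-separated GFF layout; this record does not confirm that layout, and the names `GffRecord`, `parse_gff_lines`, and `read_gff_gz` are hypothetical.

```python
import gzip
from typing import Iterable, Iterator, List, NamedTuple


class GffRecord(NamedTuple):
    """One feature line, assuming the conventional nine GFF columns."""
    seqid: str
    source: str
    feature_type: str
    start: int
    end: int
    score: str
    strand: str
    phase: str
    attributes: str


def parse_gff_lines(lines: Iterable[str]) -> Iterator[GffRecord]:
    """Yield records from text lines, skipping comments and malformed rows."""
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # blank line or header/comment
        fields = line.split("\t")
        if len(fields) != 9:
            continue  # not a nine-column feature line (assumed layout)
        yield GffRecord(
            fields[0], fields[1], fields[2],
            int(fields[3]), int(fields[4]),
            fields[5], fields[6], fields[7], fields[8],
        )


def read_gff_gz(path: str) -> List[GffRecord]:
    """Read a gzip-compressed GFF file into a list of records."""
    with gzip.open(path, "rt") as handle:
        return list(parse_gff_lines(handle))
```

After downloading a supplementary file, a call such as `read_gff_gz("GSE125279_bta_miRWoodsPredictions.gff.gz")` would return the parsed features, provided the file matches the assumed layout.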
