(Submitter supplied) GNF1M. We identified a non-redundant set of mouse target sequences from the following sources: RefSeq (12,029 sequences), Celera (29,331 sequences), and RIKEN (46,299 sequences). First, all sequences were RepeatMasked to remove repetitive elements. Next, sequence identity between individual sequences was established using pairwise BLAT or BLAST and sim4, and the sequences were grouped by single-linkage clustering. The clustering results were further triaged to produce a final set of 36,182 targets with the highest confidence of computational prediction (biasing toward sequences containing InterPro domains and away from non-coding RNAs).
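The single-linkage clustering step can be illustrated with a minimal sketch. This is not the submitter's pipeline: the function name `single_linkage_clusters`, the 0.95 identity cutoff, and the toy sequence IDs and identity values are all assumptions made for illustration. The sketch assumes pairwise identities have already been computed (for example by BLAT or BLAST) and only shows how any two sequences linked by a chain of sufficiently similar pairs end up in one cluster.

```python
from collections import defaultdict


def single_linkage_clusters(seq_ids, pairwise_hits, min_identity=0.95):
    """Group sequences into single-linkage clusters using union-find.

    pairwise_hits is an iterable of (id_a, id_b, identity) tuples.
    min_identity is a hypothetical cutoff, not a value from the record.
    """
    parent = {s: s for s in seq_ids}

    def find(x):
        # Walk to the cluster root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity in pairwise_hits:
        if identity >= min_identity:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a  # merge the two clusters

    clusters = defaultdict(list)
    for s in seq_ids:
        clusters[find(s)].append(s)
    # Sort members and clusters so the output is deterministic.
    return sorted(sorted(members) for members in clusters.values())


# Toy input: made-up IDs and identities, for illustration only.
ids = ["refseq_1", "celera_7", "riken_3", "riken_9"]
hits = [
    ("refseq_1", "celera_7", 0.99),
    ("celera_7", "riken_3", 0.97),
    ("riken_3", "riken_9", 0.80),  # below the cutoff, so no link
]
clusters = single_linkage_clusters(ids, hits)
print(clusters)
```

In this toy case `refseq_1` and `riken_3` are never compared directly, yet they land in the same cluster because each is linked to `celera_7`. That chaining is the defining property of single linkage, and it is one reason a later triage step is useful for picking one representative target per cluster.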
- Organism: Mus musculus
- 2 DataSets
- 9 Series
- 310 Samples
Download data: CDF, CIF, GIN, PROBE, PSI, SIF, TAB