Sample GSM514985
Status Public on May 20, 2010
Title Colon and lung cancer pool
Sample type SRA
 
Source name Colon neoplasia and lung neoplasia
Organism Homo sapiens
Characteristics tissue: colon tumors, lung tumors
Extracted molecule total RNA
Extraction protocol Total RNA was extracted twice from each of 23 human formalin-fixed paraffin-embedded (FFPE) samples derived from cancerous tissue. RNA was isolated from ten 10-µm-thick tissue sections using the miRdicator™ extraction protocol developed at Rosetta Genomics. Briefly, the sample was incubated repeatedly in xylene at 57°C to remove excess paraffin, followed by washing in ethanol. Proteins were degraded by incubation in proteinase K solution at 45°C for a few hours. The RNA was extracted with acid phenol:chloroform, followed by ethanol precipitation and DNase digestion. Total RNA quantity and quality were checked by spectrophotometry (Nanodrop ND-1000). Pools of samples of the small RNA fraction within the total RNA were labeled and hybridized on arrays. After ensuring the presence and expression of more than 100 miRNAs per cancerous tissue pool, tissues were pooled together, resulting in a bladder+breast tumor pool and a colon+lung tumor pool. Array expression revealed the presence of 157 miRNAs from bladder cancer FFPEs, 260 miRNAs from breast cancer FFPEs, 135 miRNAs from lung cancer FFPEs, and 239 miRNAs from colon cancer FFPEs. Total RNA (75 µg) of seven different colon cancer FFPEs was pooled together with 75 µg of six different lung cancer FFPEs, while 75 µg total RNA of five different bladder cancer FFPEs was pooled together with 75 µg of five different breast cancer FFPEs.
 
Library strategy RNA-Seq
Library source transcriptomic
Library selection size fractionation
Instrument model 454 GS FLX
 
Description Pool2 consists of samples:
4143-Colon (adenocarcinoma)
4119-Colon (adenocarcinoma)
4120-Colon (adenocarcinoma)
4144-Colon (adenocarcinoma)
4145-Colon (adenocarcinoma)
4121-Colon (adenocarcinoma)
4122-Colon (adenocarcinoma)
4146-Colon (adenocarcinoma)
4147-Colon (adenocarcinoma)
4123-Colon (adenocarcinoma)
4124-Colon (adenocarcinoma)
4148-Colon (adenocarcinoma)
4149-Colon (adenocarcinoma)
4125-Colon (adenocarcinoma)
4130-Lung (non-small; adenocarcinoma)
4154-Lung (non-small; adenocarcinoma)
4155-Lung (neuroendocrine; small)
4131-Lung (neuroendocrine; small)
4132-Lung (non-small; large)
4156-Lung (non-small; large)
4157-Lung (non-small; squamous)
4133-Lung (non-small; squamous)
4134-Lung (non-small; squamous)
4158-Lung (non-small; squamous)
4160-Lung (non-small; squamous)
4136-Lung (non-small; squamous)
Data processing Adaptors were removed using a Perl script allowing internal polyN sequences within the adaptors and 1 mismatch. About 1000 sequences were removed because they were too short after adaptor removal (<10 bp). The sequences were mapped to the human genome (UCSC hg18 build) using BLAST, allowing at most three mismatched bp and at most three bp of insertions/deletions (indels). For each aligned sequence, the highest-scoring hit was retrieved. All sequences with overlapping positions were clustered together using a Perl script. Each genomic cluster of sequences was assigned the most abundant sequence in the cluster, and for candidate miRNAs the most abundant sequence was required to map exactly to the genome (no mismatches or indels). The next step was to annotate known sequences. For this task, the RNA genes, sno/miRNA, RefSeq genes, and RepeatMasker tables were downloaded from the UCSC Table Browser (Karolchik, 2004), and known miRNA precursors were downloaded from miRBase, in order to mark whether the sequence is part of a noncoding gene, a snoRNA, a protein-coding gene exon, a genomic repeat, or a known miRNA precursor, respectively. The sequences of the novel miRNA candidates were extended by several hundred bp within their chromosomes in order to predict possible miRNA precursors. The extended sequence was used to predict the folding of a pri-miRNA that contains a hairpin-folded pre-miRNA. The candidate pri-miRNAs were folded using the Vienna package or mfold. All hairpin structures that had at least six base pairs, were at least 55 nucleotides long, and had a loop no longer than 20 nucleotides were extracted from the minimum free energy fold of the predicted pri-miRNA (excluding overlapping hairpins). Each hairpin was assigned a Palgrade and conservation score as described previously (Bentwich, 2005).
Predicted miRNA precursors have either Palgrade>0 (meaning the hairpin has structural characteristics of known miRNAs) or an absolute conservation score >0.9 (conserved in mammals). In addition, only sequences with ten or fewer genomic copies, a length of 17-25 bp, and a GC content within the known miRNA range (15-90%) were chosen as miRNA candidates.
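
Two of the steps above, clustering reads with overlapping positions and applying the final candidate filter, can be illustrated with a minimal Python sketch. This is not the submitters' Perl code. The `Hit` record, its field names, the inclusive coordinates, and all function names are assumptions introduced here for illustration. Only the thresholds (length 17-25 bp, GC content 15-90%, ten or fewer genomic copies) come from the record.

```python
from collections import Counter, namedtuple

# Hypothetical record for one aligned read (field names are assumptions).
# start/end are assumed to be inclusive genomic coordinates; count is the
# number of times the sequence was observed.
Hit = namedtuple("Hit", "chrom start end seq count")


def cluster_overlapping(hits):
    """Group hits on the same chromosome whose positions overlap."""
    clusters = []
    for hit in sorted(hits, key=lambda h: (h.chrom, h.start, h.end)):
        last = clusters[-1] if clusters else None
        if last and last["chrom"] == hit.chrom and hit.start <= last["end"]:
            # Overlaps the current cluster: extend it.
            last["end"] = max(last["end"], hit.end)
            last["hits"].append(hit)
        else:
            clusters.append({"chrom": hit.chrom, "start": hit.start,
                             "end": hit.end, "hits": [hit]})
    return clusters


def representative(cluster):
    """Return the most abundant sequence in a cluster (ties are arbitrary)."""
    totals = Counter()
    for hit in cluster["hits"]:
        totals[hit.seq] += hit.count
    return totals.most_common(1)[0][0]


def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def passes_candidate_filter(seq, genomic_copies):
    """Apply the length, GC-content, and copy-number thresholds."""
    return (17 <= len(seq) <= 25
            and 0.15 <= gc_fraction(seq) <= 0.90
            and genomic_copies <= 10)
```

For example, `passes_candidate_filter("ACGTACGTACGTACGTACGTA", 3)` accepts a 21-nt sequence with three genomic copies, while the same sequence with 11 copies is rejected. The Palgrade and conservation-score criteria are not modeled here because the record does not define how they are computed.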
 
Submission date Feb 23, 2010
Last update date Jun 11, 2013
Contact name Einat Sitbon
Organization name Rosetta Genomics
Street address 10 Plaut St
City Rehovot
ZIP/Postal code 76706
Country Israel
 
Platform ID GPL9186
Series (2)
GSE20417 Discovery of microRNAs and other small RNAs in solid human tumors: sequencing
GSE20418 Discovery of microRNAs and other small RNAs in solid human tumors
Relations
BioSample SAMN02196191

Supplementary data files not provided
Processed data are available on Series record
