Series GSE34740
Status Public on Nov 14, 2012
Title K562 polyA RNA-Seq
Organism Homo sapiens
Experiment type Expression profiling by high throughput sequencing
Summary RNA-Seq reads and TopHat (Trapnell et al. Bioinformatics 2009) alignments of the K562 cell-line transcriptome. These were used to validate the expression of short peptides identified by mass spectrometry in K562 cells.
 
Overall design K562 polyA+ RNA (batch 1) and total RNA (batch 2) were purchased from Ambion. We used oligo(dT)-selected polyA+ RNA to construct libraries for RNA-Seq. We then profiled the polyadenylated transcriptome by RNA-Seq on Illumina sequencing platforms and used the sequenced reads to reconstruct the transcriptome with the Cufflinks de novo assembler (Trapnell et al. Nat. Biotechnol. 2010).
Recent computational and ribosome profiling analyses suggest that many short open reading frames (sORFs) in eukaryotic genomes are translated. However, evidence that these sORFs produce stable polypeptides is lacking. Here we develop a strategy to discover and validate novel sORF-encoded polypeptides (SEPs) in human cells. In total, we detect 117 SEPs, 114 of which are novel, varying in length from 15 to 149 amino acids. Of these, 10 SEPs (0.5%) are derived from long intergenic non-coding RNAs (lincRNAs). We also observe the presence of polycistronic genes and the widespread use of non-AUG start codons, a phenomenon historically thought to be rare in mammalian genomes. Quantitative measurements reveal that SEPs can be found at concentrations of roughly 10 to 2,000 copies per cell, which is within the range of typical cellular proteins. We confirm the translation of a number of these SEPs through heterologous expression of their encoding cDNAs. We also discover that several SEPs possess properties characteristic of functional proteins. These results demonstrate that human sORFs produce numerous stable polypeptides, revealing that the human proteome is larger and more diverse than previously appreciated.
 
Contributor(s) Cabili MN, Levin JZ, Mitchell A, Slavoff S, Schwaid A, Rinn JL, Saghatelian A
Citation(s) 23160002
Submission date Dec 27, 2011
Last update date May 15, 2019
Contact name Nataly Moran Cabili
E-mail(s) nmcabili@fas.harvard.edu
Organization name Broad Institute ; Harvard University
Street address 7 Cambridge Center
City Cambridge
State/province MA
ZIP/Postal code 02142
Country USA
 
Platforms (2)
GPL9115 Illumina Genome Analyzer II (Homo sapiens)
GPL11154 Illumina HiSeq 2000 (Homo sapiens)
Samples (9)
GSM854403 K-562 std gel sized
GSM854404 K-562 truseq 10000ng
GSM854405 K-562 truseq 3000ng
Relations
SRA SRP010061
BioProject PRJNA150405

Supplementary file                       Size     File type/resource
GSE34740_K562_align_b_1_GSM854404.bam    3.5 Gb   BAM
GSE34740_K562_align_b_2_to_9.bam         5.9 Gb   BAM
Raw data are available in SRA
Processed data are available on Series record
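
The supplementary BAM files listed on this record are normally retrieved from the GEO file server. The following is a minimal sketch of how the download locations could be built from the accession, assuming GEO's usual series directory layout in which the last three digits of the accession number are replaced by "nnn". The base URL, the directory layout, and the function names are assumptions for illustration and are not stated on this record; check the resulting URLs before relying on them.

```python
import re

# Assumed base URL of the GEO series tree (not stated on this record).
GEO_SERIES_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_suppl_url(accession: str) -> str:
    """Build the assumed supplementary-file directory URL for a GSE accession."""
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    digits = match.group(1)
    # Assumed layout: drop the last three digits and append "nnn",
    # e.g. GSE34740 is grouped under GSE34nnn.
    stem = "GSE" + digits[:-3] + "nnn"
    return f"{GEO_SERIES_ROOT}/{stem}/{accession}/suppl/"


def supplementary_file_urls(accession: str, filenames: list[str]) -> list[str]:
    """Join each supplementary file name onto the series directory URL."""
    base = geo_series_suppl_url(accession)
    return [base + name for name in filenames]


if __name__ == "__main__":
    # File names as listed in the supplementary file table of this record.
    files = [
        "GSE34740_K562_align_b_1_GSM854404.bam",
        "GSE34740_K562_align_b_2_to_9.bam",
    ]
    for url in supplementary_file_urls("GSE34740", files):
        print(url)
```

The sketch only constructs URLs and does not download anything, which seems prudent given that the two files are 3.5 Gb and 5.9 Gb.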