|
Status |
Public on Nov 14, 2012 |
Title |
K562 polyA RNA-Seq |
Organism |
Homo sapiens |
Experiment type |
Expression profiling by high throughput sequencing
|
Summary |
RNA-Seq reads and TopHat (Trapnell et al. Bioinformatics 2009) alignments of the K562 cell-line transcriptome. These were used to validate the expression of short peptides identified by mass spectrometry in K562 cells.
|
|
|
Overall design |
K562 polyA+ RNA (batch 1) and total RNA (batch 2) were purchased from Ambion. We used oligo(dT)-selected polyA+ RNA to construct libraries for RNA-Seq. We then profiled the polyadenylated transcriptome using Illumina sequencing platforms and used the sequenced reads to reconstruct the transcriptome with the Cufflinks de-novo assembler (Trapnell et al. Nat. Biotechnol. 2010). Recent computational and ribosome profiling analyses suggest that many short open reading frames (sORFs) in eukaryotic genomes are translated. However, evidence that these sORFs produce stable polypeptides is lacking. Here we develop a strategy to discover and validate novel sORF-encoded polypeptides (SEPs) in human cells. In total, we detect 117 SEPs, 114 of which are novel, varying in length from 15 to 149 amino acids. Of these, 10 SEPs (0.5%) are derived from long intergenic non-coding RNAs (lincRNAs). We also observe polycistronic genes and the widespread use of non-AUG start codons, a phenomenon historically thought to be rare in the mammalian genome. Quantitative measurements reveal that SEPs are present at concentrations of ~10-2000 copies per cell, which is within the range of typical cellular proteins. We confirm the translation of a number of these SEPs through heterologous expression of their encoding cDNAs. We also discover that several SEPs possess properties characteristic of functional proteins. These results demonstrate that human sORFs produce numerous stable polypeptides, revealing that the human proteome is larger and more diverse than previously appreciated.
|
|
|
Contributor(s) |
Cabili MN, Levin JZ, Mitchell A, Slavoff S, Schwaid A, Rinn JL, Saghatelian A |
Citation(s) |
23160002 |
|
Submission date |
Dec 27, 2011 |
Last update date |
May 15, 2019 |
Contact name |
Nataly Moran Cabili |
E-mail(s) |
nmcabili@fas.harvard.edu
|
Organization name |
Broad Institute; Harvard University
|
Street address |
7 Cambridge Center
|
City |
Cambridge |
State/province |
MA |
ZIP/Postal code |
02142 |
Country |
USA |
|
|
Platforms (2) |
GPL9115 |
Illumina Genome Analyzer II (Homo sapiens) |
GPL11154 |
Illumina HiSeq 2000 (Homo sapiens) |
|
Samples (9)
|
|
Relations |
SRA |
SRP010061 |
BioProject |
PRJNA150405 |
Supplementary file | Size | Download | File type/resource |
GSE34740_K562_align_b_1_GSM854404.bam | 3.5 Gb | (ftp)(http) | BAM |
GSE34740_K562_align_b_2_to_9.bam | 5.9 Gb | (ftp)(http) | BAM |
SRA Run Selector |
Raw data are available in SRA |
Processed data are available on Series record |
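
The download links for the two BAM files are not preserved in this record. As a minimal sketch, the following Python builds candidate download URLs from the file names listed above. It assumes the usual GEO FTP layout (series grouped into `GSE<prefix>nnn` directories, with supplementary files under `suppl/`); the root URL and the helper names `geo_series_dir` and `supplementary_url` are illustrative assumptions, not part of this record, so check the resulting URLs against the Series page before relying on them.

```python
import re

# Assumption: root of the public GEO series tree; not stated in this record.
GEO_FTP_ROOT = "https://ftp.ncbi.nlm.nih.gov/geo/series"


def geo_series_dir(accession: str) -> str:
    """Return the assumed directory URL for a GEO series accession."""
    match = re.fullmatch(r"GSE(\d+)", accession)
    if match is None:
        raise ValueError(f"not a GEO series accession: {accession!r}")
    digits = match.group(1)
    # Series are grouped by replacing the last three digits with 'nnn',
    # e.g. GSE34740 -> GSE34nnn; accessions below 1000 fall under GSEnnn.
    stub = ("GSE" + digits[:-3] + "nnn") if len(digits) > 3 else "GSEnnn"
    return f"{GEO_FTP_ROOT}/{stub}/{accession}"


def supplementary_url(filename: str) -> str:
    """Return the assumed URL of a supplementary file named GSE<id>_<rest>."""
    accession = filename.split("_", 1)[0]
    return f"{geo_series_dir(accession)}/suppl/{filename}"


if __name__ == "__main__":
    # File names copied from the supplementary file table of this record.
    for name in (
        "GSE34740_K562_align_b_1_GSM854404.bam",
        "GSE34740_K562_align_b_2_to_9.bam",
    ):
        print(supplementary_url(name))
```

Both files are several gigabytes, so a resumable downloader is a safer choice than a single blocking request.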