GEO Accession viewer

Sample GSM3223669

Status

Public on Feb 13, 2020

Title

XY_e7GW_p1_EH975

Sample type

SRA

Source name

Testis

Organism

Homo sapiens

Characteristics

sex: Male
tissue: Testis
age: 7GW+1d

Extracted molecule

total RNA

Extraction protocol

Total RNA was extracted from tissues using the RNeasy mini Kit (Qiagen), quantified using a NanoDrop™ 8000 spectrophotometer (Thermo Scientific) and quality-controlled using a 2100 Electrophoresis Bioanalyzer (Agilent).
Libraries of template molecules suitable for strand-specific high-throughput DNA sequencing were created using the “TruSeq Stranded Total RNA with Ribo-Zero Gold Prep Kit” (catalog # RS-122-2301, Illumina Inc.). Briefly, cytoplasmic and mitochondrial ribosomal RNA (rRNA) was removed from 500 ng of total RNA using biotinylated, target-specific oligos combined with Ribo-Zero rRNA removal beads. Following purification, the RNA was fragmented using divalent cations under elevated temperature. The cleaved RNA fragments were copied into first-strand cDNA using reverse transcriptase and random primers, followed by second-strand cDNA synthesis using DNA Polymerase I and RNase H. The double-stranded cDNA fragments were blunted using T4 DNA polymerase, Klenow DNA polymerase and T4 PNK. A single ‘A’ nucleotide was added to the 3’ ends of the blunt DNA fragments using a Klenow fragment (3' to 5' exo minus) enzyme. The cDNA fragments were ligated to double-stranded adapters using T4 DNA Ligase. The ligated products were enriched by PCR amplification (30 sec at 98°C; [10 sec at 98°C, 30 sec at 60°C, 30 sec at 72°C] x 12 cycles; 5 min at 72°C). Excess PCR primers were removed by purification using AMPure XP beads (Agencourt Biosciences Corporation). Final cDNA libraries were quality-checked and quantified using a 2100 Electrophoresis Bioanalyzer (Agilent). The libraries were loaded onto the flow cell at a concentration of 7 pM; clusters were generated on the cBot and sequenced on the Illumina HiSeq 2500 as paired-end 2x50 base reads following Illumina's instructions. Image analysis and base calling were performed using RTA 1.17.20 and CASAVA 1.8.2.

Library strategy

RNA-Seq

Library source

transcriptomic

Library selection

cDNA

Instrument model

Illumina HiSeq 2500

Description

Whole organ

Data processing

Base calling. Image analysis and base calling were performed using RTA 1.17.20 and CASAVA 1.8.2.
Assembly of a unique set of human reference transcripts. Ensembl (Cunningham, et al., 2015) and RefSeq (Brown, et al., 2015; Pruitt, et al., 2014) transcript annotations of the hg19 release of the human genome were downloaded from the University of California Santa Cruz (UCSC) genome browser website (Rosenbloom, et al., 2015) on June 30, 2015. Both transcript annotation files (GTF format) were then merged into a combined set of non-redundant human reference transcripts (HRT) using Cuffcompare (Pollier, et al., 2013). We also defined a non-redundant dataset of human splice junctions (HSJ) extracted from alignments of human transcripts and expressed sequence tags (ESTs) provided by UCSC.
Mapping reads. RNA-seq-derived reads from each sample replicate were aligned independently to the hg19 release of the human genome sequence with TopHat (version 2.0.10) (Trapnell, et al., 2009) using previously published approaches (Chalmel, et al., 2014; Pauli, et al., 2012; Trapnell, et al., 2012; Zimmermann, et al., 2015). Briefly, TopHat was first run on each RNA-seq fastq file using the HRT and HSJ datasets to improve read mapping. The junction outputs produced by all TopHat runs were pooled and added to the HSJ dataset. TopHat was then rerun for each sample using the new HSJ dataset. The output of this second run comprised the final alignment (BAM format). Finally, BAM files corresponding to the same developmental time point of testis or ovary were merged and sorted with the samtools suite (Li, et al., 2009).
Transcriptome assembly and quantification. The transcriptome of each gonad was then assembled, compared to known transcript annotation and quantified with the Cufflinks suite using default settings (Pollier, et al., 2013). Briefly, the assembly step, performed by Cufflinks on the merged alignment files, yielded a set of ~93,000-192,000 transcript fragments (transfrags) for each gonad. The Cuffcompare program was then used: (i) to define a non-redundant set of 180,242 assembled transcripts by tracking Cufflinks transfrags across all experiments; and (ii) to compare the resulting transcripts to the HRT dataset (i.e. known transcript annotation). Finally, the abundance of each transcript in each experiment was assessed using Cuffdiff in fragments per kilobase of exon model per million reads mapped (FPKM). Abundance values were normalized using Cuffnorm to reduce systematic effects and to allow direct comparison between the individual samples.
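For readers unfamiliar with the FPKM unit used above, the following minimal Python sketch shows the textbook definition. It is an illustration only: the function name and arguments are hypothetical, and Cufflinks itself estimates abundances with a statistical model rather than this direct ratio.

```python
def fpkm(fragments: int, exon_length_nt: int, total_mapped_fragments: int) -> float:
    """Textbook FPKM: fragments per kilobase of exon model per million mapped fragments.

    Hypothetical helper for illustration; not the estimator used by Cufflinks.
    """
    if exon_length_nt <= 0 or total_mapped_fragments <= 0:
        raise ValueError("exon length and total mapped fragments must be positive")
    # Dividing by (length / 1e3) and by (total / 1e6) is a single factor of 1e9.
    return fragments * 1e9 / (exon_length_nt * total_mapped_fragments)


# Example: 500 fragments on a 2,000 nt transcript in a library of 25 million mapped fragments.
example_value = fpkm(500, 2000, 25_000_000)
```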
Refinement of assembled transcripts. As suggested previously (Chalmel, et al., 2014; Prensner, et al., 2011), we sequentially applied four filtering steps to eliminate poor-quality quantifications and distinguish the most robust transfrags from background signal. First, we selected 60,437 “detectable” or “expressed” transfrags, defined as those for which abundance levels were above 1 FPKM in at least one sample. Second, we selected 60,136 transcripts with a cumulative exon length ≥200 nt. Third, all transfrags that were not automatically annotated by Cuffcompare as complete match (Cuffcompare class “=”), potentially novel isoform (“j”), unknown intronic (“i”, i.e. loci falling entirely within a reference intron and without exon-exon overlap with another known locus), intergenic (“u”) or antisense (“x”) isoforms were discarded. Finally, all transcript fragments that were annotated as either novel isoforms or novel genes (class codes “j”, “i”, “u” or “x”) and that did not harbor at least two exons (multi-exonic) were filtered out. Together, this strategy produced a high-confidence set of 35,194 transcripts fulfilling these refinement conditions and supporting total RNA molecules expressed during testis or ovary development.
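The four refinement filters described above can be sketched as a single predicate. This is a minimal Python illustration under stated assumptions, not the submitters' actual code: the `Transfrag` record, its field names and the `passes_refinement` function are hypothetical, and only the thresholds and Cuffcompare class codes are taken from the text.

```python
from dataclasses import dataclass
from typing import Sequence

# Class codes from the text: "=" complete match; "j", "i", "u", "x" novel isoforms or genes.
NOVEL_CODES = {"j", "i", "u", "x"}
KEPT_CODES = {"="} | NOVEL_CODES


@dataclass
class Transfrag:
    """Hypothetical record for one assembled transcript fragment."""
    transcript_id: str
    class_code: str                  # Cuffcompare class code
    exon_lengths: Sequence[int]      # length in nt of each exon
    fpkm_per_sample: Sequence[float]


def passes_refinement(t: Transfrag, min_fpkm: float = 1.0, min_length: int = 200) -> bool:
    # Step 1: expressed above 1 FPKM in at least one sample.
    if not any(v > min_fpkm for v in t.fpkm_per_sample):
        return False
    # Step 2: cumulative exon length of at least 200 nt.
    if sum(t.exon_lengths) < min_length:
        return False
    # Step 3: keep only the listed Cuffcompare class codes.
    if t.class_code not in KEPT_CODES:
        return False
    # Step 4: novel isoforms or genes must have at least two exons.
    if t.class_code in NOVEL_CODES and len(t.exon_lengths) < 2:
        return False
    return True


candidates = [
    Transfrag("known_single_exon", "=", [450], [0.2, 3.1]),
    Transfrag("novel_single_exon", "u", [450], [0.2, 3.1]),
    Transfrag("novel_multi_exon", "j", [120, 150], [5.0, 0.0]),
    Transfrag("low_expression", "=", [300, 300], [0.5, 1.0]),
]
kept_ids = [t.transcript_id for t in candidates if passes_refinement(t)]
```

In this toy input, the single-exon intergenic transfrag fails step 4 and the last transfrag fails step 1 because it never exceeds 1 FPKM.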
Genome_build: hg19
Supplementary_files_format_and_content: .bedgraph files contain genome-wide read coverage statistics, in bedGraph format, obtained with the "genomeCoverageBed" utility in BEDTools (release 2.12.0); .gtf file contains the reconstructed transcripts obtained with Cufflinks and subsequently classified with Cuffcompare in the Cufflinks suite of tools (release 2.2.1); .txt file contains the FPKM expression levels quantified with Cufflinks for each transfrag and each sample; the columns are tab-separated.

Submission date

Jun 26, 2018

Last update date

Feb 13, 2020

Contact name

Frédéric Chalmel

E-mail(s)

frederic.chalmel@inserm.fr

Organization name

Inserm U1085-Irset

Department

Physiology and physiopathology of the urogenital tract

Street address

9 avenue du Pr. Léon Bernard

City

Rennes

State/province

France

ZIP/Postal code

35000

Country

France

Platform ID

GPL16791

Series (1)

GSE116278

Dynamics of the transcriptional landscape during human gonad development during fetal life

Relations

BioSample

SAMN09495482

Supplementary file	Size	Download	File type/resource
GSM3223669_NNRD15_S2.accepted_hits.bam.bed.gz	272.2 Mb	(ftp)(http)	BED
Raw data are available in SRA
Processed data provided as supplementary file
Processed data are available on Series record