Sample GSM2482948
Status Public on Dec 15, 2017
Title Col x Cvi rep 1_Embryo
Sample type SRA
 
Source name Col x Cvi
Organism Arabidopsis thaliana
Characteristics genotype: Col female x Cvi male
developmental stage: 6 days after pollination
tissue: Embryo
Treatment protocol no treatment
Growth protocol Plants were grown in a greenhouse with 16-hour days at ~21 C. Flowers were emasculated 2 days before pollination. Seeds were dissected at six DAP (approximately torpedo stage of embryogenesis). A. lyrata plants were grown in a growth chamber with 16-hour days at 20 C. Seeds were dissected at ~15 days after pollination, corresponding to the late torpedo/walking stick stage of embryogenesis.
Extracted molecule total RNA
Extraction protocol Seeds from crosses between multiple individuals were hand-dissected, and total embryo, endosperm, or seed coat RNA was isolated using the Ambion RNAqueous Micro Kit.
The small RNA pool was separated from the larger RNA pool by sequential ethanol addition and filtering. The larger RNA pool was collected on filters after the addition of ethanol equivalent to 60% of the sample volume, and the small RNA pool was collected on filters after the addition of ethanol equivalent to 50% of the flowthrough volume from the larger RNA filter binding step. Libraries for Illumina sequencing were constructed using the NEXTflex Small RNA-Seq Kit v2 as directed (Bioo Scientific Corporation, Austin, Texas). 40 base single-end sequencing of sRNA libraries was performed on an Illumina HiSeq 2000 machine.
 
Library strategy ncRNA-Seq
Library source transcriptomic
Library selection size fractionation
Instrument model Illumina HiSeq 2000
 
Description sRNA-seq
Barcode: TGACCA
At_emb_v_endo_24nt_300bp_stats.txt
Data processing Analysis of sRNA libraries began with the trimming of low-quality read ends (fastq_quality_trimmer, -t 20 and -l 25; http://hannonlab.cshl.edu/fastx_toolkit/). Adapters were removed with cutadapt (-O 6 -m 26 --discard-untrimmed).
sRNA libraries prepared using the NEXTflex sRNA-Seq Kit v2 are appended with 4 base randomized barcodes immediately 3' and 5' of the sRNA read itself. These barcodes were used to remove PCR duplicates (any read of the same sRNA sequence with identical flanking barcodes on both ends) before being removed themselves.
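The duplicate-removal step described above can be sketched as follows. This is a minimal illustration, not the authors' script: it assumes each adapter-trimmed read is laid out as a 4-base 5' barcode, the sRNA insert, and a 4-base 3' barcode, and the function name `dedupe_and_strip` is hypothetical.

```python
def dedupe_and_strip(reads, barcode_len=4):
    """Drop PCR duplicates, then strip the flanking random barcodes.

    Assumption: each read is 5' barcode + sRNA insert + 3' barcode, so two
    reads with the same insert and the same barcodes on both ends are
    identical strings and are treated as one molecule.
    """
    seen = set()
    inserts = []
    for read in reads:
        if len(read) <= 2 * barcode_len:
            continue  # too short to contain an insert
        if read in seen:
            continue  # same insert and same flanking barcodes: PCR duplicate
        seen.add(read)
        inserts.append(read[barcode_len:-barcode_len])
    return inserts
```

Reads that share an insert but differ in either barcode are kept as separate molecules, which is what lets the randomized barcodes distinguish abundant sRNAs from amplification artifacts.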
Reads were aligned using Bowtie 1.1.1 with two mismatches allowed (-v 2 --best; Langmead et al., 2009), using a metagenomic approach to reduce mapping bias. In the case of crosses between A. thaliana Col and Ler ecotypes, a metagenome was constructed using the TAIR10 genome and the published Ler genome (Gan et al., Nature, 2011). In the case of crosses between Col and Cvi ecotypes, a metagenome was constructed using a Cvi pseudogenome (1bp indels and SNP substitutions introduced into TAIR10 base genome) and the TAIR10 genome. Following alignment, reads aligning to either parent genome were converted back to TAIR10 coordinates. For A. lyrata, reads were aligned in the same manner to the published MN47 genome (Hu et al., Nat Genet, 2011).
Reads were classified by strain using SNPs (Pignatta et al., eLife, 2014; Klosinska et al., Nat Plants, 2016). If classification at two SNP positions within the same read conflicted, that read was discarded. All reads that overlapped annotated tRNAs, snRNAs, rRNAs, or snoRNAs were removed.
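A rough sketch of SNP-based strain classification with the conflict rule is given below. It assumes ungapped forward-strand alignments, 0-based coordinates, and a hypothetical `snps` dictionary mapping a position to its (Col, Cvi) bases. How bases matching neither allele were handled is not stated in this record, so the sketch simply ignores them.

```python
def classify_read(read_seq, read_start, snps):
    """Assign a read to 'Col' or 'Cvi' from the SNPs it overlaps.

    snps: dict of 0-based position -> (col_base, cvi_base) (hypothetical layout).
    Returns 'Col', 'Cvi', 'discard' (conflicting SNPs), or None (no
    informative SNP overlapped).
    """
    calls = set()
    for offset, base in enumerate(read_seq):
        alleles = snps.get(read_start + offset)
        if alleles is None:
            continue
        col_base, cvi_base = alleles
        if base == col_base:
            calls.add("Col")
        elif base == cvi_base:
            calls.add("Cvi")
        # bases matching neither allele are ignored (assumption)
    if not calls:
        return None
    if len(calls) > 1:
        return "discard"  # two SNPs in the same read disagree
    return calls.pop()
```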
Spike-ins consisting of sRNAs isolated from Candida albicans were added to A. thaliana libraries during preparation. Ultimately, spike-ins were not utilized in normalization, with library size-based normalization used instead. Reads that mapped to the C. albicans genome (either uniquely or in addition to mapping to the Arabidopsis thaliana genome) were removed from further analysis.
A single base read depth value was created for 24nt sRNAs: each base of a read would contribute 1/24 of a read to an overlapping genomic position. Alignment files were run through bedtools genomecov (-d -scale: adjusted to account for individual library size; bedtools v. 2.23.0, bedtools.readthedocs.io/en/latest/index.html). Per-position values for all libraries pooled within a given analysis were averaged.
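The 1/24-per-base weighting can be illustrated with a small sketch. The record does not give the exact scaling formula, so the `scale` parameter here is a stand-in for whatever per-library factor was passed to bedtools genomecov, and `single_base_depth` is a hypothetical name; coordinates are 0-based.

```python
from collections import defaultdict


def single_base_depth(read_starts, read_len=24, scale=1.0):
    """Per-position depth where each base of a read adds scale / read_len.

    A read therefore contributes a total of `scale` spread evenly over the
    read_len positions it covers. `scale` is an assumed per-library factor.
    """
    depth = defaultdict(float)
    for start in read_starts:
        for pos in range(start, start + read_len):
            depth[pos] += scale / read_len
    return dict(depth)
```

With this weighting, summing depth across a region gives the (scaled) number of reads in it rather than the number of covered bases, so values stay comparable to read counts.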
For *_emb_v_endo_24nt_300bp_stats.txt: 24nt sRNAs were binned into 300 base windows with 150 base overlaps, such that every position in the genome was covered by exactly 2 windows. Overlapping was performed using bedtools coverage (bedtools v. 2.23.0, bedtools.readthedocs.io/en/latest/index.html). sRNA window read values, classified as either embryo or endosperm, were used as input for DESeq2, with the sizeFactors command used to provide RPM normalization between libraries (Love et al., 2014). Output values were used to calculate average sRNA levels for all libraries at a given locus (mean log2 total counts) and the ratio of embryo expression to endosperm expression (log2 embryo/endosperm counts).
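The windowing scheme can be sketched as below. This is an illustration under assumptions, not the pipeline itself: intervals are 0-based and half-open, the function names are hypothetical, and a read is counted in every window it overlaps by at least one base. How the authors handled chromosome ends is not described; in this sketch the first 150 bases of a sequence fall in a single window and the last window is truncated.

```python
def sliding_windows(chrom_len, size=300, step=150):
    """Return half-open (start, end) windows of `size` bases every `step` bases."""
    return [(s, min(s + size, chrom_len)) for s in range(0, chrom_len, step)]


def count_overlaps(windows, reads):
    """Count, for each window, the reads overlapping it by at least one base.

    reads: list of half-open (start, end) intervals on the same sequence.
    """
    counts = []
    for w_start, w_end in windows:
        n = sum(1 for r_start, r_end in reads if r_start < w_end and r_end > w_start)
        counts.append(n)
    return counts
```

Because adjacent windows share 150 bases, a single read usually contributes to two windows, and a read spanning a window boundary can contribute to three.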
genome build: For A. thaliana libraries, TAIR10, resequenced Ler-0 genome (http://mtweb.cs.ucl.ac.uk/mus/www/19genomes/fasta/UNMASKED/ler_0.v7.fas), Cvi pseudogenome (TAIR10+Cvi1.fa.gz). For A. lyrata libraries, published MN47 genome (Hu et al., Nat Genet, 2011).
processed data files format and content: singlebasecompilation and singlebase.bed files: bed files detailing every position of the genome, with the fourth column providing a normalized sRNA read coverage value for all embryo, endosperm, or seed coat libraries in the study. For A. thaliana, all libraries of the same tissue type were combined. For A. lyrata, KarxMN47 replicates of the same tissue type were combined, as were MN47xMN47 replicates of the same tissue type. Prefix designations for A. lyrata are K=Kar, M=MN47, E=embryo, N=endosperm, S=seedcoat. Female in cross listed first. For MN47xMN47 endosperm, only replicates MMN2 and MMN3 were combined. MMN1 was not used.
processed data files format and content: *_emb_v_endo_24nt_300bp_stats.txt: tab delimited text file including window position, mean sRNA coverage for all libraries, endosperm libraries, and embryo libraries, the log2 fold change between endosperm and embryo mean values, and associated statistical quantities. At = A. thaliana, KM = A. lyrata Kar female x MN47 male, MM = A. lyrata MN47 female x MN47 male.
processed data files format and content: *_allele_specific_24_300bp_values_*reps.txt: tab delimited text file including chr/scaffold, start, end and allele-specific 24nt sRNA read counts in 300 bp windows across the genome. CLN = A. thaliana Col female x Ler male endosperm, LCN = A. thaliana Ler female x Col male endosperm, CVN = A. thaliana Col female x Cvi male endosperm, VCN = A. thaliana Cvi female x Col male endosperm, KME = A. lyrata Kar female x MN47 male embryo, KMN = A. lyrata Kar female x MN47 male endosperm.
 
Submission date Feb 10, 2017
Last update date May 15, 2019
Contact name Mary Gehring
Organization name Whitehead Institute for Biomedical Research
Street address 9 Cambridge Center
City Cambridge
State/province MA
ZIP/Postal code 02142
Country USA
 
Platform ID GPL13222
Series (2)
GSE94787 A small RNA pathway mediates allelic dosage in endosperm [sRNA-seq]
GSE94792 A small RNA pathway mediates allelic dosage in endosperm
Relations
BioSample SAMN06320729
SRA SRX2551875

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record
