Series GSE94107
Status Public on Feb 10, 2017
Title An RNA-seq dataset for studies of gene expression variation in the MAGIC line resource of Arabidopsis thaliana
Organism Arabidopsis thaliana
Experiment type Expression profiling by high throughput sequencing
Summary To understand the population genetics of structural variants (SVs), and their effects on phenotypes, we developed an approach to mapping SVs, particularly transpositions, segregating in a sequenced population, and which avoids calling SVs directly. The evidence for a potential SV at a locus is indicated by variation in the counts of short-reads that map anomalously to the locus. These SV traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. Association between an SV trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3x) population sequence data from 488 recombinant inbred Arabidopsis genomes, we identified 6,502 segregating SVs. Remarkably, 25% of these were transpositions. Whilst many SVs cannot be delineated precisely, PCR validated 83% of 44 predicted transposition breakpoints. We show that specific SVs may be causative for quantitative trait loci for germination, fungal disease resistance and other phenotypes. Further we show that the phenotypic heritability attributable to sequence anomalies differs from, and in the case of time to germination and bolting, exceeds that due to standard genetic variation. Gene expression within SVs is also more likely to be silenced or dysregulated, as inferred from RNA-seq data collected from a subset of just over 200 of the MAGIC lines. This approach is generally applicable to large populations sequenced at low-coverage, and complements the prevalent strategy of SV discovery in fewer individuals sequenced at high coverage.
 
Overall design 209 samples consisting of different inbred lines from the Multiparent Advanced Generation Inter-Cross (MAGIC) population in the reference plant, Arabidopsis thaliana. For each sample, RNA was collected from the aerial shoot at the 4th true leaf stage, and Illumina mRNA-seq libraries were constructed (a single library was constructed for each line; that is, each MAGIC line is represented by one biological replicate). Using these libraries, which were non-stranded, paired-end 100 bp RNA-seq Illumina reads were generated for each sample and used to quantify gene expression in each MAGIC line. The resulting expression phenotypes are suitable for describing the impacts of genetic variation in the MAGIC line founders on the control of gene expression.
 
Contributor(s) Mott R, Clark RM, Raetsch G, Kahles A, Steffen J, Osborne EJ, Greenhalgh R
Citation(s) 28179367
Submission date Jan 26, 2017
Last update date May 15, 2019
Contact name Richard M Clark
Organization name University of Utah
Department Department of Biology
Lab Clark Laboratory
Street address 257 So. 1400 East, RM 204 SB
City Salt Lake City
State/province Utah
ZIP/Postal code 84112
Country USA
 
Platforms (1)
GPL13222 Illumina HiSeq 2000 (Arabidopsis thaliana)
Samples (209)
GSM2469127 MAGIC502
GSM2469128 MAGIC503
GSM2469129 MAGIC504
Relations
BioProject PRJNA368916
SRA SRP097877

Download family Format
SOFT formatted family file(s) SOFT
MINiML formatted family file(s) MINiML
Series Matrix File(s) TXT

Supplementary file Size Download File type/resource
GSE94107_FPKMs.tsv.gz 40.1 Mb (ftp)(http) TSV
GSE94107_RANKSUM.tsv.gz 23.1 Mb (ftp)(http) TSV
Raw data are available in SRA
Processed data are available on Series record
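
For readers who download one of the supplementary TSV files listed above, the following is a minimal sketch of reading such a file with the Python standard library. The record does not describe the column layout, so the sketch assumes a header row whose first cell labels a gene-identifier column, followed by one numeric column per MAGIC line. The function names `load_expression_table` and `mean_expression` are hypothetical helpers introduced here for illustration only.

```python
import csv
import gzip
from statistics import mean


def _to_float(cell):
    """Convert a cell to float; return None for blanks or non-numeric text such as 'NA'."""
    try:
        return float(cell)
    except ValueError:
        return None


def load_expression_table(path):
    """Read a gzipped TSV into (sample_names, {gene_id: [values]}).

    ASSUMED layout (not confirmed by the GEO record): the first row is a
    header, the first column holds gene identifiers, and every remaining
    column holds expression values for one sample.
    """
    with gzip.open(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        samples = header[1:]
        table = {}
        for row in reader:
            if not row:
                continue  # skip empty lines
            table[row[0]] = [_to_float(cell) for cell in row[1:]]
    return samples, table


def mean_expression(table):
    """Return the mean value per gene, ignoring missing (None) entries."""
    means = {}
    for gene, values in table.items():
        present = [v for v in values if v is not None]
        if present:
            means[gene] = mean(present)
    return means
```

With a local copy of the file, `samples, table = load_expression_table("GSE94107_FPKMs.tsv.gz")` would return the sample labels and a per-gene list of values, provided the assumed layout holds; if the real file is organized differently, the column handling needs adjusting.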
