Status |
Public on Feb 10, 2017 |
Title |
An RNA-seq dataset for studies of gene expression variation in the MAGIC line resource of Arabidopsis thaliana |
Organism |
Arabidopsis thaliana |
Experiment type |
Expression profiling by high throughput sequencing
|
Summary |
To understand the population genetics of structural variants (SVs) and their effects on phenotypes, we developed an approach for mapping SVs, particularly transpositions, that segregate in a sequenced population; the approach avoids calling SVs directly. Evidence for a potential SV at a locus comes from variation in the counts of short reads that map anomalously to the locus. These SV traits are treated as quantitative traits and mapped genetically, analogously to a gene expression study. An association between an SV trait at one locus and genotypes at a distant locus indicates the origin and target of a transposition. Using ultra-low-coverage (0.3x) population sequence data from 488 recombinant inbred Arabidopsis genomes, we identified 6,502 segregating SVs. Remarkably, 25% of these were transpositions. Whilst many SVs cannot be delineated precisely, PCR validated 83% of 44 predicted transposition breakpoints. We show that specific SVs may be causative for quantitative trait loci for germination, fungal disease resistance and other phenotypes. Further, we show that the phenotypic heritability attributable to sequence anomalies differs from, and in the case of time to germination and bolting exceeds, that due to standard genetic variation. Gene expression within SVs is also more likely to be silenced or dysregulated, as inferred from RNA-seq data collected from a subset of just over 200 of the MAGIC lines. This approach is generally applicable to large populations sequenced at low coverage, and complements the prevalent strategy of SV discovery in fewer individuals sequenced at high coverage.
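The idea of treating anomalous read counts as a quantitative trait can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are simulated, genotypes are simplified to a hypothetical 0/1 coding (MAGIC lines actually carry founder haplotypes), and the names `pearson_r`, `scan_trait` and `locus_*` are made up for illustration. The trait is the per-line count of anomalously mapping reads at one locus; scanning it against genotypes elsewhere in the genome points to the locus that best explains it.

```python
import random


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant vector carries no association signal
    return cov / (var_x * var_y) ** 0.5


def scan_trait(trait, genotypes):
    """Return (locus, r) for the locus whose genotypes correlate most strongly with the trait."""
    scores = {locus: pearson_r(alleles, trait) for locus, alleles in genotypes.items()}
    best = max(scores, key=lambda locus: abs(scores[locus]))
    return best, scores[best]


# Simulated data (assumption): 200 inbred lines, 5 unlinked biallelic loci.
rng = random.Random(0)
n_lines = 200
genotypes = {
    f"locus_{i}": [rng.randint(0, 1) for _ in range(n_lines)] for i in range(5)
}

# Simulated SV trait: anomalous read counts at a target locus are elevated
# in lines carrying allele 1 at locus_3 (the hypothetical origin of a transposition).
trait = [rng.randint(0, 2) + 5 * g for g in genotypes["locus_3"]]

best_locus, r = scan_trait(trait, genotypes)
print(best_locus, round(r, 2))
```

A real analysis would use a mapping model suited to the population's founder structure and would correct for multiple testing; the correlation scan here only conveys the logic.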
|
|
|
Overall design |
209 samples consisting of different inbred lines from the Multiparent Advanced Generation InterCross (MAGIC) population in the reference plant, Arabidopsis thaliana. For each sample, RNA was collected from the aerial shoot at the 4th true leaf stage, and Illumina mRNA-seq libraries were constructed (a single library was constructed for each line; that is, each MAGIC line is represented by one biological replicate). Using these libraries, which were non-stranded, paired-end 100 bp RNA-seq Illumina reads were generated for each sample and used to quantify gene expression in each MAGIC line. The resulting expression phenotypes are suitable for describing the impacts of genetic variation in the MAGIC line founders on the control of gene expression.
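The processed expression values are distributed as FPKMs (see the supplementary file GSE94107_FPKMs.tsv.gz). This record does not state which software or settings produced them, so the following is only a sketch of the standard FPKM definition for paired-end data, with a hypothetical function name and made-up numbers: fragments mapped to a gene, scaled per kilobase of gene length and per million mapped fragments in the library.

```python
def fpkm(fragment_count, gene_length_bp, total_mapped_fragments):
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    Standard definition; not necessarily the exact procedure used for this series.
    """
    if gene_length_bp <= 0 or total_mapped_fragments <= 0:
        raise ValueError("gene length and library size must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (fragments per million)
    return fragment_count * 1e9 / (gene_length_bp * total_mapped_fragments)


# Illustrative values only: 500 fragments on a 2,000 bp gene
# in a library of 10 million mapped fragments.
example_value = fpkm(500, 2000, 10_000_000)
print(example_value)
```

Because each MAGIC line has a single library, the FPKM for a gene in a line is a single value of this kind rather than an average over replicates.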
|
|
|
Contributor(s) |
Mott R, Clark RM, Raetsch G, Kahles A, Steffen J, Osborne EJ, Greenhalgh R |
Citation(s) |
28179367 |
|
Submission date |
Jan 26, 2017 |
Last update date |
May 15, 2019 |
Contact name |
Richard M Clark |
Organization name |
University of Utah
|
Department |
Department of Biology
|
Lab |
Clark Laboratory
|
Street address |
257 So. 1400 East, RM 204 SB
|
City |
Salt Lake City |
State/province |
Utah |
ZIP/Postal code |
84112 |
Country |
USA |
|
|
Platforms (1) |
GPL13222 |
Illumina HiSeq 2000 (Arabidopsis thaliana) |
|
Samples (209)
|
|
Relations |
BioProject |
PRJNA368916 |
SRA |
SRP097877 |
Supplementary file | Size | Download | File type/resource |
GSE94107_FPKMs.tsv.gz | 40.1 Mb | (ftp)(http) | TSV |
GSE94107_RANKSUM.tsv.gz | 23.1 Mb | (ftp)(http) | TSV |
SRA Run Selector |
Raw data are available in SRA |
Processed data are available on Series record |