Status |
Public on Jan 19, 2018 |
Title |
Lib 2 viral RNA rep 2 |
Sample type |
SRA |
|
|
Source name |
viral RNA
|
Organism |
Human immunodeficiency virus 1 |
Characteristics |
library: Lib 2; population: viral RNA; replicate: 2
|
Extracted molecule |
total RNA |
Extraction protocol |
RNA was extracted with Trizol. The RNA was reverse transcribed into cDNA, and a specific region was PCR-amplified into dsDNA. This dsDNA was then randomly fragmented before library construction for sequencing on the Illumina HiSeq 2000 in 100 paired-end read mode. Libraries were prepared by ligating preannealed Illumina multiplex adaptors P-GATCGGAAGAGCACACGTCT and ACACTCTTTCCCTACACGACGCTCTTCCGATCT to the blunted and A-tailed dsDNA. The ligated product was then amplified using primer PCR1.0 (AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGCTCTTCCGATCT) and an index-containing primer (CAAGCAGAAGACGGCATACGAGATNNNNNNGTGACTGGAGTTCAGACGTGTGCTCTTCCGATCT), where NNNNNN is replaced with one of the Illumina TruSeq indexes.
|
|
|
Library strategy |
OTHER |
Library source |
transcriptomic |
Library selection |
other |
Instrument model |
Illumina HiSeq 2000 |
|
|
Description |
viral RNA
|
Data processing |
Reads were demultiplexed using novobarcode (Novocraft) with the parameters -F ILMFQ -l 6.
Reads were trimmed using Trim Galore! (Babraham Bioinformatics) with the parameters -q 30 --phred64 --paired.
Trimmed reads were aligned to the HIV-1 genome using novoalign (Novocraft) with the parameters -F STDFQ -o SAM -o SoftClip -r None.
Data processing was performed using a Python script that relies on the Numpy, Pysam and Numba (Continuum Analytics) modules. Briefly, this script parses SAM files containing reads aligned to the reference sequence. It fetches alignment matches by decoding the CIGAR string, and reads are translated into sequence matches or mismatches relative to the reference sequence. It then merges the match patterns derived from aligned paired-end reads, removing redundancy and sequencing errors where overlapping paired reads disagree. Absolute nucleotide occurrences are counted using one of the three modes. The ‘1d’ mode counts the number of A, C, G, U at each position. The ‘2d’ mode is used for inferring the effect of single mutations on protein binding: it counts all non-redundant combinations of two positions (i, j), with i ∈ I, j ∈ J, I = {1…532} and J = I \ {i}, and fills an array containing the number of dinucleotide occurrences for each pair of positions [number of AA, AC, …, TG, TT]. Results are serialized to a binary .npy file. A separate script converts these numpy arrays into text files compatible with statistical tools written in MATLAB (MathWorks).
Genome_build: HIV-1 NL4-3
Supplementary_files_format_and_content: The 1D processed text files list each genome position and the number of A, T, C, and G nucleotides found at that position. The 2D files list non-redundant pairs of genome positions and the number of dinucleotides (AA, AT, AC…) found at these positions.
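The pair-merging and counting steps described above can be illustrated with a minimal plain-Python sketch. This is not the submitters' script (which used Numpy, Pysam and Numba). It assumes each mate has already been decoded from its CIGAR string into a dict mapping a 0-based reference position to the base called there. The function names (merge_pair, count_1d, count_2d), the dict representation, and the choice to discard positions where the two mates disagree are all illustrative assumptions.

```python
from itertools import combinations

BASES = "ACGT"
# All 16 dinucleotides in a fixed order: AA, AC, ..., TG, TT.
DINUCS = [a + b for a in BASES for b in BASES]


def merge_pair(mate1, mate2):
    """Merge two mates ({position: base}) into one non-redundant pattern.

    Assumption: where overlapping mates disagree, the position is
    dropped as a likely sequencing error.
    """
    merged = dict(mate1)
    for pos, base in mate2.items():
        if pos not in merged:
            merged[pos] = base
        elif merged[pos] != base:
            del merged[pos]  # mates disagree at this position
    return merged


def count_1d(fragments, length):
    """'1d' mode: count A, C, G, T at each reference position."""
    counts = [[0] * len(BASES) for _ in range(length)]
    for frag in fragments:
        for pos, base in frag.items():
            if base in BASES:
                counts[pos][BASES.index(base)] += 1
    return counts


def count_2d(fragments):
    """'2d' mode: count dinucleotides for each unordered pair (i, j), i < j."""
    counts = {}
    for frag in fragments:
        covered = sorted(p for p, b in frag.items() if b in BASES)
        for i, j in combinations(covered, 2):
            row = counts.setdefault((i, j), [0] * len(DINUCS))
            row[DINUCS.index(frag[i] + frag[j])] += 1
    return counts


# Toy example: one read pair overlapping at positions 1 and 2,
# disagreeing at position 2, plus one single-mate fragment.
pair = merge_pair({0: "A", 1: "C", 2: "G"}, {1: "C", 2: "T", 3: "A"})
fragments = [pair, {0: "A", 1: "T"}]
one_d = count_1d(fragments, length=4)
two_d = count_2d(fragments)
```

In this toy example, position 2 is removed from the merged pair because the mates disagree there, so it contributes to neither the per-position counts nor the pairwise counts. A dictionary keyed by (i, j) is used here only for readability; a dense array indexed by position pair, as the description above suggests, would be the natural choice for a 532-position region.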
|
|
|
Submission date |
Jan 18, 2018 |
Last update date |
Jan 24, 2018 |
Contact name |
Redmond P Smyth |
E-mail(s) |
r.smyth@ibmc-cnrs.unistra.fr
|
Organization name |
IBMC CNRS
|
Department |
Architecture et Réactivité de L'ARN
|
Lab |
Marquet-Paillart
|
Street address |
15 rue Rene Descartes
|
City |
Strasbourg |
State/province |
Alsace |
ZIP/Postal code |
67000 |
Country |
France |
|
|
Platform ID |
GPL20319 |
Series (1) |
GSE109386 |
In cell Mutational Interference Mapping Experiment (in cell MIME) identifies the 5’ PolyA signal as a dual regulator of HIV-1 genomic RNA production and packaging |
|
Relations |
BioSample |
SAMN08377372 |
SRA |
SRX3586729 |
Supplementary file | Size | Download | File type/resource |
GSM2941969_18_1d.txt.gz | 6.6 Kb | (ftp)(http) | TXT |
GSM2941969_18_2d.txt.gz | 2.8 Mb | (ftp)(http) | TXT |
Raw data are available in SRA |
Processed data provided as supplementary file |