Sample GSM2912623
Status Public on Mar 13, 2018
Title PTEN_PacBio_SMRT_Cell_3
Sample type SRA
 
Source name Synthetic sequence
Organism Escherichia coli
Characteristics cell type: Top10
Treatment protocol Cells harboring variant libraries, prepared as described above, were sorted using a FACSAria III into bins according to the abundance of their expressed, EGFP tagged variant. First, live, single, recombinant cells were selected using forward and side scatter, mCherry and mTagBFP2 signals. Then, a FITC:PE-Texas Red ratiometric parameter in the BD FACSDIVA software was created. A histogram of the FITC:PE-Texas Red ratio was created and gates dividing the library into four equally populated bins based on the ratio were established.
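The "four equally populated bins" above correspond to quartile gates on the FITC:PE-Texas Red ratio. The gates in this record were drawn in the BD FACSDIVA software; the following is only a minimal sketch of the same idea, assuming the per-cell ratios are available as a list of numbers. The function names `quartile_gates` and `assign_bin` and the example ratios are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles

def quartile_gates(ratios):
    """Return three cut points that split `ratios` into four equally populated bins."""
    return quantiles(ratios, n=4)

def assign_bin(ratio, gates):
    """Return the bin number (1 = lowest ratio, 4 = highest) for one cell."""
    return bisect_right(gates, ratio) + 1

# Hypothetical per-cell ratios, for illustration only.
ratios = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
gates = quartile_gates(ratios)
bins = [assign_bin(r, gates) for r in ratios]
```

With evenly spread ratios such as these, each of the four bins receives the same number of cells.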
Growth protocol HEK 293T TetBxb1BFP cells were cultured in Dulbecco’s modified Eagle’s medium (DMEM) supplemented with 10% fetal bovine serum, 100 U/mL penicillin, 0.1 mg/mL streptomycin, and 2 μg/mL doxycycline.
Extracted molecule genomic DNA
Extraction protocol Cells were transferred into a microfuge tube, pelleted and stored at -20 °C. Genomic DNA was prepared using the GentraPrep kit (Qiagen) or DNeasy kit (Qiagen).
For TPMT: for each bin, all the purified DNA was spread over eight 25 μL PCR reactions containing Kapa Robust and the primers GPS-landing-f (in the genome) and BC-GPS-P7-i#-UMI (3’ of the barcode), to tag the barcodes with a unique molecular index (UMI) and add a sample index. UMI-tagging PCRs were performed using the following conditions: initial denaturation at 95 °C for 2 minutes, followed by three cycles of (95 °C for 15 seconds, 60 °C for 20 seconds, 72 °C for 3 minutes). The eight PCR reactions were pooled and the PCR amplicon was purified using 1x AMPure XP (Beckman Coulter). To shorten the amplicon and add the p5 and p7 Illumina cluster-generating sequences, the UMI-tagged barcodes were then amplified with primers BC-TPMT-P5-v2 and Illumina p7. This PCR was performed with Kapa Robust and SYBR green II on a Bio-Rad mini-opticon qPCR machine; reactions were monitored and removed before saturation of the SYBR green II signal, at around 25 cycles. The amplicons were pooled and gel purified.
For PTEN: eight 50 μL first-round PCR reactions were each prepared with a final concentration of ~50 ng/μL input genomic DNA, 1x Kapa HiFi ReadyMix, and 0.25 μM of the KAM499/JJS_501a primers. The reaction conditions were 95 °C for 5 minutes, 98 °C for 20 seconds, 60 °C for 15 seconds, 72 °C for 90 seconds, repeat 7 times, 72 °C for 2 minutes, 4 °C hold. The eight 50 μL reactions were combined, bound to AMPure XP (Beckman Coulter), cleaned, and eluted with 40 μL water. 40% of the eluted volume was mixed with 2x Kapa Robust ReadyMix; JJS_seq_F and one of the indexed reverse primers, JJS_seq_R1a through JJS_seq_R12a, were added at 0.25 μM each. Reaction conditions for the second-round PCR were 95 °C for 3 minutes, 95 °C for 15 seconds, 60 °C for 15 seconds, 72 °C for 30 seconds, repeat 14 times, 72 °C for 1 minute, 4 °C hold. Amplicons were extracted after separation on a 1.5% TBE/agarose gel using a Quantum Prep Freeze ‘N Squeeze DNA Gel Extraction Kit (Bio-Rad).
 
Library strategy OTHER
Library source genomic
Library selection other
Instrument model PacBio RS II
 
Description plasmid DNA
Library and barcode cassette sequencing from DNA
Data processing Libraries were sequenced on NextSeq 500 instruments (Illumina) and base called using the instruments' Real Time Analysis software.
For TPMT, after converting from the bcl to fastq format using Illumina’s bcl2fastq version 2.18, a custom script was used to demultiplex the samples by index and call a consensus barcode from the read1 and read2 sequences. To collapse the barcode copies associated with unique UMIs, the UMI (bases 10-20 of the index read) was pasted onto the consensus barcode and unique combinations were identified (sort | uniq -c). The barcode from each unique barcode-UMI pair was used to populate a fastq file that could be used by the Enrich2 software package to count variants.
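The UMI collapsing step for TPMT can be sketched as follows. This is not the custom script used for this record, only a minimal illustration of counting unique barcode-UMI combinations (the equivalent of `sort | uniq -c`) and then counting each barcode once per unique UMI. The function name `collapse_umis` and the example sequences are hypothetical.

```python
from collections import Counter

def collapse_umis(reads):
    """Collapse duplicate (barcode, UMI) reads.

    reads: iterable of (consensus_barcode, umi) tuples.
    Returns (pair_counts, molecule_counts).
    """
    # Count how often each barcode-UMI combination was read (sort | uniq -c).
    pair_counts = Counter((barcode, umi) for barcode, umi in reads)
    # Each unique pair contributes its barcode exactly once, so repeated
    # reads of the same tagged molecule are not counted more than once.
    molecule_counts = Counter(barcode for barcode, _ in pair_counts)
    return pair_counts, molecule_counts

# Hypothetical 15-base barcodes and 10-base UMIs, for illustration only.
reads = [
    ("ACGTACGTACGTACG", "AAAAAAAAAA"),
    ("ACGTACGTACGTACG", "AAAAAAAAAA"),  # same molecule read twice
    ("ACGTACGTACGTACG", "CCCCCCCCCC"),
    ("TTTTGGGGCCCCAAA", "GGGGGGGGGG"),
]
pair_counts, molecule_counts = collapse_umis(reads)
```

Here the first barcode is counted twice (two distinct UMIs) even though it appears in three reads.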
For PTEN, sequencing reads were converted to fastq format and de-multiplexed with bcl2fastq. Barcode paired sequencing reads for the plasmid library, as well as PTEN VAMP-seq experiments 1 through 4, were joined using the fastq-join tool within the ea-utils package using the default parameters, whereas only one barcode read was collected for PTEN experiments 5 through 8. FASTQ files from these technical replicate amplification and sequencing runs were concatenated for analysis with Enrich2.
The supplied TPMT replicate bin-wise sorting data is the 15-base degenerate barcode followed by a space and the 10-base UMI.
The abundance scores for PTEN and TPMT were calculated as follows: The count for each variant in a bin was divided by the sum of counts recorded in that bin to obtain the frequency of each variant within that bin. This calculation was repeated for every bin in each replicate experiment. The summed counts of each variant in all four bins of an experiment were divided by the summed counts of all variants in all four bins of that experiment to obtain the total frequency value for each variant for each experiment. This total frequency value was used for filtering low-frequency variants out of the subsequent calculations. Next, a weighted average (w_ave) was calculated for each variant, with all weighted average values ranging from 0.25 to 1. Finally, for each experiment, an abundance score for each variant was obtained by subjecting the weighted average of each variant to min-max normalization, using the weighted average value of WT, which was given a score of 1, and the median weighted average value for non-terminal nonsense variants at positions 51 through 349 for PTEN, or positions 51 through 219 for TPMT, which was given an abundance score of 0. The final abundance score for each variant was calculated by taking the mean of the min-max normalized abundance scores across the eight replicate experiments in which it could have been observed. Only variants which were scored in two or more replicate experiments were retained in the analysis. A standard error for each abundance score was calculated by dividing the standard deviation of the min-max normalized values for each variant by the square root of the number of replicate experiments in which it was observed. Lastly, the lower bound of the 95% confidence interval was calculated by multiplying the standard error by the 97.5th percentile value of a normal distribution and subtracting this product from the abundance score. The upper bound of the 95% confidence interval was calculated by instead adding this product to the abundance score.
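The scoring steps above can be sketched in a few functions. This is a hedged illustration, not the submitters' analysis code. Two points are assumptions: the bin weights 0.25, 0.5, 0.75 and 1 are inferred from the stated 0.25 to 1 range of w_ave and are not given in this record, and the sample standard deviation is used because the record does not say which form was applied. All function names and example counts are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev

# ASSUMPTION: weights for bins 1-4, inferred from the stated 0.25-1 range.
BIN_WEIGHTS = (0.25, 0.5, 0.75, 1.0)

def bin_frequencies(counts_by_bin):
    """counts_by_bin: four dicts {variant: count}; returns four dicts {variant: frequency}."""
    freqs = []
    for counts in counts_by_bin:
        total = sum(counts.values())
        freqs.append({v: c / total for v, c in counts.items()})
    return freqs

def weighted_average(freqs, variant):
    """Weighted average of a variant's bin frequencies (w_ave)."""
    f = [fb.get(variant, 0.0) for fb in freqs]
    if sum(f) == 0:
        raise ValueError("variant not observed in any bin")
    return sum(w * x for w, x in zip(BIN_WEIGHTS, f)) / sum(f)

def min_max(w_ave, w_wt, w_nonsense_median):
    """Rescale so the nonsense median maps to 0 and WT maps to 1."""
    return (w_ave - w_nonsense_median) / (w_wt - w_nonsense_median)

def summarize(replicate_scores):
    """Mean score, standard error, and 95% CI bounds across >= 2 replicates."""
    score = mean(replicate_scores)
    # ASSUMPTION: sample standard deviation (n - 1 denominator).
    se = stdev(replicate_scores) / math.sqrt(len(replicate_scores))
    z = NormalDist().inv_cdf(0.975)  # 97.5th percentile of the standard normal
    return score, se, score - z * se, score + z * se

# Hypothetical counts for one experiment, for illustration only.
counts_by_bin = [
    {"WT": 10, "X": 40}, {"WT": 20, "X": 30},
    {"WT": 30, "X": 20}, {"WT": 40, "X": 10},
]
freqs = bin_frequencies(counts_by_bin)
w_wt = weighted_average(freqs, "WT")
w_x = weighted_average(freqs, "X")
score_x = min_max(w_x, w_wt, 0.25)  # 0.25 stands in for the nonsense median
final = summarize([0.4, 0.6])       # two hypothetical replicate scores
```

The total-frequency filter for low-frequency variants is omitted because the record does not state the threshold.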
genome build: NONE
processed data files format and content: The PTEN_barcodeInsertAssignments.tsv.gz and TPMT_barcodeInsertAssignments.tsv.gz processed data files are in gzip-compressed tab-separated text file format, containing the variant-barcode maps determined by subassembly. The TPMT_Barcode_UMI.tar file is a tar archive of gzipped text files containing the 15nt barcode attached to a 10nt unique molecular identifier (separated by a space), for each TPMT sorting bin for each experiment.
 
Submission date Jan 03, 2018
Last update date Mar 13, 2018
Contact name Kenneth Matreyek
E-mail(s) Kenneth.Matreyek@Case.edu
Organization name Case Western Reserve University
Department Pathology
Lab Matreyek
Street address 2103 Cornell Rd
City Cleveland
State/province OH
ZIP/Postal code 44106
Country USA
 
Platform ID GPL24462
Series (1)
GSE108727 Assessment of Variant Abundance by Massively Parallel Sequencing for PTEN and TPMT
Relations
BioSample SAMN08289372
SRA SRX3529255

Supplementary data files not provided
Raw data are available in SRA
Processed data are available on Series record