SRA Toolkit Documentation

Tool: sra-pileup

Usage:
sra-pileup [options] <path/file> [<path/file> ...]
sra-pileup [options] <accession>
Frequently Used Options:
General:
-h | --help Displays ALL options, general usage, and version information.
-V | --version Display the version of the program.
Data formatting:
-e | --seqname Use the original sequence-names (ex: chr1) instead of matching accessions.
--function count Sorted output with per-base counters (name, position, reference base, read depth, # of matches to reference, # of calls for each non-reference base, insertions, deletions, strand %).
--function mismatch Output only lines with a mismatch.
--minmismatch Minimum percent of mismatches used as the cutoff in “--function mismatch” (default is 5).
--function ref List the references used in the file.
Filtering:
-r | --aligned-region Filter by position on genome <name[:from-to]>. Name: the file-specific reference name (ex: "chr1" or "1"). "from" and "to" are 1-based coordinates.
-q | --minmapq <min. mapq> Minimum mapping quality; alignments with a lower mapq are ignored (default=0).
Workflow and piping:
--option-file <file> Give more options and positions from the file.
-o | --outfile <output-file> Output will be written to this file instead of std-out.
Use examples:
sra-pileup -r 1:559140-559160 SRR390728
Produces pileup stats for a given genomic region (-r), chromosome 1, bases 559140-559160 (1:559140-559160).
sra-pileup -o output.txt --gzip --option-file list.txt SRR390728
Writes pileup stats to a file called "output.txt" (-o output.txt) that is gzipped (--gzip), from a list of positions contained in the file "list.txt" (--option-file list.txt). The format of "list.txt" is:
  -r 1:559140-559141
  -r 1:569145-569145
  -r 1:579150-579150
For large numbers of positions, sra-pileup may use a substantial amount of memory, so it is recommended to break your option-file into smaller chunks and run the tool once per chunk from a script.
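The chunking recommended above could be scripted roughly as follows. This is a minimal sketch, assuming a shell with split and mktemp available. The function name run_pileup_chunks and the PILEUP override variable are hypothetical helpers introduced for illustration; they are not part of the SRA Toolkit.

```shell
#!/bin/sh
# Sketch: split an option-file into chunks and run sra-pileup once per chunk,
# appending each chunk's stdout to a single output file.
# Usage: run_pileup_chunks <option-file> <accession> <lines-per-chunk> <output-file>
# PILEUP (hypothetical) overrides the command to run; it defaults to sra-pileup.
run_pileup_chunks() {
    optfile=$1
    acc=$2
    lines=$3
    out=$4

    tmpdir=$(mktemp -d) || return 1
    # Write chunk files named chunk_aa, chunk_ab, ... with at most $lines lines each.
    split -l "$lines" "$optfile" "$tmpdir/chunk_" || { rm -rf "$tmpdir"; return 1; }

    : > "$out"   # start with an empty output file
    for chunk in "$tmpdir"/chunk_*; do
        [ -e "$chunk" ] || continue   # no chunks if the option-file was empty
        "${PILEUP:-sra-pileup}" --option-file "$chunk" "$acc" >> "$out" || {
            rm -rf "$tmpdir"
            return 1
        }
    done
    rm -rf "$tmpdir"
}
```

For example, run_pileup_chunks list.txt SRR390728 1000 output.txt would process list.txt 1000 positions at a time. A suitable chunk size depends on the available memory and is left to experiment.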
sra-pileup -e --function mismatch --minmismatch 1 -r 1:559140-559160 SRR390728
Produces pileup stats using the original sequence names (-e), only for positions with a mismatch (--function mismatch) where at least 1% of reads are mismatched (--minmismatch 1), for a given genomic region (-r), chromosome 1, bases 559140-559160 (1:559140-559160). Output columns are: reference sequence, 1-based position, coverage, number of mismatches.
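To relate the mismatch count to coverage, the four output columns can be post-processed with awk. The sketch below assumes the columns are whitespace-separated and appear in the order listed above, which should be checked against real output. The function name add_mismatch_percent is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: read "--function mismatch" output on stdin and append a fifth column
# holding the mismatch percentage (100 * mismatches / coverage).
# Assumed input columns: reference, position, coverage, mismatches.
add_mismatch_percent() {
    # "$3 + 0 > 0" forces a numeric test, so non-numeric or zero-coverage lines are skipped.
    awk '$3 + 0 > 0 { printf "%s\t%s\t%s\t%s\t%.1f\n", $1, $2, $3, $4, 100 * $4 / $3 }'
}
```

It could be used as: sra-pileup -e --function mismatch --minmismatch 1 -r 1:559140-559160 SRR390728 | add_mismatch_percent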
sra-pileup --function count -r chr10:143076-143080 SRR944105
Produces a per-base count (--function count) for a given genomic region (-r), chromosome 10, bases 143076-143080 (chr10:143076-143080) with output columns: seq-name, genomic-coordinate, reference base, read depth at position, # of reads that match reference, # of calls for each non-reference base, insertions, deletions, % of reads on forward strand.
Possible errors and their solution:
item not found while searching type within alignment module - ReferenceList_Find() failed
The tool is unable to resolve the base position or list of positions. Check for typos, formatting errors, and the path to the option-file. Note that the chromosome name must match the name used in the originally submitted file (1 vs. chr1, for instance). This command will work:
sra-pileup -r 1:559140-559160 -o pilestat.txt SRR390728
But this command will cause an error:
sra-pileup -r chr1:559140-559160 -o pilestat.txt SRR390728
Reference names can be found by running sra-stat and looking for "name=" under <AlignInfo>:
sra-stat -x -q SRR390728
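Picking the reference names out of the sra-stat XML can be automated. This is a minimal sketch, assuming the XML is pretty-printed so that the <AlignInfo> element and its children are on their own lines, and that each name attribute is preceded by a space. That layout is an assumption, not something documented here. The function name ref_names_from_stat_xml is a hypothetical helper.

```shell
#!/bin/sh
# Sketch: read sra-stat XML on stdin and print the value of every name="..."
# attribute found on lines from <AlignInfo> through </AlignInfo>.
ref_names_from_stat_xml() {
    awk '
        /<AlignInfo/ { inside = 1 }
        inside {
            line = $0
            # Extract each name="value" attribute on this line (leading space required).
            while (match(line, / name="[^"]*"/)) {
                print substr(line, RSTART + 7, RLENGTH - 8)
                line = substr(line, RSTART + RLENGTH)
            }
        }
        /<\/AlignInfo>/ { inside = 0 }
    '
}
```

It could be used as: sra-stat -x -q SRR390728 | ref_names_from_stat_xml. The printed names are the values to pass to -r.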
directory not found while opening manager within virtual file system module - failed to open "SRR*"
This error indicates that the .sra file cannot be found. Confirm that the path to the file is correct.
item unsupported while opening within application support module - failed to open "SRR*", it is not a vdb-database
This error indicates that the specified Run was not submitted as aligned data. Only data submitted as aligned (e.g., BAM format) can be analyzed by sra-pileup.
param unknown while parsing argument list within application support module - Unknown argument
This error indicates an unrecognized argument. Check your command line for typos; the problem option is usually indicated in quotes at the end of the error message.