SRA Toolkit Documentation

Tool: vdb-dump

Usage:
vdb-dump [options] <path/file> [<path/file> ...]
vdb-dump [options] <accession>
VDB is the native format for SRA data files. vdb-dump provides rapid, highly configurable output of SRA data and is best suited to tasks such as data reformatting and custom scripting or programming.
Frequently Used Options:
General:
-h | --help Displays ALL options, general usage, and version information.
-V | --version Display the version of the program.
Data formatting:
-I | --row_id_on Print row id.
-N | --colname_off Do not print column-names.
-X | --in_hex Print numbers in hex.
-E | --table_enum Enumerates tables.
-T | --table <table(s)> Dumps only those tables included in a comma-separated list
-o | --column_enum_short Lists available columns in short form (recommended)
-O | --column_enum Lists columns in extended form
--phys Lists physical columns
-C | --columns <column(s)> Dumps only those columns included in a comma-separated list
-x | --exclude Exclude the specified columns
-R | --rows <rows> Dumps the specified rows
-M | --max_length <max_length> Limits line length.
-f | --format <format> Dump format (csv, xml, json, piped, tab, fastq, fasta).
Workflow and piping:
--output-file Write output to this file.
--gzip Compress output using gzip.
--bzip2 Compress output using bzip2.
--output-buffer-size Size of the output buffer (0 = none).
--disable-multithreading Disable multithreading.
--option-file <file> Read more options and parameters from the file.
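The enumeration and selection options above are often combined to look at a run before dumping it in full. The following is a minimal sketch of that workflow, not an official recipe. The function name inspect_run and the VDB_DUMP override variable are hypothetical, and the row range syntax 1-5 passed to -R is an assumption about how row ranges are written.

```shell
# Sketch: inspect an SRA run before dumping it in full.
# VDB_DUMP is a hypothetical override for the binary location;
# it defaults to the vdb-dump found on $PATH.
inspect_run() {
    acc=$1
    tool=${VDB_DUMP:-vdb-dump}
    "$tool" -E "$acc"                       # enumerate the tables
    "$tool" -o "$acc"                       # list available columns, short form
    "$tool" -R 1-5 -C READ,QUALITY "$acc"   # first five rows of two columns (range syntax assumed)
}

# Usage (requires the SRA Toolkit):
#   inspect_run <accession>
```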
Use examples:
vdb-dump -f tab -C READ <accession> | awk '{print ">" "<accession>." NR "\n" $0}' > <accession>.fasta
Uses the core Linux utility awk to produce FASTA-format output, naming each record <accession>.<row number>. Note that this produces a single output file, even for paired-end data.
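For scripting, the same conversion can be wrapped in a small function so the accession is passed as an argument rather than typed into the awk program. This is a sketch only: the function name reads_to_fasta is hypothetical, and it assumes the input has one read per line, as in the command above.

```shell
# Sketch: convert one-read-per-line input (stdin) to FASTA (stdout).
# $1 is the accession used as the record-name prefix.
reads_to_fasta() {
    awk -v acc="$1" '{ print ">" acc "." NR; print $0 }'
}

# Usage (requires the SRA Toolkit):
#   vdb-dump -f tab -C READ <accession> | reads_to_fasta <accession> > <accession>.fasta
```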
vdb-dump -f tab -C QUALITY <accession> | sed 's/,//g' | awk '{print ">" "<accession>." NR "\n" $0}' > <accession>.qual
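Uses the core Linux utilities sed and awk to produce qual-format output: sed strips the commas between quality values and awk adds a header line for each record. Note that this produces a single output file, even for paired-end data.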
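The comma stripping and header generation can also be done in a single awk call. This is a sketch only: the function name quals_to_qual is hypothetical, and it assumes each input line holds the comma-separated quality values of one read, as the sed step above implies.

```shell
# Sketch: convert one-record-per-line quality values (stdin) to qual format (stdout).
# $1 is the accession used as the record-name prefix.
quals_to_qual() {
    awk -v acc="$1" '{ gsub(/,/, ""); print ">" acc "." NR; print $0 }'
}

# Usage (requires the SRA Toolkit):
#   vdb-dump -f tab -C QUALITY <accession> | quals_to_qual <accession> > <accession>.qual
```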
vdb-dump -T REFERENCE -C SEQ_ID -N <accession> | sort -u
Uses the Linux utility 'sort' to de-duplicate the vdb-dump output, producing a list of reference sequence names if the data are reference-compressed.
perl ref-dump.pl <accession>
A separately downloadable Perl script that calls vdb-dump to output reference sequence(s) in FASTA format. Note that vdb-dump must be in the user’s $PATH for this script to work.
Possible errors and their solution:
…failed with curl-error 'CURLE_COULDNT_RESOLVE_HOST'…
The toolkit is attempting to contact or download data from NCBI, but is unable to connect. Please confirm that your computer or server has Internet connectivity.
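In an automated pipeline it can help to tell this connectivity failure apart from other errors, for example to decide whether a retry is worthwhile. This is a sketch only: the function name classify_vdb_error and its "network" and "other" labels are hypothetical, and it assumes the error text has been captured from the tool's standard error.

```shell
# Sketch: classify captured error text read from stdin.
# Prints "network" when the curl host-resolution error is present, "other" otherwise.
classify_vdb_error() {
    if grep -q 'CURLE_COULDNT_RESOLVE_HOST'; then
        echo network
    else
        echo other
    fi
}

# Usage (requires the SRA Toolkit); sends stderr into the pipe and discards stdout:
#   vdb-dump <accession> 2>&1 >/dev/null | classify_vdb_error
```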