NCBI C++ ToolKit
writedb_convert.hpp File Reference

Data conversion tools for CWriteDB and associated code.

#include <objects/seq/seq__.hpp>
#include <objects/blastdb/blastdb__.hpp>
#include <objmgr/bioseq_handle.hpp>
#include <objmgr/seq_vector.hpp>

Functions

 USING_SCOPE (objects)
 Import definitions from the objects namespace.

void WriteDB_StdaaToBinary (const CSeq_inst &si, string &seq)
 Build blast db protein format from Stdaa protein Seq-inst.

void WriteDB_EaaToBinary (const CSeq_inst &si, string &seq)
 Build blast db protein format from Eaa protein Seq-inst.

void WriteDB_IupacaaToBinary (const CSeq_inst &si, string &seq)
 Build blast db protein format from Iupacaa protein Seq-inst.

void WriteDB_Ncbi2naToBinary (const CSeq_inst &si, string &seq)
 Build blast db nucleotide format from Ncbi2na Seq-inst.

void WriteDB_Ncbi4naToBinary (const CSeq_inst &seqinst, string &seq, string &amb)
 Build blast db nucleotide format from Ncbi4na Seq-inst.

void WriteDB_Ncbi4naToBinary (const char *ncbi4na, int byte_length, int base_length, string &seq, string &amb)
 Build binary blast2na + ambig encoding based on ncbi4na input.

void WriteDB_IupacnaToBinary (const CSeq_inst &si, string &seq, string &amb)
 Build blast db nucleotide format from Iupacna Seq-inst.

void s_AppendInt4 (string &outp, int x)
 Append a value to a string as a 4 byte big-endian integer.

void s_WriteInt4 (ostream &str, int x)
 Write a four byte integer to a stream in big-endian format.

void s_WriteInt8LE (ostream &str, Uint8 x)
 Write an eight byte integer to a stream in little-endian format.

void s_WriteInt8BE (ostream &str, Uint8 x)
 Write an eight byte integer to a stream in big-endian format.

void s_WriteString (ostream &str, const string &s)
 Write a length-prefixed string to a stream.
 

Detailed Description

Data conversion tools for CWriteDB and associated code.

Defines classes: CAmbiguousRegion

Implemented for: UNIX, MS-Windows

Definition in file writedb_convert.hpp.

Function Documentation

◆ s_AppendInt4()

void s_AppendInt4 ( string &  outp,
int  x 
)
inline

Append a value to a string as a 4 byte big-endian integer.

Parameters
x     Value to append.
outp  String to modify.

Definition at line 143 of file writedb_convert.hpp.

References buf.

Referenced by CAmbigDataBuilder::GetAmbig(), and CAmbigDataBuilder::x_PackNewAmbig().
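
As an illustration of the byte layout described above, here is a minimal sketch of appending a 4 byte big-endian integer to a string. The names AppendInt4BigEndian and ReadInt4BigEndian are hypothetical stand-ins written for this example; they are not part of the toolkit and may differ from how s_AppendInt4() is actually implemented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical stand-in for s_AppendInt4(): appends x to out as four bytes,
// most significant byte first.
inline void AppendInt4BigEndian(std::string& out, std::int32_t x)
{
    const std::uint32_t u = static_cast<std::uint32_t>(x);
    char buf[4];
    buf[0] = static_cast<char>((u >> 24) & 0xFF);
    buf[1] = static_cast<char>((u >> 16) & 0xFF);
    buf[2] = static_cast<char>((u >> 8) & 0xFF);
    buf[3] = static_cast<char>(u & 0xFF);
    out.append(buf, 4);
}

// Hypothetical inverse, included only to make the layout easy to check:
// reads four big-endian bytes starting at position pos.
inline std::int32_t ReadInt4BigEndian(const std::string& in, std::size_t pos)
{
    assert(pos + 4 <= in.size());
    std::uint32_t u = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        u = (u << 8) | static_cast<unsigned char>(in[pos + i]);
    }
    return static_cast<std::int32_t>(u);
}
```

Appending 0x01020304 with this sketch yields the bytes 01 02 03 04 in that order, regardless of the host byte order, because the value is split with shifts rather than copied from memory.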

◆ s_WriteInt4()

void s_WriteInt4 ( ostream &  str,
int  x 
)
inline

Write a four byte integer to a stream in big-endian format.

Parameters
str  Stream to write to.
x    Integer to write.

Definition at line 157 of file writedb_convert.hpp.

References buf, and str().

Referenced by s_WriteString(), CBinaryListBuilder::Write(), CWriteDB_File::WriteInt4(), and CWriteDB_IndexFile::x_Flush().

◆ s_WriteInt8BE()

void s_WriteInt8BE ( ostream &  str,
Uint8  x 
)
inline

Write an eight byte integer to a stream in big-endian format.

Parameters
str  Stream to write to.
x    Integer to write.

Definition at line 189 of file writedb_convert.hpp.

References buf, and str().

Referenced by CBinaryListBuilder::Write(), and CWriteDB_File::WriteInt8().

◆ s_WriteInt8LE()

void s_WriteInt8LE ( ostream &  str,
Uint8  x 
)
inline

Write an eight byte integer to a stream in little-endian format.

Parameters
str  Stream to write to.
x    Integer to write.

Definition at line 171 of file writedb_convert.hpp.

References buf, and str().

Referenced by CWriteDB_IndexFile::x_Flush().
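
The only difference between s_WriteInt8LE() and s_WriteInt8BE() is the order in which the eight bytes reach the stream. The sketch below shows both orders in one helper; WriteInt8 and EncodeInt8 are hypothetical names used for illustration only and are not toolkit functions.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical helper: writes x as eight bytes, most significant byte first
// when big_endian is true, least significant byte first otherwise.
inline void WriteInt8(std::ostream& os, std::uint64_t x, bool big_endian)
{
    char buf[8];
    for (int i = 0; i < 8; ++i) {
        const int shift = big_endian ? 8 * (7 - i) : 8 * i;
        buf[i] = static_cast<char>((x >> shift) & 0xFF);
    }
    os.write(buf, 8);
}

// Convenience wrapper that returns the encoded bytes as a string.
inline std::string EncodeInt8(std::uint64_t x, bool big_endian)
{
    std::ostringstream os;
    WriteInt8(os, x, big_endian);
    return os.str();
}
```

For the value 0x0102030405060708, the big-endian form starts with byte 01 and ends with 08, while the little-endian form is the exact reverse.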

◆ s_WriteString()

void s_WriteString ( ostream &  str,
const string &  s 
)
inline

Write a length-prefixed string to a stream.

This method writes a string to a stream, prefixing the string with its length, written as a big-endian four byte integer.

Parameters
str  Stream to write to.
s    String to write.

Definition at line 211 of file writedb_convert.hpp.

References s_WriteInt4(), and str().

Referenced by CWriteDB_IndexFile::x_Flush().
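
A minimal sketch of the length-prefixed layout described above: four big-endian length bytes followed by the string data. WriteLengthPrefixed is a hypothetical name for this example, and the size check is an assumption added here, not something the source documents.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for s_WriteString(): writes the length of s as a
// four byte big-endian integer, then the bytes of s.
inline void WriteLengthPrefixed(std::ostream& os, const std::string& s)
{
    // Assumption for this sketch: the length must fit in a signed 32-bit int.
    assert(s.size() <= 0x7FFFFFFFu);
    const std::uint32_t n = static_cast<std::uint32_t>(s.size());
    const char prefix[4] = {
        static_cast<char>((n >> 24) & 0xFF),
        static_cast<char>((n >> 16) & 0xFF),
        static_cast<char>((n >> 8) & 0xFF),
        static_cast<char>(n & 0xFF)
    };
    os.write(prefix, 4);
    os.write(s.data(), static_cast<std::streamsize>(s.size()));
}
```

Writing the string "ACGT" with this sketch produces eight bytes: 00 00 00 04 followed by the four characters. A reader can recover the string by reading the prefix first and then exactly that many bytes.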

◆ USING_SCOPE()

USING_SCOPE ( objects  )

Import definitions from the objects namespace.

◆ WriteDB_EaaToBinary()

void WriteDB_EaaToBinary ( const CSeq_inst &  si,
string &  seq 
)

Build blast db protein format from Eaa protein Seq-inst.

The data is converted and returned in the string.

Parameters
si   Seq-inst containing data in NcbiEaa format. [in]
seq  Sequence in blast db disk format. [out]

Definition at line 539 of file writedb_convert.cpp.

References _ASSERT, CSeqConvert::Convert(), CSeqUtil::e_Ncbieaa, CSeqUtil::e_Ncbistdaa, and si.

Referenced by CWriteDB_Impl::x_CookSequence().

◆ WriteDB_IupacaaToBinary()

void WriteDB_IupacaaToBinary ( const CSeq_inst &  si,
string &  seq 
)

Build blast db protein format from Iupacaa protein Seq-inst.

The data is converted and returned in the string.

Parameters
si   Seq-inst containing data in Iupacaa format. [in]
seq  Sequence in blast db disk format. [out]

Definition at line 554 of file writedb_convert.cpp.

References _ASSERT, CSeqConvert::Convert(), CSeqUtil::e_Iupacaa, CSeqUtil::e_Ncbistdaa, and si.

Referenced by CWriteDB_Impl::x_CookSequence().

◆ WriteDB_IupacnaToBinary()

void WriteDB_IupacnaToBinary ( const CSeq_inst &  si,
string &  seq,
string &  amb 
)

Build blast db nucleotide format from Iupacna Seq-inst.

The data is compressed to ncbi2na, the length remainder is coded into the last byte, and ambiguous region data is produced.

Parameters
si   Seq-inst containing data in Iupacna format. [in]
seq  Sequence in blast db disk format. [out]
amb  Ambiguities in blast db disk format. [out]

Definition at line 590 of file writedb_convert.cpp.

References _ASSERT, CSeqConvert::Convert(), CSeqUtil::e_Iupacna, CSeqUtil::e_Ncbi4na, si, tmp, and WriteDB_Ncbi4naToBinary().

Referenced by CWriteDB_Impl::x_CookSequence().

◆ WriteDB_Ncbi2naToBinary()

void WriteDB_Ncbi2naToBinary ( const CSeq_inst &  si,
string &  seq 
)

Build blast db nucleotide format from Ncbi2na Seq-inst.

The data is already in the correct format and can be copied as-is, but the length remainder must be coded into the last byte. It is not necessary to deal with ambiguities: if there were any, ncbi2na would not be the input format.

Parameters
si   Seq-inst containing data in Ncbi2na format. [in]
seq  Sequence in blast db disk format. [out]

Definition at line 569 of file writedb_convert.cpp.

References _ASSERT, base_length, s_DivideRoundUp(), and si.

Referenced by CWriteDB_Impl::x_CookSequence().
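
The step of coding the length remainder into the last byte can be sketched as follows. This is an illustration under stated assumptions, not the toolkit's implementation: it assumes the remainder (base_length modulo 4) is stored in the low two bits of the final byte, and that an extra byte is appended when the length is a multiple of 4. AppendLengthRemainder is a hypothetical name.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper, not a toolkit function.
// Assumption: the final output byte keeps up to three bases in its high six
// bits and stores (base_length % 4) in its low two bits; when base_length is
// a multiple of 4, an extra byte holding only the remainder (0) is appended.
inline std::string AppendLengthRemainder(const std::string& packed,
                                         int base_length)
{
    assert(base_length >= 0);
    const std::size_t bases = static_cast<std::size_t>(base_length);
    const std::size_t data_bytes = (bases + 3) / 4;  // bytes holding bases
    const std::size_t blast_bytes = bases / 4 + 1;   // bytes in the output
    assert(packed.size() >= data_bytes);

    std::string out(packed, 0, data_bytes);
    out.resize(blast_bytes, '\0');

    const unsigned remainder = static_cast<unsigned>(base_length) & 3u;
    const unsigned last = static_cast<unsigned char>(out[blast_bytes - 1]);
    out[blast_bytes - 1] = static_cast<char>((last & 0xFCu) | remainder);
    return out;
}
```

Under these assumptions, a 5-base sequence occupies two bytes with the value 1 in the low two bits of the second byte, and a 4-base sequence also occupies two bytes, the second being zero.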

◆ WriteDB_Ncbi4naToBinary() [1/2]

void WriteDB_Ncbi4naToBinary ( const char *  ncbi4na,
int  byte_length,
int  base_length,
string &  seq,
string &  amb 
)

Build binary blast2na + ambig encoding based on ncbi4na input.

Build blast db nucleotide format from Ncbi4na data in memory.

For a given sequence in ncbi4na format, the blast database format data is constructed; this consists of ncbi2na format with values in ambiguous locations selected randomly, plus the precise values of the ambiguous regions encoded in a separate string.

Parameters
ncbi4na      Pointer to Ncbi4na format sequence data, which may contain ambiguities. [in]
byte_length  Length of ncbi4na data in bytes. [in]
base_length  Number of letters of valid data. [in]
seq          Sequence in blast db disk format. [out]
amb          Ambiguities in blast db disk format. [out]

Definition at line 444 of file writedb_convert.cpp.

References _ASSERT, base_length, CAmbigDataBuilder::Check(), ctable, CAmbigDataBuilder::GetAmbig(), i, s_BuildNa4ToNa2Table(), and s_DivideRoundUp().

Referenced by WriteDB_IupacnaToBinary(), and WriteDB_Ncbi4naToBinary().
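
The conversion described above can be sketched in simplified form. The example below packs ncbi4na input (two bases per byte, high nibble first, with A=1, C=2, G=4, T=8) into ncbi2na (four bases per byte, A=0, C=1, G=2, T=3) and picks a random compatible base at each ambiguous position. Everything else is an assumption made for illustration: the names Na4At, Na4ToNa2Sketch, and AmbigRun are hypothetical, ambiguous runs are returned as a plain vector rather than in the blast db ambiguity encoding, and the length remainder in the last byte is omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <string>
#include <vector>

// Hypothetical record of one run of identical ambiguity codes. The real
// on-disk ambiguity encoding is not reproduced here.
struct AmbigRun {
    int offset;  // index of the first base in the run
    int length;  // number of bases in the run
    int code;    // 4-bit ncbi4na code shared by the run
};

// Returns the 4-bit ncbi4na code of base i (high nibble holds the first base).
inline int Na4At(const char* ncbi4na, int i)
{
    const unsigned byte = static_cast<unsigned char>(ncbi4na[i / 2]);
    return static_cast<int>((i % 2 == 0) ? (byte >> 4) : (byte & 0x0Fu));
}

// Simplified sketch: fills seq with packed 2-bit codes and amb with the
// ambiguous runs found in the input.
inline void Na4ToNa2Sketch(const char* ncbi4na, int base_length,
                           std::string& seq, std::vector<AmbigRun>& amb,
                           std::mt19937& rng)
{
    assert(base_length >= 0);
    seq.assign(static_cast<std::size_t>((base_length + 3) / 4), '\0');
    amb.clear();

    for (int i = 0; i < base_length; ++i) {
        const int code = Na4At(ncbi4na, i);

        // Collect the 2-bit values compatible with this 4-bit code.
        int candidates[4];
        int n = 0;
        for (int b = 0; b < 4; ++b) {
            if (code & (1 << b)) candidates[n++] = b;
        }
        if (n == 0) {  // code 0 (gap): treat every base as compatible
            for (int b = 0; b < 4; ++b) candidates[n++] = b;
        }

        int na2 = candidates[0];
        if (n > 1) {
            // Ambiguous position: choose randomly and remember the true code.
            na2 = candidates[std::uniform_int_distribution<int>(0, n - 1)(rng)];
            if (!amb.empty() && amb.back().code == code &&
                amb.back().offset + amb.back().length == i) {
                ++amb.back().length;
            } else {
                amb.push_back({i, 1, code});
            }
        }

        const std::size_t idx = static_cast<std::size_t>(i / 4);
        const int shift = 6 - 2 * (i % 4);  // first base in the top two bits
        seq[idx] = static_cast<char>(
            static_cast<unsigned char>(seq[idx]) | (na2 << shift));
    }
}
```

Keeping the exact ambiguity codes in a separate structure is what allows the original sequence to be restored later, since the randomly chosen 2-bit values alone do not identify which positions were ambiguous.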

◆ WriteDB_Ncbi4naToBinary() [2/2]

void WriteDB_Ncbi4naToBinary ( const CSeq_inst &  seqinst,
string &  seq,
string &  amb 
)

Build blast db nucleotide format from Ncbi4na Seq-inst.

The data is compressed to ncbi2na, the length remainder is coded into the last byte, and ambiguous region data is produced.

Parameters
seqinst  Seq-inst containing data in Ncbi4na format. [in]
seq      Sequence in blast db disk format. [out]
amb      Ambiguities in blast db disk format. [out]

Definition at line 520 of file writedb_convert.cpp.

References base_length, CAliasBase< TPrim >::Get(), CSeq_inst_Base::GetLength(), CSeq_data_Base::GetNcbi4na(), CSeq_inst_Base::GetSeq_data(), and WriteDB_Ncbi4naToBinary().

Referenced by CWriteDB_Impl::x_CookSequence().

◆ WriteDB_StdaaToBinary()

void WriteDB_StdaaToBinary ( const CSeq_inst &  si,
string &  seq 
)

Build blast db protein format from Stdaa protein Seq-inst.

No conversion is actually done here, because this is already the correct format for disk. Instead the sequence data is just copied from the Seq-inst to the string.

Parameters
si   Seq-inst containing data in NcbiStdaa format. [in]
seq  Sequence in blast db disk format. [out]

Definition at line 530 of file writedb_convert.cpp.

References _ASSERT, and si.

Referenced by CWriteDB_Impl::x_CookSequence().

Modified on Fri Sep 20 14:58:29 2024 by modify_doxy.py rev. 669887