NCBI C++ ToolKit
CWriteDB_IndexFile Class Reference

This class builds the volume index file (pin or nin).

#include <objtools/blast/seqdb_writer/writedb_files.hpp>

Public Member Functions

 CWriteDB_IndexFile (const string &dbname, bool protein, const string &title, const string &date, int index, Uint8 max_file_size, EBlastDbVersion dbver=eBDB_Version4)
 Constructor.

bool CanFit ()
 Returns true if another sequence can fit into the file.

void AddSequence (int length, unsigned int hdr, unsigned int seq)
 Add a sequence to a protein index file (pin).

void AddSequence (int length, unsigned int hdr, unsigned int seq, unsigned int amb)
 Add a sequence to a nucleotide index file (nin).
 
Public Member Functions inherited from CWriteDB_File
 CWriteDB_File (const string &basename, const string &extension, int index, Uint8 max_file_size, bool always_create)
 Constructor.

void Create ()
 Create and open the file.

unsigned int Write (const CTempString &data)
 Write contents of a string to the file.

unsigned int Write (const char *data, int length)

unsigned int WriteInt4 (int data)
 Write an Int4 (in big-endian order) to the file.

unsigned int WriteInt8 (Int8 data)
 Write an Int8 (in big-endian order) to the file.

unsigned int WriteWithNull (const CTempString &data)
 Write contents of a string to the file, appending a NUL.

void Close ()
 Close the file, flushing any remaining data to disk.

virtual void RenameSingle ()
 Rename this file, omitting the volume index.

virtual void RenameFileIndex (unsigned int num_digits)

const string & GetFilename () const
 Get the current filename for this file.
 
Public Member Functions inherited from CObject
 CObject (void)
 Constructor.

 CObject (const CObject &src)
 Copy constructor.

virtual ~CObject (void)
 Destructor.

CObject & operator= (const CObject &src) THROWS_NONE
 Assignment operator.

bool CanBeDeleted (void) const THROWS_NONE
 Check if object can be deleted.

bool IsAllocatedInPool (void) const THROWS_NONE
 Check if object is allocated in memory pool (not system heap).

bool Referenced (void) const THROWS_NONE
 Check if object is referenced.

bool ReferencedOnlyOnce (void) const THROWS_NONE
 Check if object is referenced only once.

void AddReference (void) const
 Add reference to object.

void RemoveReference (void) const
 Remove reference to object.

void ReleaseReference (void) const
 Remove reference without deleting object.

virtual void DoNotDeleteThisObject (void)
 Mark this object as not allocated in heap – do not delete this object.

virtual void DoDeleteThisObject (void)
 Mark this object as allocated in heap – object can be deleted.

void * operator new (size_t size)
 Define new operator for memory allocation.

void * operator new[] (size_t size)
 Define new[] operator for 'array' memory allocation.

void operator delete (void *ptr)
 Define delete operator for memory deallocation.

void operator delete[] (void *ptr)
 Define delete[] operator for memory deallocation.

void * operator new (size_t size, void *place)
 Define new operator.

void operator delete (void *ptr, void *place)
 Define delete operator.

void * operator new (size_t size, CObjectMemoryPool *place)
 Define new operator using memory pool.

void operator delete (void *ptr, CObjectMemoryPool *place)
 Define delete operator.

virtual void DebugDump (CDebugDumpContext ddc, unsigned int depth) const
 Define method for dumping debug information.
 
Public Member Functions inherited from CDebugDumpable
 CDebugDumpable (void)
 
virtual ~CDebugDumpable (void)
 
void DebugDumpText (ostream &out, const string &bundle, unsigned int depth) const
 
void DebugDumpFormat (CDebugDumpFormatter &ddf, const string &bundle, unsigned int depth) const
 
void DumpToConsole (void) const
 

Private Member Functions

int x_Overhead (const string &T, const string &lmdbName, const string &D)
 Compute index file overhead.

int x_Overhead (const string &T, const string &D)
 Compute index file overhead.

virtual void x_Flush ()
 Flush index data to disk.

const string x_MakeLmdbName ()
 Form name of LMDB database file.
 

Private Attributes

bool m_Protein
 True if this is a protein database.

string m_Title
 Title string for all database volumes.

string m_Date
 Database creation time stamp.

int m_OIDs
 OIDs added to database so far.

int m_Overhead
 Amount of file used by metadata.

Uint8 m_DataSize
 Required space for data once written to disk.

Uint8 m_Letters
 Letters of sequence data accumulated so far.

int m_MaxLength
 Length of longest sequence.

vector< unsigned int > m_Hdr
 Start offset in header file of each OID's headers.

vector< unsigned int > m_Seq
 Offset in sequence file of each OID's sequence data.

vector< unsigned int > m_Amb
 Offset in sequence file of each OID's ambiguity data.

EBlastDbVersion m_Version
 BLASTDB version (4 or 5).
 

Additional Inherited Members

Public Types inherited from CObject
enum EAllocFillMode { eAllocFillNone = 1, eAllocFillZero, eAllocFillPattern }
 Control filling of newly allocated memory.

typedef CObjectCounterLocker TLockerType
 Default locker type for CRef.

typedef atomic< Uint8 > TCounter
 Counter type is CAtomicCounter.

typedef Uint8 TCount
 Alias for value type of counter.
 
Static Public Member Functions inherited from CWriteDB_File
static string MakeShortName (const string &base, int index)
 Construct the short name for a volume.

Static Public Member Functions inherited from CObject
static NCBI_XNCBI_EXPORT void ThrowNullPointerException (void)
 Define method to throw null pointer exception.

static NCBI_XNCBI_EXPORT void ThrowNullPointerException (const type_info &type)

static EAllocFillMode GetAllocFillMode (void)

static void SetAllocFillMode (EAllocFillMode mode)

static void SetAllocFillMode (const string &value)
 Set mode from configuration parameter value.

Static Public Member Functions inherited from CDebugDumpable
static void EnableDebugDump (bool on)
 
Static Public Attributes inherited from CObject
static const TCount eCounterBitsCanBeDeleted = 1 << 0
 Define possible object states.

static const TCount eCounterBitsInPlainHeap = 1 << 1
 Heap signature was found.

static const TCount eCounterBitsPlaceMask
 Mask for 'in heap' state flags.

static const int eCounterStep = 1 << 2
 Skip over the "in heap" bits.

static const TCount eCounterValid = TCount(1) << (sizeof(TCount) * 8 - 2)
 Minimal value for valid objects (reference counter is zero). Must be a single bit value.

static const TCount eCounterStateMask
 Valid object, and object in heap.
 
Protected Types inherited from CWriteDB_File
typedef ofstream TFile
 Underlying 'output file' type used here.

Protected Member Functions inherited from CWriteDB_File
Uint8 x_DefaultByteLimit ()
 The default value for max_file_size.

void x_MakeFileName ()
 Build the filename for this file.

Protected Member Functions inherited from CObject
virtual void DeleteThis (void)
 Virtual method "deleting" this object.
 
Protected Attributes inherited from CWriteDB_File
bool m_Created
 True if the file has already been opened.

string m_Nul
 For convenience, a string containing one NUL character.

string m_BaseName
 Database base name for all files.

string m_Extension
 File extension for this file.

int m_Index
 Volume index.

unsigned int m_Offset
 Stream position.

Uint8 m_MaxFileSize
 Maximum file size in bytes.

bool m_UseIndex
 True if filenames should use volume index.

string m_Fname
 Current filename for output file.

TFile m_RealFile
 Actual stream implementing the output file.
 

Detailed Description

This class builds the volume index file (pin or nin).

Definition at line 210 of file writedb_files.hpp.

Constructor & Destructor Documentation

◆ CWriteDB_IndexFile()

CWriteDB_IndexFile::CWriteDB_IndexFile ( const string &  dbname,
bool  protein,
const string &  title,
const string &  date,
int  index,
Uint8  max_file_size,
EBlastDbVersion  dbver = eBDB_Version4
)

Constructor.

Parameters
dbname  Database base name.
protein  True for protein volumes.
title  Database title string.
date  Timestamp of database construction start.
index  Index of this volume.
max_file_size  Maximum file size in bytes (or zero).

Definition at line 306 of file writedb_files.cpp.

References eBDB_Version5, m_DataSize, m_Hdr, m_Overhead, m_Seq, s_RoundUp(), x_MakeLmdbName(), and x_Overhead().

Member Function Documentation

◆ AddSequence() [1/2]

void CWriteDB_IndexFile::AddSequence ( int  length,
unsigned int  hdr,
unsigned int  seq 
)
inline

Add a sequence to a protein index file (pin).

The index file does not need sequence data, so this method only needs offsets of the data in other files.

Parameters
length  Sequence length in letters.
hdr  Length of binary ASN.1 header data.
seq  Length in bytes of sequence data.

Definition at line 246 of file writedb_files.hpp.

References m_DataSize, m_Hdr, m_Letters, m_MaxLength, m_OIDs, and m_Seq.

Referenced by CWriteDB_Volume::WriteSequence().
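
The bookkeeping implied by the reference list above (m_Hdr, m_Seq, m_OIDs, m_Letters, m_MaxLength) can be pictured with a minimal sketch. The struct and its member names below are hypothetical stand-ins, not the toolkit's class; the sketch assumes the hdr and seq arguments are stored as passed, performs no file I/O, and leaves out the m_DataSize update.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-volume bookkeeping; NOT the toolkit's
// CWriteDB_IndexFile. It only mirrors the members listed under "References".
struct ProteinIndexSketch {
    int oids = 0;                    // sequences recorded so far (cf. m_OIDs)
    std::uint64_t letters = 0;       // total residues seen (cf. m_Letters)
    int max_length = 0;              // longest sequence seen (cf. m_MaxLength)
    std::vector<unsigned int> hdr;   // per-OID header value (cf. m_Hdr)
    std::vector<unsigned int> seq;   // per-OID sequence value (cf. m_Seq)

    // Record one sequence; only sizes and positions are needed, not the data.
    void AddSequence(int length, unsigned int hdr_value, unsigned int seq_value) {
        hdr.push_back(hdr_value);
        seq.push_back(seq_value);
        ++oids;
        letters += static_cast<std::uint64_t>(length);
        if (length > max_length) {
            max_length = length;
        }
    }
};
```

Calling AddSequence twice with lengths 120 and 300 would leave the sketch with two OIDs, 420 letters, and a maximum length of 300.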

◆ AddSequence() [2/2]

void CWriteDB_IndexFile::AddSequence ( int  length,
unsigned int  hdr,
unsigned int  seq,
unsigned int  amb 
)
inline

Add a sequence to a nucleotide index file (nin).

The index file does not need sequence data, so this method only needs offsets of the data in other files.

Parameters
length  Sequence length in letters.
hdr  Length of binary ASN.1 header data.
seq  Length in bytes of packed sequence data.
amb  Length in bytes of packed ambiguity data.

Definition at line 269 of file writedb_files.hpp.

References m_Amb, m_DataSize, m_Hdr, m_Letters, m_MaxLength, m_OIDs, and m_Seq.

◆ CanFit()

bool CWriteDB_IndexFile::CanFit ( )
inline

Returns true if another sequence can fit into the file.

Definition at line 228 of file writedb_files.hpp.

References _ASSERT, m_DataSize, CWriteDB_File::m_MaxFileSize, and m_OIDs.

Referenced by CWriteDB_Volume::WriteSequence().
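
A capacity check of this kind can be sketched from the members referenced above (m_DataSize, m_MaxFileSize, m_OIDs). The function below is a hypothetical illustration: the rule that an empty volume always accepts its first sequence, and the per_entry_bytes reserve, are assumptions rather than the documented behavior.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical capacity check, NOT the toolkit's CanFit().
// Assumptions: an empty volume always accepts one sequence, and each new
// sequence needs per_entry_bytes of additional index space.
bool CanFitSketch(int oids,
                  std::uint64_t data_size,
                  std::uint64_t max_file_size,
                  std::uint64_t per_entry_bytes) {
    if (oids == 0) {
        return true;  // never leave a volume empty
    }
    return data_size + per_entry_bytes < max_file_size;
}
```

A caller such as a volume writer would typically test this before adding a sequence and start a new volume when it returns false.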

◆ x_Flush()

void CWriteDB_IndexFile::x_Flush ( void  )
private virtual

Flush index data to disk.

◆ x_MakeLmdbName()

const string CWriteDB_IndexFile::x_MakeLmdbName ( )
private

Form name of LMDB database file.

Definition at line 439 of file writedb_files.cpp.

References CDirEntry::GetPathSeparator(), CWriteDB_File::m_BaseName, m_Protein, and suffix.

Referenced by CWriteDB_IndexFile(), and x_Flush().

◆ x_Overhead() [1/2]

int CWriteDB_IndexFile::x_Overhead ( const string &  T,
const string &  D
)
private

Compute index file overhead.

This is the space used by all metadata fields of the index file, including padding (version 4 format).

Parameters
T  Title string.
D  Create time string.
Returns
Combined size of all meta-data fields in nin/pin file.

Definition at line 355 of file writedb_files.cpp.

References int, and T.

◆ x_Overhead() [2/2]

int CWriteDB_IndexFile::x_Overhead ( const string &  T,
const string &  lmdbName,
const string &  D
)
private

Compute index file overhead.

This is the space used by all metadata fields of the index file, including padding (version 5 format).

Parameters
T  Title string.
lmdbName  LMDB file name string.
D  Create time string.
Returns
Combined size of all meta-data fields in nin/pin file.

Definition at line 347 of file writedb_files.cpp.

References int, and T.

Referenced by CWriteDB_IndexFile(), and x_Flush().
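
To make "overhead including padding" concrete, here is a minimal sketch of summing fixed-width fields and length-prefixed strings, then rounding up to an alignment boundary. The helper names, the 4-byte length prefix, the 8-byte alignment, and the fixed_field_bytes parameter are all assumptions for illustration; the actual field layout and the signature of s_RoundUp() are not given on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Round value up to the next multiple of block (block must be > 0).
// Hypothetical helper; the toolkit's s_RoundUp() may differ.
inline std::size_t RoundUpSketch(std::size_t value, std::size_t block) {
    return ((value + block - 1) / block) * block;
}

// Hypothetical overhead estimate: fixed-width fields plus two strings,
// each assumed to carry a 4-byte length prefix, padded to 8 bytes.
inline std::size_t OverheadSketch(const std::string& title,
                                  const std::string& date,
                                  std::size_t fixed_field_bytes) {
    const std::size_t length_prefix = 4;  // assumed prefix size
    const std::size_t raw = fixed_field_bytes
                          + length_prefix + title.size()
                          + length_prefix + date.size();
    return RoundUpSketch(raw, 8);         // assumed alignment
}
```

With a 2-character title, a 4-character date, and 16 bytes of fixed fields, the raw size is 30 bytes and the padded result is 32.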

Member Data Documentation

◆ m_Amb

vector<unsigned int> CWriteDB_IndexFile::m_Amb
private

Offset in sequence file of each OID's ambiguity data.

The end of the ambiguity data is given by the start offset of the sequence data for the next OID.

Definition at line 340 of file writedb_files.hpp.

Referenced by AddSequence(), and x_Flush().

◆ m_DataSize

Uint8 CWriteDB_IndexFile::m_DataSize
private

Required space for data once written to disk.

Definition at line 315 of file writedb_files.hpp.

Referenced by AddSequence(), CanFit(), and CWriteDB_IndexFile().

◆ m_Date

string CWriteDB_IndexFile::m_Date
private

Database creation time stamp.

Definition at line 312 of file writedb_files.hpp.

Referenced by x_Flush().

◆ m_Hdr

vector<unsigned int> CWriteDB_IndexFile::m_Hdr
private

Start offset in header file of each OID's headers.

The end offset is given by the start offset of the following OID's headers.

Definition at line 328 of file writedb_files.hpp.

Referenced by AddSequence(), CWriteDB_IndexFile(), and x_Flush().
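
Because each record's end is the next record's start, per-OID header sizes can be recovered by differencing adjacent offsets. The sketch below assumes the offset array carries one trailing entry marking the end of the last record; the function name is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given start offsets followed by one final end offset (an assumption about
// the array's shape), return the byte length of each record.
std::vector<unsigned int> LengthsFromOffsets(const std::vector<unsigned int>& offsets) {
    std::vector<unsigned int> lengths;
    for (std::size_t i = 0; i + 1 < offsets.size(); ++i) {
        lengths.push_back(offsets[i + 1] - offsets[i]);
    }
    return lengths;
}
```

For offsets 0, 40, 95, 130 this yields three records of 40, 55, and 35 bytes.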

◆ m_Letters

Uint8 CWriteDB_IndexFile::m_Letters
private

Letters of sequence data accumulated so far.

Definition at line 316 of file writedb_files.hpp.

Referenced by AddSequence(), and x_Flush().

◆ m_MaxLength

int CWriteDB_IndexFile::m_MaxLength
private

Length of longest sequence.

Definition at line 317 of file writedb_files.hpp.

Referenced by AddSequence(), and x_Flush().

◆ m_OIDs

int CWriteDB_IndexFile::m_OIDs
private

OIDs added to database so far.

Definition at line 313 of file writedb_files.hpp.

Referenced by AddSequence(), CanFit(), and x_Flush().

◆ m_Overhead

int CWriteDB_IndexFile::m_Overhead
private

Amount of file used by metadata.

Definition at line 314 of file writedb_files.hpp.

Referenced by CWriteDB_IndexFile().

◆ m_Protein

bool CWriteDB_IndexFile::m_Protein
private

True if this is a protein database.

Definition at line 310 of file writedb_files.hpp.

Referenced by x_Flush(), and x_MakeLmdbName().

◆ m_Seq

vector<unsigned int> CWriteDB_IndexFile::m_Seq
private

Offset in sequence file of each OID's sequence data.

The end of the sequence data is given by the start offset of the ambiguity data for the same OID.

Definition at line 334 of file writedb_files.hpp.

Referenced by AddSequence(), CWriteDB_IndexFile(), and x_Flush().
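
For nucleotide volumes, the m_Seq and m_Amb descriptions together fix both extents for an OID: sequence data runs from its own start to the ambiguity start, and ambiguity data runs from there to the next OID's sequence start. A minimal sketch, assuming the sequence offset array has one trailing entry for the end of the last record (the type and function names are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Byte extents of one nucleotide record.
struct NucleotideSpan {
    unsigned int seq_bytes;  // packed sequence data
    unsigned int amb_bytes;  // packed ambiguity data
};

// seq must hold one more entry than amb (assumed trailing end offset).
// at() throws std::out_of_range if oid is past the end.
NucleotideSpan SpanForOid(const std::vector<unsigned int>& seq,
                          const std::vector<unsigned int>& amb,
                          std::size_t oid) {
    NucleotideSpan span;
    span.seq_bytes = amb.at(oid) - seq.at(oid);      // seq start -> amb start
    span.amb_bytes = seq.at(oid + 1) - amb.at(oid);  // amb start -> next seq start
    return span;
}
```

An OID with no ambiguous bases simply has its ambiguity start equal to the next sequence start, giving amb_bytes of zero.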

◆ m_Title

string CWriteDB_IndexFile::m_Title
private

Title string for all database volumes.

Definition at line 311 of file writedb_files.hpp.

Referenced by x_Flush().

◆ m_Version

EBlastDbVersion CWriteDB_IndexFile::m_Version
private

BLASTDB version (4 or 5).

Definition at line 342 of file writedb_files.hpp.

Referenced by x_Flush().


The documentation for this class was generated from the following files:
writedb_files.hpp
writedb_files.cpp
Modified on Fri Apr 12 17:16:51 2024 by modify_doxy.py rev. 669887