NCBI C++ ToolKit
CLDS2_Manager Class Reference

Class for managing an LDS2 database and its related data files.

#include <objtools/lds2/lds2.hpp>

Public Types

enum  EDirMode { eDir_NoRecurse, eDir_Recurse }
 Directory parsing mode while indexing files.

enum  EGBReleaseMode { eGB_Ignore, eGB_Guess, eGB_Force }
 Control indexing of GB releases (bioseq-sets).

enum  EDuplicateIdMode { eDuplicate_Skip, eDuplicate_Store, eDuplicate_Throw }
 Control seq-id conflict resolution during file parsing.

enum  EErrorMode { eError_Silent, eError_Report, eError_Throw }
 Error handling while indexing files.
 
- Public Types inherited from CObject
enum  EAllocFillMode { eAllocFillNone = 1, eAllocFillZero, eAllocFillPattern }
 Control filling of newly allocated memory.

typedef CObjectCounterLocker TLockerType
 Default locker type for CRef.

typedef atomic< Uint8 > TCounter
 Counter type is CAtomicCounter.

typedef Uint8 TCount
 Alias for value type of counter.
 

Public Member Functions

 CLDS2_Manager (const string &db_file)
 Create LDS2 manager for the specified db file.

virtual ~CLDS2_Manager (void)

const string & GetDbFile (void) const
 Get currently selected database name.

CLDS2_Database * GetDatabase (void)
 Get the current database object.

void SetDbFile (const string &db_file)
 Select new database.

void AddDataFile (const string &data_file)
 Add new data file to the list.

void AddDataDir (const string &data_dir, EDirMode mode=eDir_Recurse)
 Add data directory.

void RegisterUrlHandler (CLDS2_UrlHandler_Base *handler)
 Register a URL handler.

void AddDataUrl (const string &url, const string &handler_name)
 Add a URL.

void ResetData (void)
 Remove all data from the database.

void UpdateData (void)
 Rescan all indexed files, check for modifications, update the database.

EGBReleaseMode GetGBReleaseMode (void) const

void SetGBReleaseMode (EGBReleaseMode mode)

EDuplicateIdMode GetDuplicateIdMode (void) const

void SetDuplicateIdMode (EDuplicateIdMode mode)

int GetSeqAlignGroupSize (void) const
 Control grouping of standalone seq-aligns into bigger blobs.

void SetSeqAlignGroupSize (int sz)

EErrorMode GetErrorMode (void) const

void SetErrorMode (EErrorMode mode)

CFastaReader::TFlags GetFastaFlags (void) const
 Fasta reader settings.

void SetFastaFlags (CFastaReader::TFlags flags)
- Public Member Functions inherited from CObject
 CObject (void)
 Constructor.

 CObject (const CObject &src)
 Copy constructor.

virtual ~CObject (void)
 Destructor.

CObject & operator= (const CObject &src) THROWS_NONE
 Assignment operator.

bool CanBeDeleted (void) const THROWS_NONE
 Check if object can be deleted.

bool IsAllocatedInPool (void) const THROWS_NONE
 Check if object is allocated in memory pool (not system heap).

bool Referenced (void) const THROWS_NONE
 Check if object is referenced.

bool ReferencedOnlyOnce (void) const THROWS_NONE
 Check if object is referenced only once.

void AddReference (void) const
 Add reference to object.

void RemoveReference (void) const
 Remove reference to object.

void ReleaseReference (void) const
 Remove reference without deleting object.

virtual void DoNotDeleteThisObject (void)
 Mark this object as not allocated in heap – do not delete this object.

virtual void DoDeleteThisObject (void)
 Mark this object as allocated in heap – object can be deleted.

void * operator new (size_t size)
 Define new operator for memory allocation.

void * operator new[] (size_t size)
 Define new[] operator for 'array' memory allocation.

void operator delete (void *ptr)
 Define delete operator for memory deallocation.

void operator delete[] (void *ptr)
 Define delete[] operator for memory deallocation.

void * operator new (size_t size, void *place)
 Define new operator.

void operator delete (void *ptr, void *place)
 Define delete operator.

void * operator new (size_t size, CObjectMemoryPool *place)
 Define new operator using memory pool.

void operator delete (void *ptr, CObjectMemoryPool *place)
 Define delete operator.

virtual void DebugDump (CDebugDumpContext ddc, unsigned int depth) const
 Define method for dumping debug information.
 
- Public Member Functions inherited from CDebugDumpable
 CDebugDumpable (void)
 
virtual ~CDebugDumpable (void)
 
void DebugDumpText (ostream &out, const string &bundle, unsigned int depth) const
 
void DebugDumpFormat (CDebugDumpFormatter &ddf, const string &bundle, unsigned int depth) const
 
void DumpToConsole (void) const
 

Private Types

typedef CLDS2_Database::TStringSet TFiles
 
typedef map< string, CRef< CLDS2_UrlHandler_Base > > THandlers
 
typedef map< string, string > THandlersByUrl
 

Private Member Functions

bool x_IsGZipFile (const SLDS2_File &file_info)
 
CLDS2_UrlHandler_Base * x_GetUrlHandler (const SLDS2_File &file_info)
 
SLDS2_File x_GetFileInfo (const string &file_name, CRef< CLDS2_UrlHandler_Base > &handler)
 
void x_ParseFile (const SLDS2_File &info, CLDS2_UrlHandler_Base &handler)
 

Private Attributes

CRef< CLDS2_Database > m_Db
 
TFiles m_Files
 
THandlersByUrl m_HandlersByUrl
 
EGBReleaseMode m_GBReleaseMode
 
EDuplicateIdMode m_DupIdMode
 
EErrorMode m_ErrorMode
 
CFastaReader::TFlags m_FastaFlags
 
THandlers m_Handlers
 
int m_SeqAlignGroupSize
 

Additional Inherited Members

- Static Public Member Functions inherited from CObject
static NCBI_XNCBI_EXPORT void ThrowNullPointerException (void)
 Define method to throw null pointer exception.

static NCBI_XNCBI_EXPORT void ThrowNullPointerException (const type_info &type)

static EAllocFillMode GetAllocFillMode (void)

static void SetAllocFillMode (EAllocFillMode mode)

static void SetAllocFillMode (const string &value)
 Set mode from configuration parameter value.

- Static Public Member Functions inherited from CDebugDumpable
static void EnableDebugDump (bool on)

- Static Public Attributes inherited from CObject
static const TCount eCounterBitsCanBeDeleted = 1 << 0
 Define possible object states.

static const TCount eCounterBitsInPlainHeap = 1 << 1
 Heap signature was found.

static const TCount eCounterBitsPlaceMask
 Mask for 'in heap' state flags.

static const int eCounterStep = 1 << 2
 Skip over the "in heap" bits.

static const TCount eCounterValid = TCount(1) << (sizeof(TCount) * 8 - 2)
 Minimal value for valid objects (reference counter is zero). Must be a single-bit value.

static const TCount eCounterStateMask
 Valid object, and object in heap.

- Protected Member Functions inherited from CObject
virtual void DeleteThis (void)
 Virtual method "deleting" this object.
 

Detailed Description

Class for managing an LDS2 database and its related data files.

Definition at line 45 of file lds2.hpp.

Member Typedef Documentation

◆ TFiles

Definition at line 145 of file lds2.hpp.

◆ THandlers

Definition at line 159 of file lds2.hpp.

◆ THandlersByUrl

Definition at line 161 of file lds2.hpp.

Member Enumeration Documentation

◆ EDirMode

Directory parsing mode while indexing files.

Enumerator
eDir_NoRecurse 

Do not parse sub-dirs automatically.

eDir_Recurse 

Automatically scan sub-directories (default).

Definition at line 71 of file lds2.hpp.

◆ EDuplicateIdMode

Control seq-id conflict resolution during file parsing.

Enumerator
eDuplicate_Skip 

Ignore bioseqs with duplicate ids, store just the first one.

eDuplicate_Store 

Store all bioseqs regardless of seq-id conflicts (default).

The conflict may be resolved later by data loader.

eDuplicate_Throw 

Throw exception on bioseqs with duplicate seq-ids.

Definition at line 109 of file lds2.hpp.
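
The three duplicate-id policies can be pictured with a small stand-alone model. This is only a sketch under stated assumptions: `DuplicateIdMode` and `StoreIds` are hypothetical stand-ins written for illustration, not toolkit types, and seq-ids are reduced to plain strings.

```cpp
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for CLDS2_Manager::EDuplicateIdMode (hypothetical, not the toolkit type).
enum class DuplicateIdMode { Skip, Store, Throw };

// Hypothetical helper: returns the ids that would be kept, in input order.
std::vector<std::string> StoreIds(const std::vector<std::string>& ids,
                                  DuplicateIdMode mode)
{
    std::set<std::string> seen;
    std::vector<std::string> stored;
    for (const std::string& id : ids) {
        const bool is_new = seen.insert(id).second;
        if (!is_new) {
            if (mode == DuplicateIdMode::Skip) {
                continue;  // keep only the first entry with this id
            }
            if (mode == DuplicateIdMode::Throw) {
                throw std::runtime_error("duplicate seq-id: " + id);
            }
            // Store: fall through and keep the duplicate as well.
        }
        stored.push_back(id);
    }
    return stored;
}
```

With the input `{"a", "b", "a"}`, the Skip policy keeps two entries, Store keeps all three, and Throw raises on the second `"a"`.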

◆ EErrorMode

Error handling while indexing files.

NOTE: Only a few kinds of errors can be ignored (unsupported file format or object type, broken data file etc.).

Enumerator
eError_Silent 

Try to ignore errors, continue indexing.

eError_Report 

Print error messages, but do not fail (default).

eError_Throw 

Throw exceptions on errors.

Definition at line 131 of file lds2.hpp.
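
As a rough illustration of how the three error modes differ, the following self-contained sketch walks a list of files and reacts to unparseable ones according to the mode. Everything here (`ErrorMode`, `FileEntry`, `IndexResult`, `IndexAll`) is a hypothetical stand-in; the real indexer's messages and exception types are not shown in this reference and are not reproduced.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for CLDS2_Manager::EErrorMode (hypothetical).
enum class ErrorMode { Silent, Report, Throw };

// Hypothetical input record: a file name and whether it can be parsed.
struct FileEntry {
    std::string name;
    bool parseable;
};

struct IndexResult {
    int indexed = 0;                    // files indexed successfully
    std::vector<std::string> messages;  // messages produced in Report mode
};

IndexResult IndexAll(const std::vector<FileEntry>& files, ErrorMode mode)
{
    IndexResult result;
    for (const FileEntry& f : files) {
        if (f.parseable) {
            ++result.indexed;
            continue;
        }
        switch (mode) {
        case ErrorMode::Silent:
            break;  // ignore the problem and continue indexing
        case ErrorMode::Report:
            result.messages.push_back("cannot index " + f.name);
            break;  // record a message but do not fail
        case ErrorMode::Throw:
            throw std::runtime_error("cannot index " + f.name);
        }
    }
    return result;
}
```

In this model Silent and Report index the same files; they differ only in whether a message is recorded.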

◆ EGBReleaseMode

Control indexing of GB releases (bioseq-sets).

Enumerator
eGB_Ignore 

Do not split bioseq-sets (default).

eGB_Guess 

Try to autodetect and split GB release bioseq-sets.

eGB_Force 

Split all top-level bioseq-sets into seq-entries.

Definition at line 99 of file lds2.hpp.

Constructor & Destructor Documentation

◆ CLDS2_Manager()

CLDS2_Manager::CLDS2_Manager ( const string & db_file )

Create LDS2 manager for the specified db file.

If the file does not exist, it will be created only after adding at least one data file and indexing it.

Definition at line 879 of file lds2.cpp.

References RegisterUrlHandler(), and SetDbFile().

◆ ~CLDS2_Manager()

CLDS2_Manager::~CLDS2_Manager ( void  )
virtual

Definition at line 897 of file lds2.cpp.

Member Function Documentation

◆ AddDataDir()

void CLDS2_Manager::AddDataDir ( const string & data_dir,
EDirMode  mode = eDir_Recurse 
)

Add data directory.

All files in the directory are added to the list. If the mode is eDir_Recurse, files in all subdirectories are added as well. Call UpdateData() to parse and index the files.

Definition at line 930 of file lds2.cpp.

References AddDataFile(), eDir_Recurse, CDir::fIgnoreRecursive, CDir::GetEntries(), CDirEntry::GetPath(), CDirEntry::IsDir(), CDirEntry::IsFile(), and ITERATE.

Referenced by CSplignApp::Run(), CLDS2IndexerApplication::Run(), and CDemoApp::Run().
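
The difference between the two directory modes can be sketched on an in-memory directory tree. This is an illustrative model only: `DirMode`, `Dir`, and `CollectFiles` are hypothetical names, and the real method walks the file system through CDir rather than a structure like this.

```cpp
#include <string>
#include <vector>

// Stand-in for CLDS2_Manager::EDirMode (hypothetical).
enum class DirMode { NoRecurse, Recurse };

// Hypothetical in-memory directory used instead of the real file system.
struct Dir {
    std::string path;
    std::vector<std::string> files;  // names of files directly inside
    std::vector<Dir> subdirs;
};

// Appends the paths that would be added to the file list.
void CollectFiles(const Dir& dir, DirMode mode, std::vector<std::string>& out)
{
    for (const std::string& name : dir.files) {
        out.push_back(dir.path + "/" + name);
    }
    if (mode == DirMode::Recurse) {
        for (const Dir& sub : dir.subdirs) {
            CollectFiles(sub, mode, out);  // descend into sub-directories
        }
    }
}
```

In NoRecurse mode only the files directly inside the directory are collected; in Recurse mode the files of every nested directory follow.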

◆ AddDataFile()

void CLDS2_Manager::AddDataFile ( const string & data_file )

Add new data file to the list.

This does not parse or index the new file; call UpdateData() to do so.

Definition at line 923 of file lds2.cpp.

References CDirEntry::CreateAbsolutePath(), set< Key, Compare >::insert(), and m_Files.

Referenced by AddDataDir().
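
The split between registering a file and indexing it can be modelled as follows. This is a sketch under assumptions, not the toolkit's implementation: `DeferredIndex` and its members are hypothetical, and modification stamps are passed in as a plain map rather than read from disk. It mirrors two things stated in this reference: files are kept in a set, so adding the same path twice has no effect, and the update step re-indexes only files that are new or modified.

```cpp
#include <cstddef>
#include <map>
#include <set>
#include <string>

// Hypothetical model of "add now, index on update".
class DeferredIndex {
public:
    // Only records the path; nothing is parsed here.
    void AddFile(const std::string& path) { files_.insert(path); }

    // Indexes files that are new or whose stamp changed.
    // `stamps` is a hypothetical source of modification stamps.
    // Returns the number of files (re)indexed.
    int Update(const std::map<std::string, long>& stamps)
    {
        int reindexed = 0;
        for (const std::string& path : files_) {
            std::map<std::string, long>::const_iterator it = stamps.find(path);
            if (it == stamps.end()) {
                continue;  // no stamp available: skip in this model
            }
            std::map<std::string, long>::iterator known = indexed_.find(path);
            if (known == indexed_.end() || known->second != it->second) {
                indexed_[path] = it->second;
                ++reindexed;
            }
        }
        return reindexed;
    }

    std::size_t FileCount() const { return files_.size(); }
    std::size_t IndexedCount() const { return indexed_.size(); }

private:
    std::set<std::string> files_;          // registered paths
    std::map<std::string, long> indexed_;  // path -> stamp at last indexing
};
```

A second call to `Update` with unchanged stamps does no work in this model, which is the behaviour a caller would rely on when refreshing an existing index.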

◆ AddDataUrl()

void CLDS2_Manager::AddDataUrl ( const string & url,
const string & handler_name 
)

Add a URL.

The handler is used to access the URL and must be registered in the manager before adding the URL.

Definition at line 1192 of file lds2.cpp.

References set< Key, Compare >::insert(), m_Files, and m_HandlersByUrl.
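
The register-then-add ordering can be illustrated with a minimal registry. All names here (`UrlHandler`, `UrlRegistry`, `Register`, `AddUrl`) are hypothetical stand-ins. Throwing on an unregistered handler name is an assumption made for the sketch; this reference does not say how the real method reacts in that case.

```cpp
#include <cstddef>
#include <map>
#include <memory>
#include <set>
#include <stdexcept>
#include <string>
#include <utility>

// Stand-in for CLDS2_UrlHandler_Base (hypothetical).
struct UrlHandler {
    explicit UrlHandler(std::string handler_name)
        : name(std::move(handler_name)) {}
    std::string name;
};

class UrlRegistry {
public:
    void Register(std::shared_ptr<UrlHandler> handler)
    {
        handlers_[handler->name] = handler;
    }

    // Assumption: reject URLs whose handler has not been registered.
    void AddUrl(const std::string& url, const std::string& handler_name)
    {
        if (handlers_.find(handler_name) == handlers_.end()) {
            throw std::invalid_argument("unregistered handler: " + handler_name);
        }
        files_.insert(url);                    // URL joins the file list
        handlers_by_url_[url] = handler_name;  // remember which handler opens it
    }

    const std::string& HandlerFor(const std::string& url) const
    {
        return handlers_by_url_.at(url);
    }

    std::size_t UrlCount() const { return files_.size(); }

private:
    std::map<std::string, std::shared_ptr<UrlHandler> > handlers_;
    std::set<std::string> files_;
    std::map<std::string, std::string> handlers_by_url_;
};
```

The two maps correspond loosely to the private THandlers and THandlersByUrl typedefs listed for this class.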

◆ GetDatabase()

CLDS2_Database* CLDS2_Manager::GetDatabase ( void  )
inline

Get the current database object.

Definition at line 59 of file lds2.hpp.

Referenced by CLDS2IndexerApplication::Run().

◆ GetDbFile()

const string & CLDS2_Manager::GetDbFile ( void  ) const
inline

Get currently selected database name.

Definition at line 176 of file lds2.hpp.

References _ASSERT, CLDS2_Database::GetDbFile(), and m_Db.

◆ GetDuplicateIdMode()

EDuplicateIdMode CLDS2_Manager::GetDuplicateIdMode ( void  ) const
inline

Definition at line 119 of file lds2.hpp.

Referenced by CLDS2_ObjectParser::EndBlob().

◆ GetErrorMode()

EErrorMode CLDS2_Manager::GetErrorMode ( void  ) const
inline

Definition at line 137 of file lds2.hpp.

◆ GetFastaFlags()

CFastaReader::TFlags CLDS2_Manager::GetFastaFlags ( void  ) const
inline

Fasta reader settings.

Definition at line 141 of file lds2.hpp.

◆ GetGBReleaseMode()

EGBReleaseMode CLDS2_Manager::GetGBReleaseMode ( void  ) const
inline

Definition at line 105 of file lds2.hpp.

◆ GetSeqAlignGroupSize()

int CLDS2_Manager::GetSeqAlignGroupSize ( void  ) const
inline

Control grouping of standalone seq-aligns into bigger blobs.

If set to 0 or 1, no grouping is performed; each seq-align becomes a separate blob.

Definition at line 125 of file lds2.hpp.
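
The effect of the group size on blob count can be sketched with a small helper. `BlobSizes` is a hypothetical function written for illustration. It assumes that seq-aligns are packed into consecutive groups of at most `group_size` items, which this reference does not spell out, and it treats negative values like 0 or 1.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper: sizes of the blobs produced for `align_count`
// standalone seq-aligns with the given group size.
std::vector<std::size_t> BlobSizes(std::size_t align_count, int group_size)
{
    // 0 or 1 (and, in this sketch, negative values) mean no grouping.
    const std::size_t per_blob =
        group_size > 1 ? static_cast<std::size_t>(group_size) : 1;

    std::vector<std::size_t> sizes;
    for (std::size_t done = 0; done < align_count; done += per_blob) {
        sizes.push_back(std::min(per_blob, align_count - done));
    }
    return sizes;
}
```

For five seq-aligns, a group size of 0 or 1 gives five single-item blobs in this model, while a group size of 2 gives blobs of 2, 2, and 1 items.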

◆ RegisterUrlHandler()

void CLDS2_Manager::RegisterUrlHandler ( CLDS2_UrlHandler_Base * handler )

Register a URL handler.

Handlers allow the use of special storage types such as compressed files, FTP or HTTP locations, etc. The same handler must be registered in the data loader when using LDS2 to fetch data. The default handlers "file" and "gzipfile" for local files are registered automatically.

Definition at line 902 of file lds2.cpp.

References _ASSERT, m_Handlers, and Ref().

Referenced by CLDS2_Manager().

◆ ResetData()

void CLDS2_Manager::ResetData ( void  )

Remove all data from the database.

Definition at line 916 of file lds2.cpp.

References set< Key, Compare >::clear(), CLDS2_Database::Create(), m_Db, and m_Files.

◆ SetDbFile()

void CLDS2_Manager::SetDbFile ( const string & db_file )

Select new database.

If the database does not yet exist, it is not created immediately. The list of data files is cleared.

Definition at line 909 of file lds2.cpp.

References set< Key, Compare >::clear(), m_Db, m_Files, and CRef< C, Locker >::Reset().

Referenced by CLDS2_Manager().

◆ SetDuplicateIdMode()

void CLDS2_Manager::SetDuplicateIdMode ( EDuplicateIdMode  mode)
inline

Definition at line 120 of file lds2.hpp.

◆ SetErrorMode()

void CLDS2_Manager::SetErrorMode ( EErrorMode  mode)
inline

Definition at line 138 of file lds2.hpp.

◆ SetFastaFlags()

void CLDS2_Manager::SetFastaFlags ( CFastaReader::TFlags  flags)
inline

Definition at line 142 of file lds2.hpp.

References flags.

◆ SetGBReleaseMode()

void CLDS2_Manager::SetGBReleaseMode ( EGBReleaseMode  mode)
inline

Definition at line 106 of file lds2.hpp.

Referenced by CLDS2IndexerApplication::Run().

◆ SetSeqAlignGroupSize()

void CLDS2_Manager::SetSeqAlignGroupSize ( int  sz)
inline

Definition at line 126 of file lds2.hpp.

Referenced by CLDS2IndexerApplication::Run().

◆ UpdateData()

void CLDS2_Manager::UpdateData ( void  )

Rescan all indexed files, check for modifications, update the database.

◆ x_GetFileInfo()

SLDS2_File CLDS2_Manager::x_GetFileInfo ( const string & file_name,
CRef< CLDS2_UrlHandler_Base > &  handler 
)
private

Definition at line 985 of file lds2.cpp.

References file_name, info, and x_GetUrlHandler().

Referenced by UpdateData().

◆ x_GetUrlHandler()

CLDS2_UrlHandler_Base * CLDS2_Manager::x_GetUrlHandler ( const SLDS2_File & file_info )
private

◆ x_IsGZipFile()

bool CLDS2_Manager::x_IsGZipFile ( const SLDS2_File & file_info )
private

Definition at line 946 of file lds2.cpp.

References CFormatGuess::eGZip, CFormatGuess::Format(), in(), and SLDS2_File::name.

Referenced by x_GetUrlHandler().

◆ x_ParseFile()

void CLDS2_Manager::x_ParseFile ( const SLDS2_File & info,
CLDS2_UrlHandler_Base & handler 
)
private

Member Data Documentation

◆ m_Db

CRef<CLDS2_Database> CLDS2_Manager::m_Db
private

Definition at line 163 of file lds2.hpp.

Referenced by GetDbFile(), ResetData(), SetDbFile(), UpdateData(), and x_ParseFile().

◆ m_DupIdMode

EDuplicateIdMode CLDS2_Manager::m_DupIdMode
private

Definition at line 167 of file lds2.hpp.

◆ m_ErrorMode

EErrorMode CLDS2_Manager::m_ErrorMode
private

Definition at line 168 of file lds2.hpp.

Referenced by UpdateData(), x_GetUrlHandler(), and x_ParseFile().

◆ m_FastaFlags

CFastaReader::TFlags CLDS2_Manager::m_FastaFlags
private

Definition at line 169 of file lds2.hpp.

Referenced by x_ParseFile().

◆ m_Files

TFiles CLDS2_Manager::m_Files
private

Definition at line 164 of file lds2.hpp.

Referenced by AddDataFile(), AddDataUrl(), ResetData(), SetDbFile(), and UpdateData().

◆ m_GBReleaseMode

EGBReleaseMode CLDS2_Manager::m_GBReleaseMode
private

Definition at line 166 of file lds2.hpp.

◆ m_Handlers

THandlers CLDS2_Manager::m_Handlers
private

Definition at line 170 of file lds2.hpp.

Referenced by RegisterUrlHandler(), and x_GetUrlHandler().

◆ m_HandlersByUrl

THandlersByUrl CLDS2_Manager::m_HandlersByUrl
private

Definition at line 165 of file lds2.hpp.

Referenced by AddDataUrl(), and x_GetUrlHandler().

◆ m_SeqAlignGroupSize

int CLDS2_Manager::m_SeqAlignGroupSize
private

Definition at line 171 of file lds2.hpp.


The documentation for this class was generated from the following files: lds2.hpp, lds2.cpp
Modified on Fri Sep 20 14:58:00 2024 by modify_doxy.py rev. 669887