NCBI C++ ToolKit
CBDB_BlobSplitStore< TBV, TObjDeMux, TL > Class Template Reference

BLOB storage based on a single unsigned integer key. Supports BLOB volumes and files with different base page sizes within a volume to guarantee the best fit.

#include <db/bdb/bdb_split_blob.hpp>


Classes

struct  SLockedDb
 BDB database together with its locker. One database is opened twice: once in regular mode and once as a dedicated read-only instance to improve concurrency.
 
struct  SVolume
 Volume split on optimal page size.
 

Public Types

typedef CIdDeMux< TBV > TIdDeMux
 
typedef TBV TBitVector
 
typedef CBDB_BlobStoreDict< TBV > TDeMuxStore
 
typedef TL TLock
 
typedef TL::TWriteLockGuard TLockGuard
 
typedef CBDB_IdBlobFile TBlobFile
 
typedef vector< SVolume * > TVolumeVect
 

Public Member Functions

 CBDB_BlobSplitStore (TObjDeMux *de_mux)
 Construction. The main parameter here is the object demultiplexer used for splitting incoming LOBs into volumes and slices.
 
 ~CBDB_BlobSplitStore ()
 
void Open (const string &storage_name, CBDB_RawFile::EOpenMode open_mode, CBDB_RawFile::EDBType db_type=CBDB_RawFile::eBtree)
 Open storage (reads the storage dictionary into memory).
 
bool IsOpen () const
 Return true if the split store has been opened.
 
void OpenProjections ()
 Try to open all storage files in all projections. This is only possible when the object de-mux has a fixed number of projections; if it does not, the call is silently ignored.
 
void Save (typename TDeMuxStore::ECompact compact_vectors=TDeMuxStore::eCompact)
 Save storage dictionary (demux disposition).
 
void SetVolumeCacheSize (unsigned int cache_size)
 
void SetEnv (CBDB_Env &env)
 Associate with the environment. Should be called before opening.
 
CBDB_Env * GetEnv (void) const
 Get a pointer to the file environment. Returns NULL if no environment has been set.
 
const string & GetFileName () const
 Return the base filename of the underlying split store.
 
void RevSplitOff ()
 Turn off reverse splitting on the underlying stores.
 
void SetCachePriority (CBDB_RawFile::ECachePriority)
 Set the priority for this database's pages in the buffer cache. This is generally a temporary advisement, and works only if an environment is used.
 
virtual void SetTransaction (ITransaction *trans)
 Establish transaction association.
 
CBDB_Transaction * GetBDBTransaction ()
 
EBDB_ErrCode Insert (unsigned id, const void *data, size_t size, unsigned *coord)
 Insert BLOB into the storage.
 
EBDB_ErrCode Insert (unsigned id, const void *data, size_t size)
 
EBDB_ErrCode UpdateInsert (unsigned id, const void *data, size_t size, unsigned *coord)
 Update or insert BLOB.
 
EBDB_ErrCode UpdateInsert (unsigned id, const void *data, size_t size)
 
EBDB_ErrCode UpdateInsert (unsigned id, const unsigned *old_coord, const void *data, size_t size, unsigned *coord)
 Update or insert BLOB using old coordinates.
 
EBDB_ErrCode Delete (unsigned id, CBDB_RawFile::EIgnoreError on_error=CBDB_RawFile::eThrowOnError)
 Delete BLOB.
 
EBDB_ErrCode Delete (unsigned id, const unsigned *coords, CBDB_RawFile::EIgnoreError on_error=CBDB_RawFile::eThrowOnError)
 
EBDB_ErrCode GetCoordinates (unsigned id, unsigned *coords)
 Find (demux) coordinates by BLOB id.
 
void AssignCoordinates (unsigned id, const unsigned *coords)
 Assign de-mux coordinates.
 
bool IsSameCoordinates (const unsigned *coords1, const unsigned *coords2)
 Returns true if two sets of coordinates are the same.
 
EBDB_ErrCode ReadRealloc (unsigned id, CBDB_RawFile::TBuffer &buffer)
 Read BLOB into vector.
 
EBDB_ErrCode ReadRealloc (unsigned id, const unsigned *coords, CBDB_RawFile::TBuffer &buffer)
 Read BLOB into vector using provided coordinates. If the BLOB does not fit, the method resizes the vector to accommodate it.
 
EBDB_ErrCode Fetch (unsigned id, void **buf, size_t buf_size, CBDB_RawFile::EReallocMode allow_realloc, size_t *blob_size)
 Fetch LOB record directly into the provided '*buf'.
 
EBDB_ErrCode Fetch (unsigned id, const unsigned *coords, void **buf, size_t buf_size, CBDB_RawFile::EReallocMode allow_realloc, size_t *blob_size)
 
void Sync ()
 Sync the underlying stores.
 
IReader * CreateReader (unsigned id)
 Create stream-oriented reader.
 
IReader * CreateReader (unsigned id, const unsigned *coords)
 
EBDB_ErrCode BlobSize (unsigned id, size_t *blob_size)
 Get size of the BLOB.
 
EBDB_ErrCode BlobSize (unsigned id, const unsigned *coords, size_t *blob_size)
 
void GetIdVector (TBitVector *bv) const
 Get the ids of all BLOBs stored.
 
void FreeUnusedMem ()
 Reclaim unused memory.
 
- Public Member Functions inherited from CThreadLocalTransactional
virtual ITransaction * GetTransaction ()
 Get current transaction.
 
virtual void RemoveTransaction (ITransaction *trans)
 Remove transaction association (must be established by SetTransaction).
 
- Public Member Functions inherited from ITransactional
virtual ~ITransactional ()
 

Protected Types

enum  EGetDB_Mode { eGetRead , eGetWrite }
 Read or write operation.
 
- Protected Types inherited from CThreadLocalTransactional
typedef map< CThread::TID, ITransaction * > TThreadCtxMap
 

Protected Member Functions

void CloseVolumes ()
 Close volumes without saving or doing anything with id demux.
 
void LoadIdDeMux (TIdDeMux &de_mux, TDeMuxStore &dict_file)
 
void SaveIdDeMux (const TIdDeMux &de_mux, TDeMuxStore &dict_file, CBDB_Transaction *trans, typename TDeMuxStore::ECompact compact_vectors)
 Store id demux (projection vectors) into the database file.
 
unsigned GetPageSize (unsigned splice) const
 Select preferred page size for the specified slice.
 
void OpenDict ()
 Open split storage dictionary.
 
string MakeDbFileName (unsigned vol, unsigned slice)
 Make BDB file name based on volume and page size split.
 
SLockedDb & GetDb (unsigned vol, unsigned slice, EGetDB_Mode get_mode)
 Get database pair (method opens and mounts the database if necessary).
 
void InitDbMutex (SLockedDb *ldb)
 Init database mutex lock (method is protected against double init).
 

Protected Attributes

int m_TransAssociation
 
vector< unsigned > m_PageSizes
 
unsigned m_VolumeCacheSize
 
CBDB_Env * m_Env
 
unique_ptr< TDeMuxStore > m_DictFile
 Split dictionary (id demux file).
 
TLock m_DictFileLock
 Id demux file locker.
 
unique_ptr< TIdDeMux > m_IdDeMux
 Id to coordinates mapper.
 
CRWLock m_IdDeMuxLock
 
unique_ptr< TObjDeMux > m_ObjDeMux
 Obj to coordinates mapper.
 
TLock m_ObjDeMuxLock
 
TVolumeVect m_Volumes
 Volumes.
 
TLock m_VolumesLock
 Volumes locker.
 
string m_StorageName
 
CBDB_RawFile::EOpenMode m_OpenMode
 
CBDB_RawFile::EDBType m_DB_Type
 
CBDB_RawFile::ECachePriority m_CachePriority
 
bool m_AllProjAvail
 True when all projection databases are pre-opened.
 
bool m_RevSplitOff
 Flag carrying reverse split status.
 
TLock m_CrossDBLock
 Lock used to synchronize multi-db transactions to avoid deadlocks.
 
- Protected Attributes inherited from CThreadLocalTransactional
TThreadCtxMap m_ThreadMap
 
CFastMutex m_ThreadMapLock
 

Private Member Functions

 CBDB_BlobSplitStore (const CBDB_BlobSplitStore< TBV, TObjDeMux, TL > &)
 Forbidden.
 
CBDB_BlobSplitStore< TBV, TObjDeMux, TL > & operator= (const CBDB_BlobSplitStore< TBV, TObjDeMux, TL > &)
 

Detailed Description

template<class TBV, class TObjDeMux = CBDB_BlobDeMux, class TL = CFastMutex>
class CBDB_BlobSplitStore< TBV, TObjDeMux, TL >

BLOB storage based on a single unsigned integer key. Supports BLOB volumes and files with different base page sizes within a volume to guarantee the best fit.

Problem. Berkeley DB shows measurable differences in behavior and performance depending on the combination of record size and database page size. The differences include the amount of disk traffic, locking granularity, the number of overflow pages, etc.

The most critical of these is overflow pages. If a DB page cannot accommodate 2 (sometimes more) records, BDB creates overflow pages, which has been found to be expensive. The typical fix is to increase the page size. However, a large page size is inefficient for small records: you have to load or store a full page (64K) to access a small object. In a transactional environment, a page access also locks many records. Page size also influences B-Tree depth and the number of internal pages, and the number of internal pages affects database size and retrieval performance.

The object maintains a matrix of BDB databases. Every row holds a certain database volume and/or number of records. Every column groups BLOBs of a certain size together, so the class can choose the best page size to store BLOBs without long chains of overflow pages.

                     Page size split:
 Volume
 split:        4K     8K     16K    32K
             +------+------+------+------+
 row = 0     | DB   | ...................|  = SUM = N Gbytes
 row = 1     | DB   | .....              |  = SUM = N GBytes

               .........................

             +------+------+------+------+
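
The coordinate choice in the matrix above can be illustrated with a minimal, self-contained sketch. It is not the toolkit's CBDB_BlobDeMux: the names SimpleBlobDeMux, Coord, SelectSlice and Place are hypothetical, and so are the two placement rules it assumes (the column is the smallest page size that fits at least two such records, and a new row is started once a per-volume byte limit would be exceeded).

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical (volume, slice) coordinates: row and column of the matrix.
struct Coord {
    unsigned volume;
    unsigned slice;
};

// Illustrative picker only; the real de-mux classes may use different rules.
class SimpleBlobDeMux {
public:
    SimpleBlobDeMux(std::vector<std::size_t> page_sizes,
                    std::size_t volume_limit)
        : m_PageSizes(std::move(page_sizes)),
          m_VolumeLimit(volume_limit),
          m_VolumeBytes(1, 0)
    {
        assert(!m_PageSizes.empty());
    }

    // Column: first page size that holds at least two records of this size
    // (assumed rule); oversized BLOBs fall into the last column.
    unsigned SelectSlice(std::size_t blob_size) const
    {
        for (unsigned i = 0; i < m_PageSizes.size(); ++i) {
            if (blob_size * 2 <= m_PageSizes[i]) {
                return i;
            }
        }
        return static_cast<unsigned>(m_PageSizes.size() - 1);
    }

    // Row: keep filling the current volume until the byte limit would be
    // exceeded, then start a new volume.
    Coord Place(std::size_t blob_size)
    {
        if (m_VolumeBytes.back() > 0 &&
            m_VolumeBytes.back() + blob_size > m_VolumeLimit) {
            m_VolumeBytes.push_back(0);
        }
        m_VolumeBytes.back() += blob_size;
        Coord c = { static_cast<unsigned>(m_VolumeBytes.size() - 1),
                    SelectSlice(blob_size) };
        return c;
    }

private:
    std::vector<std::size_t> m_PageSizes;    // column page sizes, ascending
    std::size_t              m_VolumeLimit;  // assumed per-row byte budget
    std::vector<std::size_t> m_VolumeBytes;  // bytes placed in each row
};
```

With page sizes of 4K, 8K, 16K and 32K, a 1000-byte BLOB lands in the 4K column and a 3000-byte BLOB in the 8K column, because two 3000-byte records no longer fit a 4K page.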

Matrix coordinates are picked using a concept called DeMux. It maintains the BLOB_ID <-> coordinates association. Demux implementations use bit-vectors to do the job. The BLOB ID must be unique across the store. In general, DeMux can work with N-dimensional coordinates to address host, partition, volume, and slice (distributed store), but the current practical implementation uses a 2D matrix (volume, slice).
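
The BLOB_ID <-> coordinates association can be sketched as one set of bit-vectors per dimension, where bit `id` is set in the vector for the coordinate that the BLOB occupies. The sketch below is only an assumption about the general idea, not the toolkit's CIdDeMux: SimpleIdDeMux and its methods are hypothetical names, and std::vector<bool> stands in for the TBV bit-vector type.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D id de-mux: dimension 0 is the volume, dimension 1 the slice.
class SimpleIdDeMux {
public:
    static const std::size_t kDims = 2;

    // Record that BLOB 'id' lives at coords[0..kDims-1].
    void SetCoordinates(unsigned id, const unsigned* coords)
    {
        for (std::size_t d = 0; d < kDims; ++d) {
            std::vector<std::vector<bool> >& proj = m_Proj[d];
            // An id may belong to one projection per dimension only,
            // so clear any earlier assignment first.
            for (std::size_t c = 0; c < proj.size(); ++c) {
                if (id < proj[c].size()) {
                    proj[c][id] = false;
                }
            }
            if (coords[d] >= proj.size()) {
                proj.resize(static_cast<std::size_t>(coords[d]) + 1);
            }
            std::vector<bool>& bv = proj[coords[d]];
            if (id >= bv.size()) {
                bv.resize(static_cast<std::size_t>(id) + 1, false);
            }
            bv[id] = true;
        }
    }

    // Look up the coordinates of 'id'; returns false if the id is unknown.
    bool GetCoordinates(unsigned id, unsigned* coords) const
    {
        for (std::size_t d = 0; d < kDims; ++d) {
            const std::vector<std::vector<bool> >& proj = m_Proj[d];
            bool found = false;
            for (std::size_t c = 0; c < proj.size(); ++c) {
                if (id < proj[c].size() && proj[c][id]) {
                    coords[d] = static_cast<unsigned>(c);
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

private:
    // m_Proj[d][c] is the bit-vector of ids whose d-th coordinate equals c.
    std::vector<std::vector<bool> > m_Proj[kDims];
};
```

A lookup scans the projections of each dimension for the one that has the id's bit set, which is why the BLOB ID has to be unique across the store.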

Definition at line 356 of file bdb_split_blob.hpp.


The documentation for this class was generated from the following file:
Modified on Fri Sep 20 14:58:04 2024 by modify_doxy.py rev. 669887