NCBI C++ ToolKit
CSubjectMap Class Reference

Type representing subject map data. More...

#include <algo/blast/dbindex/dbindex.hpp>

Public Types

typedef pair< TSeqNum, TSeqPos > TSOPair
 
typedef pair< TSeqNum, TSeqNum > TSCPair
 
typedef vector< TSCPair > TSCPairMap
 

Public Member Functions

 CSubjectMap ()
 Trivial constructor. More...
 
 CSubjectMap (TWord **map, TSeqNum start, TSeqNum stop, unsigned long stride)
 Constructs object by mapping to the memory segment. More...
 
 CSubjectMap (TWord **map, const SIndexHeader &header)
 
void Load (TWord **map, TSeqNum start, TSeqNum stop, unsigned long stride)
 Loads index by mapping to the memory segment. More...
 
const TWord * GetSubjectMap () const
 Provides a mapping from real subject ids and chunk numbers to internal logical subject ids. More...
 
const Uint1 * GetSeqStoreBase () const
 Return the start of the raw storage for compressed subject sequence data. More...
 
TWord GetSeqStoreSize () const
 Return the size in bytes of the raw sequence storage. More...
 
TSeqNum NumChunks () const
 Get the total number of sequence chunks in the map. More...
 
TSeqNum GetNumChunks (TSeqNum lid) const
 Get number of chunks combined into a given logical sequence. More...
 
TSeqNum MapSubject (TSeqNum subject, TSeqNum chunk) const
 Get the logical sequence id from the database oid and the chunk number. More...
 
unsigned long GetStride () const
 Accessor for stride value. More...
 
std::pair< TSeqNum, TSeqPos > DecodeOffset (TWord offset) const
 Decode offset. More...
 
void SetSubjInfo (TSeqNum subj, TWord &start, TWord &end) const
 Return the subject information based on the given logical subject id. More...
 
std::pair< TSeqNum, TSeqPos > MapSubjOff (TSeqNum lid, TSeqPos soff) const
 Map logical sequence id and logical sequence offset to relative chunk number and chunk offset. More...
 
TSeqNum MapLId2Chunk (TSeqNum lid, TSeqNum lchunk) const
 Map logical id and relative chunk to absolute chunk id. More...
 
TSeqNum NumSubjects () const
 Get the total number of logical sequences in the map. More...
 
TSeqPos GetSeqLen (TSeqNum oid) const
 Get the length of the subject sequence. More...
 
const Uint1 * GetSeqData (TSeqNum oid) const
 Get the sequence data of the subject sequence. More...
 
TWord getSubjectLength (TSeqNum sid) const
 
TSeqNum getCId (TSeqNum sid, TSeqNum rcid) const
 
TSCPair getSRCId (TSeqNum cid) const
 
TWord getChunkLength (TSeqNum cid) const
 
TSeqNum getCIdByLRCId (TSeqNum lid, TSeqNum rcid) const
 
TSOPair getRCIdOffByLIdOff (TSeqNum lid, TSeqPos loff) const
 
TSeqPos getSOff (TSeqNum sid, TSeqNum rcid, TSeqPos coff) const
 
TSeqNum getNumSubjects () const
 
TSeqNum getNumChunks () const
 
TSeqNum getNumChunks (TSeqNum sid) const
 
const Uint1 * getSeqData (TSeqNum sid) const
 
TSeqNum getLId (const TOffsetValue &v) const
 
TSeqPos getLOff (const TOffsetValue &v) const
 

Private Types

typedef CDbIndex::TSeqNum TSeqNum
 
typedef CDbIndex::TWord TWord
 
typedef CDbIndex::TOffsetValue TOffsetValue
 
typedef CVectorWrap< TWord > TSubjects
 Type used to map database oids to the chunk info. More...
 
typedef CVectorWrap< Uint1 > TSeqStore
 Type used for compressed subject sequence data storage. More...
 
typedef CVectorWrap< TWord > TChunks
 Type for storing the chunk data. More...
 
typedef CVectorWrap< TWord > TLengths
 Subject lengths storage type. More...
 
typedef CVectorWrap< TWord > TLIdMap
 Local id -> chunks map storage type. More...
 

Private Member Functions

void SetSeqDataFromMap (TWord **map)
 Set up the sequence store from the memory segment. More...
 

Private Attributes

TSubjects subjects_
 Mapping from database oids to the chunk info. More...
 
TSeqStore seq_store_
 Storage for the raw subject sequence data. More...
 
TWord total_
 Size in bytes of the raw sequence storage. More...
 
TChunks chunks_
 Collection of individual chunk descriptors. More...
 
unsigned long stride_
 Index stride value. More...
 
unsigned long min_offset_
 Minimum offset used by the index. More...
 
TLengths lengths_
 Subject lengths storage. More...
 
TLIdMap lid_map_
 Local id -> chunk map storage. More...
 
Uint1 offset_bits_
 Number of bits used to encode offset. More...
 
TWord offset_mask_
 Mask to extract offsets. More...
 
TSCPairMap c2s_map_
 CId -> (SId, RCId) map. More...
 
unsigned long max_chunk_size_
 
unsigned long chunk_overlap_
 

Detailed Description

Type representing subject map data.

Definition at line 1027 of file dbindex.hpp.

Member Typedef Documentation

◆ TChunks

Type for storing the chunk data.

For raw offset encoding, the offset into the vector also serves as the internal logical sequence id.

Definition at line 1045 of file dbindex.hpp.

◆ TLengths

Subject lengths storage type.

Definition at line 1047 of file dbindex.hpp.

◆ TLIdMap

Local id -> chunks map storage type.

Definition at line 1048 of file dbindex.hpp.

◆ TOffsetValue

Definition at line 1033 of file dbindex.hpp.

◆ TSCPair

Definition at line 1261 of file dbindex.hpp.

◆ TSCPairMap

typedef vector< TSCPair > CSubjectMap::TSCPairMap

Definition at line 1262 of file dbindex.hpp.

◆ TSeqNum

Definition at line 1031 of file dbindex.hpp.

◆ TSeqStore

Type used for compressed subject sequence data storage.

Definition at line 1039 of file dbindex.hpp.

◆ TSOPair

Definition at line 1260 of file dbindex.hpp.

◆ TSubjects

Type used to map database oids to the chunk info.

Definition at line 1036 of file dbindex.hpp.

◆ TWord

Definition at line 1032 of file dbindex.hpp.

Constructor & Destructor Documentation

◆ CSubjectMap() [1/3]

CSubjectMap::CSubjectMap ( )
inline

Trivial constructor.

Definition at line 1053 of file dbindex.hpp.

◆ CSubjectMap() [2/3]

CSubjectMap::CSubjectMap ( TWord **  map,
TSeqNum  start,
TSeqNum  stop,
unsigned long  stride 
)

Constructs object by mapping to the memory segment.

Parameters
map  [I/O] pointer to the memory segment
start  [I] database oid of the first sequence in the map
stop  [I] database oid of the last sequence in the map
stride  [I] index stride value

Definition at line 525 of file dbindex.cpp.

References i, lengths_, lid_map_, Load(), offset_bits_, offset_mask_, and CVectorWrap< T >::SetPtr().

◆ CSubjectMap() [3/3]

CSubjectMap::CSubjectMap ( TWord **  map,
const SIndexHeader &  header 
)

Member Function Documentation

◆ DecodeOffset()

std::pair< TSeqNum, TSeqPos > CSubjectMap::DecodeOffset ( TWord  offset) const
inline

Decode offset.

Parameters
offset  The encoded offset value.
Returns
A pair whose first element is the local subject sequence id and whose second element is the subject offset.

Definition at line 1149 of file dbindex.hpp.

References min_offset_, offset, offset_bits_, offset_mask_, and stride_.

Referenced by CSearch< LEGACY, NHITS >::DecodeOffset().
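
The members referenced above suggest that an encoded word combines a logical sequence id with a scaled offset. The following is a minimal sketch of one such packing, not the toolkit's implementation: the `OffsetCodec` struct, its `Encode` helper, `DemoCodec`, and the exact bit layout are assumptions made for illustration. Only the field names mirror `offset_bits_`, `offset_mask_`, `min_offset_`, and `stride_`.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for the private state that DecodeOffset() references.
// The bit layout below is an assumption for illustration only.
struct OffsetCodec {
    std::uint32_t offset_bits;  // number of low bits that hold the scaled offset
    std::uint32_t offset_mask;  // (1u << offset_bits) - 1
    std::uint32_t min_offset;   // smallest encoded value treated as an offset
    std::uint32_t stride;       // distance between indexed positions

    // Pack (lid, off) into one word; off must be a multiple of stride.
    std::uint32_t Encode(std::uint32_t lid, std::uint32_t off) const {
        assert(off % stride == 0);
        assert(off / stride <= offset_mask);
        return min_offset + ((lid << offset_bits) | (off / stride));
    }

    // Recover (lid, off) from a word produced by Encode().
    std::pair<std::uint32_t, std::uint32_t> Decode(std::uint32_t word) const {
        assert(word >= min_offset);
        const std::uint32_t v = word - min_offset;
        return std::make_pair(v >> offset_bits, (v & offset_mask) * stride);
    }
};

// Example parameters chosen arbitrarily for the demonstration.
inline OffsetCodec DemoCodec() {
    OffsetCodec c = {16, 0xFFFFu, 1, 5};
    return c;
}
```

With these demo parameters, `DemoCodec().Decode(DemoCodec().Encode(3, 40))` yields the pair (3, 40). Keeping the id in the high bits lets a single shift and a single mask separate the two fields.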

◆ getChunkLength()

TWord CSubjectMap::getChunkLength ( TSeqNum  cid) const
inline

◆ getCId()

TSeqNum CSubjectMap::getCId ( TSeqNum  sid,
TSeqNum  rcid 
) const
inline

Definition at line 1252 of file dbindex.hpp.

References ASSERT, chunks_, result, CVectorWrap< T >::size(), and subjects_.

Referenced by CDbIndex::getCId().

◆ getCIdByLRCId()

TSeqNum CSubjectMap::getCIdByLRCId ( TSeqNum  lid,
TSeqNum  rcid 
) const
inline

Definition at line 1290 of file dbindex.hpp.

References ASSERT, lid_map_, and CVectorWrap< T >::size().

Referenced by CDbIndex::getCIdByLRCId().

◆ getLId()

TSeqNum CSubjectMap::getLId ( const TOffsetValue &  v) const
inline

Definition at line 1356 of file dbindex.hpp.

References CDbIndex::SOffsetValue::offset, and offset_bits_.

Referenced by CDbIndex::getLId().

◆ getLOff()

TSeqPos CSubjectMap::getLOff ( const TOffsetValue &  v) const
inline

Definition at line 1359 of file dbindex.hpp.

References CDbIndex::SOffsetValue::offset, offset_mask_, and stride_.

Referenced by CDbIndex::getLOff().

◆ getNumChunks() [1/2]

TSeqNum CSubjectMap::getNumChunks ( ) const
inline

Definition at line 1337 of file dbindex.hpp.

References chunks_, and CVectorWrap< T >::size().

Referenced by CDbIndex::getNumChunks().

◆ GetNumChunks()

TSeqNum CSubjectMap::GetNumChunks ( TSeqNum  lid) const
inline

Get number of chunks combined into a given logical sequence.

Parameters
lid  The logical sequence id.
Returns
Corresponding number of chunks.

Definition at line 1110 of file dbindex.hpp.

References lid_map_.

Referenced by CSearch_Base< LEGACY, NHITS, derived_t >::operator()(), and CTrackedSeeds_Base< TWO_HIT >::SetLId().

◆ getNumChunks() [2/2]

TSeqNum CSubjectMap::getNumChunks ( TSeqNum  sid) const
inline

Definition at line 1339 of file dbindex.hpp.

References ASSERT, chunks_, CVectorWrap< T >::size(), and subjects_.

◆ getNumSubjects()

TSeqNum CSubjectMap::getNumSubjects ( ) const
inline

Definition at line 1336 of file dbindex.hpp.

References CVectorWrap< T >::size(), and subjects_.

Referenced by CDbIndex::getNumSubjects().

◆ getRCIdOffByLIdOff()

TSOPair CSubjectMap::getRCIdOffByLIdOff ( TSeqNum  lid,
TSeqPos  loff 
) const
inline

◆ GetSeqData()

const Uint1* CSubjectMap::GetSeqData ( TSeqNum  oid) const
inline

Get the sequence data of the subject sequence.

Parameters
oid  Ordinal id of the subject sequence.
Returns
Pointer to the sequence data.

Definition at line 1239 of file dbindex.hpp.

References chunks_, seq_store_, and subjects_.

Referenced by CDbIndex_Impl< LEGACY >::GetSeqData().

◆ getSeqData()

const Uint1* CSubjectMap::getSeqData ( TSeqNum  sid) const
inline

Definition at line 1348 of file dbindex.hpp.

References ASSERT, chunks_, seq_store_, CVectorWrap< T >::size(), and subjects_.

Referenced by CDbIndex::getSeqData().

◆ GetSeqLen()

TSeqPos CSubjectMap::GetSeqLen ( TSeqNum  oid) const
inline

Get the length of the subject sequence.

Parameters
oid  Ordinal id of the subject sequence.
Returns
Length of the sequence in bases.

Definition at line 1230 of file dbindex.hpp.

References lengths_.

Referenced by CDbIndex_Impl< LEGACY >::GetSeqLen().

◆ GetSeqStoreBase()

const Uint1* CSubjectMap::GetSeqStoreBase ( ) const
inline

Return the start of the raw storage for compressed subject sequence data.

Returns
start of the sequence data storage

Definition at line 1091 of file dbindex.hpp.

References seq_store_.

Referenced by CDbIndex_Impl< LEGACY >::GetSeqStoreBase().

◆ GetSeqStoreSize()

TWord CSubjectMap::GetSeqStoreSize ( ) const
inline

Return the size in bytes of the raw sequence storage.

Returns
Size of the sequence data storage.

Definition at line 1097 of file dbindex.hpp.

References total_.

◆ getSOff()

TSeqPos CSubjectMap::getSOff ( TSeqNum  sid,
TSeqNum  rcid,
TSeqPos  coff 
) const
inline

◆ getSRCId()

TSCPair CSubjectMap::getSRCId ( TSeqNum  cid) const
inline

Definition at line 1264 of file dbindex.hpp.

References ASSERT, c2s_map_, chunks_, and CVectorWrap< T >::size().

Referenced by getChunkLength(), and CDbIndex::getSRCId().

◆ GetStride()

unsigned long CSubjectMap::GetStride ( ) const
inline

Accessor for stride value.

Returns
the stride value used by the index

Definition at line 1140 of file dbindex.hpp.

References stride_.

◆ getSubjectLength()

TWord CSubjectMap::getSubjectLength ( TSeqNum  sid) const
inline

Definition at line 1246 of file dbindex.hpp.

References ASSERT, lengths_, CVectorWrap< T >::size(), and subjects_.

Referenced by getChunkLength(), and CDbIndex::getSubjectLength().

◆ GetSubjectMap()

const TWord* CSubjectMap::GetSubjectMap ( ) const
inline

Provides a mapping from real subject ids and chunk numbers to internal logical subject ids.

Returns
start of the (subject,chunk)->id mapping

Definition at line 1085 of file dbindex.hpp.

References subjects_.

Referenced by CSearch_Base< LEGACY, NHITS, derived_t >::operator()().

◆ Load()

void CSubjectMap::Load ( TWord **  map,
TSeqNum  start,
TSeqNum  stop,
unsigned long  stride 
)

Loads index by mapping to the memory segment.

Parameters
map  [I/O] pointer to the memory segment
start  [I] database oid of the first sequence in the map
stop  [I] database oid of the last sequence in the map
stride  [I] index stride value

Definition at line 450 of file dbindex.cpp.

References c2s_map_, chunks_, GetMinOffset(), i, min_offset_, CVectorWrap< T >::SetPtr(), SetSeqDataFromMap(), CVectorWrap< T >::size(), stride_, subjects_, and total_.

Referenced by CSubjectMap().
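
The `[I/O]` marking on `map` suggests that the pointer is both read and updated: the loader consumes part of the segment and leaves the pointer positioned after what it read. A minimal sketch of that cursor pattern follows, assuming a hypothetical layout in which each section is a word count followed by that many words. `Word`, `WordView`, `TakeSection`, and `DemoSecondSectionSum` are illustrative names, and the real index format is not described on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

typedef std::uint32_t Word;  // stand-in for TWord (assumed 32-bit here)

// Non-owning view over words that live inside the mapped segment.
struct WordView {
    const Word* data;
    std::size_t size;
};

// Read one length-prefixed section at *cursor and advance *cursor past it.
inline WordView TakeSection(const Word** cursor) {
    const Word* p = *cursor;
    WordView view;
    view.size = *p++;         // first word: number of payload words
    view.data = p;            // payload starts right after the count
    *cursor = p + view.size;  // leave the cursor at the next section
    return view;
}

// Walk a small in-memory "segment" that holds two sections.
inline Word DemoSecondSectionSum() {
    static const Word segment[] = {2, 10, 20, 3, 7, 8, 9};
    const Word* cursor = segment;
    const WordView first = TakeSection(&cursor);
    const WordView second = TakeSection(&cursor);
    assert(first.size == 2 && second.size == 3);
    assert(cursor == segment + 7);  // the whole segment was consumed
    Word sum = 0;
    for (std::size_t i = 0; i < second.size; ++i) sum += second.data[i];
    return sum;
}
```

Passing the cursor by address lets several loaders be chained over one segment without copying any data; each view simply points into the mapped memory.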

◆ MapLId2Chunk()

TSeqNum CSubjectMap::MapLId2Chunk ( TSeqNum  lid,
TSeqNum  lchunk 
) const
inline

Map logical id and relative chunk to absolute chunk id.

Parameters
lid  logical sequence id
lchunk  chunk number within the logical sequence
Returns
chunk id of the corresponding chunk

Definition at line 1211 of file dbindex.hpp.

References lid_map_.

◆ MapSubject()

TSeqNum CSubjectMap::MapSubject ( TSeqNum  subject,
TSeqNum  chunk 
) const
inline

Get the logical sequence id from the database oid and the chunk number.

Parameters
subject  [I] database oid
chunk  [I] the chunk number
Returns
logical sequence id corresponding to subject and chunk

Definition at line 1122 of file dbindex.hpp.

References chunks_, result, CVectorWrap< T >::size(), subject, and subjects_.
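
One way to read this mapping: if each database oid records the index of its first chunk, then adding the chunk number gives a position in the chunk table, which for raw offset encoding doubles as the logical id (see TChunks). The sketch below assumes exactly that layout. `MapSubjectSketch`, `kNoSequence`, and the demo table are hypothetical, and the out-of-range behavior is a choice made for this example rather than documented behavior.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Sentinel for "no such (subject, chunk)"; invented for this sketch.
const std::uint32_t kNoSequence = std::numeric_limits<std::uint32_t>::max();

// first_chunk[oid] is assumed to hold the index of the first chunk of oid;
// num_chunks is the total number of chunks in the map.
inline std::uint32_t MapSubjectSketch(const std::vector<std::uint32_t>& first_chunk,
                                      std::size_t num_chunks,
                                      std::uint32_t subject,
                                      std::uint32_t chunk) {
    if (subject >= first_chunk.size()) return kNoSequence;
    const std::size_t begin = first_chunk[subject];
    // Chunks of one subject end where the next subject's chunks begin.
    const std::size_t end = (subject + 1 < first_chunk.size())
                                ? first_chunk[subject + 1]
                                : num_chunks;
    const std::size_t result = begin + chunk;
    return result < end ? static_cast<std::uint32_t>(result) : kNoSequence;
}

// Three subjects with 2, 1 and 3 chunks respectively (6 chunks in total).
inline std::vector<std::uint32_t> DemoFirstChunk() {
    const std::uint32_t starts[] = {0, 2, 3};
    return std::vector<std::uint32_t>(starts, starts + 3);
}
```

For the demo table, subject 2 with chunk 2 maps to position 5, while subject 1 with chunk 1 is rejected because subject 1 has only one chunk.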

◆ MapSubjOff()

std::pair< TSeqNum, TSeqPos > CSubjectMap::MapSubjOff ( TSeqNum  lid,
TSeqPos  soff 
) const
inline

Map logical sequence id and logical sequence offset to relative chunk number and chunk offset.

Parameters
lid  The logical sequence id.
soff  The logical sequence offset.
Returns
Pair of relative chunk number and chunk offset.

Definition at line 1180 of file dbindex.hpp.

References ASSERT, CVectorWrap< T >::begin(), chunks_, CDbIndex::CR, CR, and lid_map_.
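
Finding the chunk that contains a logical offset is a search over chunk boundaries. The sketch below shows the idea with `std::upper_bound` over a sorted table of chunk start offsets. It is a simplification under stated assumptions: `MapLogicalOffset` and `DemoChunkStarts` are hypothetical names, the table layout is invented, and chunk overlap and sequence compression are ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// starts[i] is the offset at which chunk i begins inside the logical
// sequence; starts[0] == 0 and the values are strictly increasing.
inline std::pair<std::uint32_t, std::uint32_t>
MapLogicalOffset(const std::vector<std::uint32_t>& starts, std::uint32_t soff) {
    assert(!starts.empty() && starts.front() == 0);
    // First chunk that begins strictly after soff; the one before it
    // is the chunk that contains soff.
    const std::vector<std::uint32_t>::const_iterator next =
        std::upper_bound(starts.begin(), starts.end(), soff);
    const std::uint32_t rel =
        static_cast<std::uint32_t>(next - starts.begin()) - 1;
    return std::make_pair(rel, soff - starts[rel]);
}

// A logical sequence made of three chunks starting at 0, 100 and 250.
inline std::vector<std::uint32_t> DemoChunkStarts() {
    const std::uint32_t starts[] = {0, 100, 250};
    return std::vector<std::uint32_t>(starts, starts + 3);
}
```

With the demo table, logical offset 100 maps to relative chunk 1 at chunk offset 0, and logical offset 300 maps to relative chunk 2 at chunk offset 50. The binary search keeps the lookup logarithmic in the number of chunks per logical sequence.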

◆ NumChunks()

TSeqNum CSubjectMap::NumChunks ( ) const
inline

Get the total number of sequence chunks in the map.

Returns
number of chunks in the map

Definition at line 1102 of file dbindex.hpp.

References chunks_, and CVectorWrap< T >::size().

Referenced by CDbIndex_Impl< LEGACY >::NumChunks().

◆ NumSubjects()

TSeqNum CSubjectMap::NumSubjects ( ) const
inline

Get the total number of logical sequences in the map.

Returns
number of logical sequences in the map

Definition at line 1221 of file dbindex.hpp.

References lid_map_, and CVectorWrap< T >::size().

Referenced by CDbIndex_Impl< LEGACY >::NumSubjects().

◆ SetSeqDataFromMap()

void CSubjectMap::SetSeqDataFromMap ( TWord **  map)
private

Set up the sequence store from the memory segment.

Parameters
map  [I/O] points to the memory segment

Definition at line 489 of file dbindex.cpp.

References seq_store_, CVectorWrap< T >::SetPtr(), and total_.

Referenced by Load().

◆ SetSubjInfo()

void CSubjectMap::SetSubjInfo ( TSeqNum  subj,
TWord &  start,
TWord &  end 
) const
inline

Return the subject information based on the given logical subject id.

Parameters
subj  [I] logical subject id
start  [O] starting offset of subj in the sequence store
end  [O] 1 + ending offset of subj in the sequence store

Definition at line 1164 of file dbindex.hpp.

References lid_map_.

Referenced by CSearch< LEGACY, NHITS >::SetSubjInfo().

Member Data Documentation

◆ c2s_map_

TSCPairMap CSubjectMap::c2s_map_
private

CId -> (SId, RCId) map.

Definition at line 1383 of file dbindex.hpp.

Referenced by getSRCId(), and Load().

◆ chunk_overlap_

unsigned long CSubjectMap::chunk_overlap_
private

Definition at line 1386 of file dbindex.hpp.

Referenced by CSubjectMap(), getChunkLength(), and getSOff().

◆ chunks_

TChunks CSubjectMap::chunks_
private

Collection of individual chunk descriptors.

Definition at line 1374 of file dbindex.hpp.

Referenced by getChunkLength(), getCId(), getNumChunks(), getRCIdOffByLIdOff(), GetSeqData(), getSeqData(), getSOff(), getSRCId(), Load(), MapSubject(), MapSubjOff(), and NumChunks().

◆ lengths_

TLengths CSubjectMap::lengths_
private

Subject lengths storage.

Definition at line 1379 of file dbindex.hpp.

Referenced by CSubjectMap(), GetSeqLen(), getSOff(), and getSubjectLength().

◆ lid_map_

TLIdMap CSubjectMap::lid_map_
private

Local id -> chunk map storage.

Definition at line 1380 of file dbindex.hpp.

Referenced by CSubjectMap(), getCIdByLRCId(), GetNumChunks(), getRCIdOffByLIdOff(), MapLId2Chunk(), MapSubjOff(), NumSubjects(), and SetSubjInfo().

◆ max_chunk_size_

unsigned long CSubjectMap::max_chunk_size_
private

Definition at line 1385 of file dbindex.hpp.

Referenced by CSubjectMap(), getChunkLength(), and getSOff().

◆ min_offset_

unsigned long CSubjectMap::min_offset_
private

Minimum offset used by the index.

Definition at line 1377 of file dbindex.hpp.

Referenced by DecodeOffset(), and Load().

◆ offset_bits_

Uint1 CSubjectMap::offset_bits_
private

Number of bits used to encode offset.

Definition at line 1381 of file dbindex.hpp.

Referenced by CSubjectMap(), DecodeOffset(), and getLId().

◆ offset_mask_

TWord CSubjectMap::offset_mask_
private

Mask to extract offsets.

Definition at line 1382 of file dbindex.hpp.

Referenced by CSubjectMap(), DecodeOffset(), and getLOff().

◆ seq_store_

TSeqStore CSubjectMap::seq_store_
private

Storage for the raw subject sequence data.

Definition at line 1370 of file dbindex.hpp.

Referenced by getRCIdOffByLIdOff(), GetSeqData(), getSeqData(), GetSeqStoreBase(), and SetSeqDataFromMap().

◆ stride_

unsigned long CSubjectMap::stride_
private

Index stride value.

Definition at line 1376 of file dbindex.hpp.

Referenced by DecodeOffset(), getLOff(), GetStride(), and Load().

◆ subjects_

TSubjects CSubjectMap::subjects_
private

Mapping from database oids to the chunk info.

Definition at line 1369 of file dbindex.hpp.

Referenced by getChunkLength(), getCId(), getNumChunks(), getNumSubjects(), GetSeqData(), getSeqData(), getSOff(), getSubjectLength(), GetSubjectMap(), Load(), and MapSubject().

◆ total_

TWord CSubjectMap::total_
private

Size in bytes of the raw sequence storage.

(only valid after the complete object has been constructed)

Definition at line 1371 of file dbindex.hpp.

Referenced by GetSeqStoreSize(), Load(), and SetSeqDataFromMap().


The documentation for this class was generated from the following files:
Modified on Fri Sep 20 14:57:44 2024 by modify_doxy.py rev. 669887