NCBI C++ ToolKit
CBlastKmer Class Reference


Class to perform a KMER-BLASTP search.

#include <algo/blast/proteinkmer/blastkmer.hpp>


Public Member Functions

 CBlastKmer (TSeqLocVector &query_vector, CRef< CBlastKmerOptions > options, CRef< CSeqDB > seqdb, string kmerfile=kEmptyStr)
 Constructor. Processes all proteins in TSeqLocVector.
 
 CBlastKmer (SSeqLoc &query, CRef< CBlastKmerOptions > options, const string &dbname)
 Constructor.
 
 ~CBlastKmer ()
 Destructor.
 
CRef< CBlastKmerResultsSet > Run ()
 Performs search on one or more queries.
 
CRef< CBlastKmerResultsSet > RunSearches ()
 Deprecated; use Run instead.
 
void SetGiListLimit (CRef< CSeqDBGiList > list)
 Limits output by GILIST.
 
void SetGiListLimit (CRef< CSeqDBNegativeList > list)
 Limits output by negative GILIST.
 
- Public Member Functions inherited from CObject
 CObject (void)
 Constructor.
 
 CObject (const CObject &src)
 Copy constructor.
 
virtual ~CObject (void)
 Destructor.
 
CObject & operator= (const CObject &src) THROWS_NONE
 Assignment operator.
 
bool CanBeDeleted (void) const THROWS_NONE
 Check if object can be deleted.
 
bool IsAllocatedInPool (void) const THROWS_NONE
 Check if object is allocated in memory pool (not system heap).
 
bool Referenced (void) const THROWS_NONE
 Check if object is referenced.
 
bool ReferencedOnlyOnce (void) const THROWS_NONE
 Check if object is referenced only once.
 
void AddReference (void) const
 Add reference to object.
 
void RemoveReference (void) const
 Remove reference to object.
 
void ReleaseReference (void) const
 Remove reference without deleting object.
 
virtual void DoNotDeleteThisObject (void)
 Mark this object as not allocated in heap – do not delete this object.
 
virtual void DoDeleteThisObject (void)
 Mark this object as allocated in heap – object can be deleted.
 
void * operator new (size_t size)
 Define new operator for memory allocation.
 
void * operator new[] (size_t size)
 Define new[] operator for 'array' memory allocation.
 
void operator delete (void *ptr)
 Define delete operator for memory deallocation.
 
void operator delete[] (void *ptr)
 Define delete[] operator for memory deallocation.
 
void * operator new (size_t size, void *place)
 Define new operator.
 
void operator delete (void *ptr, void *place)
 Define delete operator.
 
void * operator new (size_t size, CObjectMemoryPool *place)
 Define new operator using memory pool.
 
void operator delete (void *ptr, CObjectMemoryPool *place)
 Define delete operator.
 
virtual void DebugDump (CDebugDumpContext ddc, unsigned int depth) const
 Define method for dumping debug information.
 
- Public Member Functions inherited from CDebugDumpable
 CDebugDumpable (void)
 
virtual ~CDebugDumpable (void)
 
void DebugDumpText (ostream &out, const string &bundle, unsigned int depth) const
 
void DebugDumpFormat (CDebugDumpFormatter &ddf, const string &bundle, unsigned int depth) const
 
void DumpToConsole (void) const
 
- Public Member Functions inherited from CThreadable
 CThreadable (void)
 Default ctor.
 
virtual ~CThreadable (void)
 Our virtual destructor.
 
virtual void SetNumberOfThreads (size_t nthreads)
 Mutator for the number of threads.
 
size_t GetNumberOfThreads (void) const
 Accessor for the number of threads to use.
 
bool IsMultiThreaded (void) const
 Returns true if more than 1 thread is specified.
 

Private Member Functions

void x_ProcessQuery (const string &query_seq, SOneBlastKmerSearch &kmerSearch, const SBlastKmerParameters &kmerParams, uint32_t *a, uint32_t *b, vector< vector< int > > &kvalues, vector< int > badMers)
 Preprocess query to sequence hashes.
 
void x_RunKmerFile (const vector< vector< uint32_t > > &query_hash, const vector< vector< uint32_t > > &query_LSH_hash, CMinHashFile &mhfile, TBlastKmerPrelimScoreVector &score_vector, BlastKmerStats &kmer_stats)
 Search individual kmer file.
 
CRef< CBlastKmerResultsSet > x_SearchMultipleQueries (int firstQuery, int numQuery, const SBlastKmerParameters &kmerParams, uint32_t *a, uint32_t *b, vector< vector< int > > &kValues, vector< int > badMers)
 Search multiple queries.
 

Private Attributes

TSeqLocVector m_QueryVector
 Holds the query seqloc and scope.
 
CRef< CBlastKmerOptions > m_Opts
 Specifies values for some options (e.g., threshold).
 
CRef< CSeqDB > m_SeqDB
 CSeqDB for BLAST db.
 
vector< string > m_KmerFiles
 Name of the kmer files.
 
CRef< CSeqDBGiList > m_GIList
 GIList to limit search by.
 
CRef< CSeqDBNegativeList > m_NegGIList
 Negative GIList to limit search by.
 

Additional Inherited Members

- Public Types inherited from CObject
enum  EAllocFillMode { eAllocFillNone = 1 , eAllocFillZero , eAllocFillPattern }
 Control filling of newly allocated memory.
 
typedef CObjectCounterLocker TLockerType
 Default locker type for CRef.
 
typedef atomic< Uint8 > TCounter
 Counter type is CAtomicCounter.
 
typedef Uint8 TCount
 Alias for value type of counter.
 
- Public Types inherited from CThreadable
enum  { kMinNumThreads = 1 }
 Never have less than 1 thread.
 
- Static Public Member Functions inherited from CObject
static NCBI_XNCBI_EXPORT void ThrowNullPointerException (void)
 Define method to throw null pointer exception.
 
static NCBI_XNCBI_EXPORT void ThrowNullPointerException (const type_info &type)
 
static EAllocFillMode GetAllocFillMode (void)
 
static void SetAllocFillMode (EAllocFillMode mode)
 
static void SetAllocFillMode (const string &value)
 Set mode from configuration parameter value.
 
- Static Public Member Functions inherited from CDebugDumpable
static void EnableDebugDump (bool on)
 
- Static Public Attributes inherited from CObject
static const TCount eCounterBitsCanBeDeleted = 1 << 0
 Define possible object states.
 
static const TCount eCounterBitsInPlainHeap = 1 << 1
 Heap signature was found.
 
static const TCount eCounterBitsPlaceMask
 Mask for 'in heap' state flags.
 
static const int eCounterStep = 1 << 2
 Skip over the "in heap" bits.
 
static const TCount eCounterValid = TCount(1) << (sizeof(TCount) * 8 - 2)
 Minimal value for valid objects (reference counter is zero). Must be a single bit value.
 
static const TCount eCounterStateMask
 Valid object, and object in heap.
 
- Protected Member Functions inherited from CObject
virtual void DeleteThis (void)
 Virtual method "deleting" this object.
 
- Protected Attributes inherited from CThreadable
size_t m_NumThreads
 Keep track of how many threads should be used.
 

Detailed Description

Class to perform a KMER-BLASTP search.

To run, first call the constructor, then Run, then access the results through CBlastKmerResultsSet. The RunSearches method is deprecated and will be removed; it just calls Run.

A few notes and caveats:

The CBlastKmerOptions constructor can be called and the resulting object used as input.

The string kmerfile should be kEmptyStr so that the volumes for the database in the seqdb parameter are used. If kEmptyStr is not used for kmerfile, then, for nr, it would be "nr.00 nr.01 nr.02 nr.03" and so on. These names must correspond exactly to the database volumes, because the kmer files rely on the OIDs in the database. If the kmer files are derived from the BLAST database, the BLASTDB paths etc. will be respected for the kmer files.

NOTE: recoverable errors (e.g., a query shorter than the KMER size) will NOT trigger an exception, but the CBlastKmerResults for that query will carry an error or warning. Use the HasError or HasWarning method to check.
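The kmerfile convention described above can be sketched in a few lines. This is a minimal, self-contained illustration and not toolkit code: `ResolveKmerFiles` and the `db_volumes` argument are hypothetical names, and an empty `std::string` stands in for kEmptyStr.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the toolkit): choose the kmer file names.
// An empty kmerfile (the kEmptyStr case) means "use the database volume names";
// otherwise the string is split on whitespace, e.g. "nr.00 nr.01".
std::vector<std::string> ResolveKmerFiles(const std::string& kmerfile,
                                          const std::vector<std::string>& db_volumes)
{
    if (kmerfile.empty()) {
        return db_volumes;
    }
    std::vector<std::string> names;
    std::istringstream in(kmerfile);
    std::string name;
    while (in >> name) {
        names.push_back(name);
    }
    // A non-empty kmerfile is expected to name at least one file.
    assert(!names.empty());
    return names;
}
```

In this sketch the caller is still responsible for making sure the explicit names line up with the database volumes, since the kmer files depend on the database OIDs.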

Definition at line 71 of file blastkmer.hpp.

Constructor & Destructor Documentation

◆ CBlastKmer() [1/2]

CBlastKmer::CBlastKmer ( TSeqLocVector &  query_vector,
CRef< CBlastKmerOptions >  options,
CRef< CSeqDB >  seqdb,
string  kmerfile = kEmptyStr 
)

Constructor. Processes all proteins in TSeqLocVector.

Parameters
query_vector  specifies one or more protein query sequences [in]
options  sets kblastp parameters. [in]
seqdb  CSeqDB pointer for database to be searched.
kmerfile  assume same names as seqdb volumes if kEmptyStr [in]

Definition at line 43 of file blastkmer.cpp.

References eUnknown, and NCBI_THROW.

◆ CBlastKmer() [2/2]

CBlastKmer::CBlastKmer ( SSeqLoc &  query,
CRef< CBlastKmerOptions >  options,
const string &  dbname 
)

Constructor.

Parameters
query  specifies one protein query sequence [in]
options  sets kblastp parameters. [in]
dbname  name of a BLAST database [in]

Definition at line 62 of file blastkmer.cpp.

References dbname(), CSeqDB::eProtein, eUnknown, CSeqDB::FindVolumePaths(), m_KmerFiles, m_QueryVector, m_SeqDB, NCBI_THROW, query, and CBlastKmerOptions::Validate().

◆ ~CBlastKmer()

CBlastKmer::~CBlastKmer ( )
inline

Destructor.

Definition at line 94 of file blastkmer.hpp.

Member Function Documentation

◆ Run()

CRef< CBlastKmerResultsSet > CBlastKmer::Run ( void  )

◆ RunSearches()

CRef< CBlastKmerResultsSet > CBlastKmer::RunSearches ( )

Deprecated:
Use Run method instead.

Performs search on one or more queries. Just calls Run method.

Exceptions
CInputException  if the query has no KMERs

Definition at line 319 of file blastkmer.cpp.

References Run().

◆ SetGiListLimit() [1/2]

void CBlastKmer::SetGiListLimit ( CRef< CSeqDBGiList >  list )
inline

Limits output by GILIST.

Parameters
list  CRef<CSeqDBGiList> to limit by [in]

Definition at line 110 of file blastkmer.hpp.

◆ SetGiListLimit() [2/2]

void CBlastKmer::SetGiListLimit ( CRef< CSeqDBNegativeList >  list )
inline

Limits output by negative GILIST.

Parameters
list  CRef<CSeqDBNegativeList> to limit by [in]

Definition at line 114 of file blastkmer.hpp.

◆ x_ProcessQuery()

void CBlastKmer::x_ProcessQuery ( const string &  query_seq,
SOneBlastKmerSearch &  kmerSearch,
const SBlastKmerParameters &  kmerParams,
uint32_t *  a,
uint32_t *  b,
vector< vector< int > > &  kvalues,
vector< int >  badMers 
)
private

Preprocess query to sequence hashes.

Parameters
query_seq  contains query in ncbistdaa [in]
seq_hash  All hash values for query [out]
query_LSH_hashes  LSH values [out]
num_hashes  Number of hash functions [in]
rows_per_band  [in]
samples  how many samples to check (Buhler) [in]
a  Array of num_hash hash values [in]
b  Array of num_hash hash values [in]
kmerNum  size of kmer [in]
alphabetChoice  0 is 15 letter, 1 is 10 letter alphabet [in]
version  which version of the kmer index to use [in]
kvalues  Buhler LSH points [in]
badMers  Overrepresented KMERs [in]
Exceptions
CInputException  if the query has no KMERs

Definition at line 79 of file blastkmer.cpp.

References a, SBlastKmerParameters::alphabetChoice, b, SBlastKmerParameters::chunkSize, eUnknown, get_LSH_hashes(), get_LSH_hashes2(), get_LSH_hashes5(), SBlastKmerParameters::kmerNum, minhash_query(), minhash_query2(), NCBI_THROW, SBlastKmerParameters::numHashes, SOneBlastKmerSearch::queryHash, SOneBlastKmerSearch::queryLSHHash, SBlastKmerParameters::rowsPerBand, SBlastKmerParameters::samples, and SBlastKmerParameters::version.

Referenced by x_SearchMultipleQueries().
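To make the idea of "preprocessing a query to sequence hashes" concrete, here is a generic min-hash sketch. It is an illustration under stated assumptions, not the toolkit's implementation: `KmerCodes`, `MinHashSignature`, the base-32 packing, and the modulus `kPrime` are all hypothetical, and `a` and `b` are assumed to act as per-function coefficients of a hash of the form (a*x + b) mod p.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <string>
#include <vector>

// Illustrative modulus (2^32 - 5, a prime); the real index may use another value.
constexpr uint64_t kPrime = 4294967291ULL;

// Pack every k-letter window into an integer. Base-32 packing is an assumption
// made for this sketch; it fits in 32 bits for k <= 6.
std::vector<uint32_t> KmerCodes(const std::string& seq, size_t k)
{
    std::vector<uint32_t> codes;
    if (k == 0 || seq.size() < k) {
        return codes;  // query shorter than k: no k-mers at all
    }
    for (size_t i = 0; i + k <= seq.size(); ++i) {
        uint32_t code = 0;
        for (size_t j = 0; j < k; ++j) {
            code = code * 32u + (static_cast<unsigned char>(seq[i + j]) & 31u);
        }
        codes.push_back(code);
    }
    return codes;
}

// One signature entry per (a[h], b[h]) pair: the minimum of (a*x + b) mod p
// over all k-mer codes x.
std::vector<uint32_t> MinHashSignature(const std::vector<uint32_t>& codes,
                                       const std::vector<uint32_t>& a,
                                       const std::vector<uint32_t>& b)
{
    assert(a.size() == b.size());
    std::vector<uint32_t> sig(a.size(), std::numeric_limits<uint32_t>::max());
    for (size_t h = 0; h < a.size(); ++h) {
        for (uint32_t x : codes) {
            const uint64_t v = (static_cast<uint64_t>(a[h]) * x + b[h]) % kPrime;
            if (v < sig[h]) {
                sig[h] = static_cast<uint32_t>(v);
            }
        }
    }
    return sig;
}
```

A query with no k-mers yields an empty code list here; the documented method reports that case with a CInputException instead.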

◆ x_RunKmerFile()

void CBlastKmer::x_RunKmerFile ( const vector< vector< uint32_t > > &  query_hash,
const vector< vector< uint32_t > > &  query_LSH_hash,
CMinHashFile &  mhfile,
TBlastKmerPrelimScoreVector &  score_vector,
BlastKmerStats &  kmer_stats 
)
private

Search individual kmer file.

Parameters
query_hash  All hash values for query [in]
query_LSH_hashes  LSH values [in]
filename  basename of kmer files. [in]
score_vector  results vector [out]
kmer_stats  ancillary information about run [out]

Definition at line 105 of file blastkmer.cpp.

References get_LSH_match_from_hash(), CMinHashFile::GetAlphabet(), CMinHashFile::GetLSHArray(), CBlastKmerOptions::GetMinHits(), CMinHashFile::GetNumHashes(), CMinHashFile::GetNumSeqs(), CBlastKmerOptions::GetThresh(), CMinHashFile::GetVersion(), m_Opts, neighbor_query(), and BlastKmerStats::num_sequences.

Referenced by x_SearchMultipleQueries().
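The search step combines LSH lookups with a comparison of hash values against a threshold. The sketch below shows the general banding technique only; it is not the toolkit's code. `BandKeys`, `IsCandidate`, and `EstimateSimilarity` are hypothetical names, the band hash is an FNV-1a-like mix chosen for illustration, and `min_hits` and the similarity value are only assumed to play roles similar to what GetMinHits() and GetThresh() control.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Collapse each band of rows_per_band consecutive signature values into one key.
// The mixing is FNV-1a-like over whole 32-bit words (an illustrative choice).
std::vector<uint32_t> BandKeys(const std::vector<uint32_t>& sig, size_t rows_per_band)
{
    assert(rows_per_band > 0);
    std::vector<uint32_t> keys;
    for (size_t start = 0; start + rows_per_band <= sig.size(); start += rows_per_band) {
        uint32_t key = 2166136261u;
        for (size_t r = 0; r < rows_per_band; ++r) {
            key = (key ^ sig[start + r]) * 16777619u;
        }
        keys.push_back(key);
    }
    return keys;
}

// A subject becomes a candidate when at least min_hits bands match the query's
// key at the same band index.
bool IsCandidate(const std::vector<uint32_t>& query_keys,
                 const std::vector<uint32_t>& subject_keys,
                 size_t min_hits)
{
    assert(query_keys.size() == subject_keys.size());
    size_t hits = 0;
    for (size_t i = 0; i < query_keys.size(); ++i) {
        if (query_keys[i] == subject_keys[i]) {
            ++hits;
        }
    }
    return hits >= min_hits;
}

// Fraction of signature positions that agree: an estimate of the Jaccard
// similarity of the two k-mer sets, to be compared against a threshold.
double EstimateSimilarity(const std::vector<uint32_t>& q, const std::vector<uint32_t>& s)
{
    assert(q.size() == s.size() && !q.empty());
    size_t same = 0;
    for (size_t i = 0; i < q.size(); ++i) {
        if (q[i] == s[i]) {
            ++same;
        }
    }
    return static_cast<double>(same) / static_cast<double>(q.size());
}
```

The design point is that the cheap band comparison prunes most subjects, so the full signature comparison only runs on the few candidates that survive.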

◆ x_SearchMultipleQueries()

CRef< CBlastKmerResultsSet > CBlastKmer::x_SearchMultipleQueries ( int  firstQuery,
int  numQuery,
const SBlastKmerParameters &  kmerParams,
uint32_t *  a,
uint32_t *  b,
vector< vector< int > > &  kValues,
vector< int >  badMers 
)
private

Member Data Documentation

◆ m_GIList

CRef<CSeqDBGiList> CBlastKmer::m_GIList
private

GIList to limit search by.

Definition at line 170 of file blastkmer.hpp.

Referenced by x_SearchMultipleQueries().

◆ m_KmerFiles

vector<string> CBlastKmer::m_KmerFiles
private

Name of the kmer files.

Definition at line 167 of file blastkmer.hpp.

Referenced by CBlastKmer(), Run(), and x_SearchMultipleQueries().

◆ m_NegGIList

CRef<CSeqDBNegativeList> CBlastKmer::m_NegGIList
private

Negative GIList to limit search by.

Only one of the gilist or negative GIlist should be set.

Definition at line 174 of file blastkmer.hpp.

Referenced by x_SearchMultipleQueries().
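The rule that only one of the two lists should be set can be illustrated with a small stand-alone filter. This is a sketch with hypothetical types: `IdFilter` and `ApplyFilter` are not toolkit names, and plain sets of integer IDs stand in for CSeqDBGiList and CSeqDBNegativeList.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for the two list members.
struct IdFilter {
    std::set<int64_t> keep;  // positive list: only these IDs pass
    std::set<int64_t> drop;  // negative list: these IDs are excluded
};

// Return the hits that survive the filter; an empty filter passes everything.
std::vector<int64_t> ApplyFilter(const std::vector<int64_t>& hits, const IdFilter& filter)
{
    // Mirrors the note above: at most one of the two lists is expected to be set.
    assert(filter.keep.empty() || filter.drop.empty());
    std::vector<int64_t> out;
    for (int64_t id : hits) {
        if (!filter.keep.empty() && filter.keep.count(id) == 0) {
            continue;  // not on the positive list
        }
        if (filter.drop.count(id) != 0) {
            continue;  // on the negative list
        }
        out.push_back(id);
    }
    return out;
}
```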

◆ m_Opts

CRef<CBlastKmerOptions> CBlastKmer::m_Opts
private

Specifies values for some options (e.g., threshold)

Definition at line 161 of file blastkmer.hpp.

Referenced by x_RunKmerFile(), and x_SearchMultipleQueries().

◆ m_QueryVector

TSeqLocVector CBlastKmer::m_QueryVector
private

Holds the query seqloc and scope.

Definition at line 158 of file blastkmer.hpp.

Referenced by CBlastKmer(), Run(), and x_SearchMultipleQueries().

◆ m_SeqDB

CRef<CSeqDB> CBlastKmer::m_SeqDB
private

CSeqDB for BLAST db.

Definition at line 164 of file blastkmer.hpp.

Referenced by CBlastKmer(), and x_SearchMultipleQueries().


The documentation for this class was generated from the following files:
blastkmer.hpp
blastkmer.cpp
Modified on Tue Dec 05 02:08:01 2023 by modify_doxy.py rev. 669887