NCBI C++ ToolKit
CSeqMaskerOstatOpt Class Reference
Class responsible for collecting unit count statistics and representing them in an optimized hash-based format.
#include <algo/winmask/seq_masker_ostat_opt.hpp>
Classes | |
class | Exception |
Exceptions that CSeqMaskerOstatOpt might throw. | |
struct | params |
Parameters of the optimized data structure. | |
Public Member Functions | |
CSeqMaskerOstatOpt (CNcbiOstream &os, Uint2 sz, bool alloc, string const &metadata) | |
Object constructor. | |
virtual | ~CSeqMaskerOstatOpt () |
Object destructor. | |
Public Member Functions inherited from CSeqMaskerOstat | |
CSeqMaskerOstat (CNcbiOstream &os, bool alloc, string const &metadata) | |
Object constructor. | |
virtual | ~CSeqMaskerOstat () |
Trivial object destructor. | |
void | setUnitSize (Uint1 us) |
Set the unit size value. | |
void | setUnitCount (Uint4 unit, Uint4 count) |
Add count value for a particular unit. | |
void | setComment (const string &msg) |
Add a comment to the unit counts file. | |
void | setParam (const string &name, Uint4 value) |
Set a value of a WindowMasker parameter. | |
void | finalize () |
Perform any final tasks required to generate unit counts in the particular format. | |
void | SetStatAlgoVersion (CSeqMaskerVersion const &v) |
Set the counts generation algorithm version explicitly (needed for conversions). | |
virtual CSeqMaskerVersion const & | GetStatFmtVersion () const =0 |
Get actual counts format version. | |
void | SetMaxCount (Uint4 mc) |
void | SetCount (Uint4 count, double pct) |
Public Member Functions inherited from CObject | |
CObject (void) | |
Constructor. | |
CObject (const CObject &src) | |
Copy constructor. | |
virtual | ~CObject (void) |
Destructor. | |
CObject & | operator= (const CObject &src) THROWS_NONE |
Assignment operator. | |
bool | CanBeDeleted (void) const THROWS_NONE |
Check if object can be deleted. | |
bool | IsAllocatedInPool (void) const THROWS_NONE |
Check if object is allocated in memory pool (not system heap). | |
bool | Referenced (void) const THROWS_NONE |
Check if object is referenced. | |
bool | ReferencedOnlyOnce (void) const THROWS_NONE |
Check if object is referenced only once. | |
void | AddReference (void) const |
Add reference to object. | |
void | RemoveReference (void) const |
Remove reference to object. | |
void | ReleaseReference (void) const |
Remove reference without deleting object. | |
virtual void | DoNotDeleteThisObject (void) |
Mark this object as not allocated in heap – do not delete this object. | |
virtual void | DoDeleteThisObject (void) |
Mark this object as allocated in heap – object can be deleted. | |
void * | operator new (size_t size) |
Define new operator for memory allocation. | |
void * | operator new[] (size_t size) |
Define new[] operator for 'array' memory allocation. | |
void | operator delete (void *ptr) |
Define delete operator for memory deallocation. | |
void | operator delete[] (void *ptr) |
Define delete[] operator for memory deallocation. | |
void * | operator new (size_t size, void *place) |
Define new operator. | |
void | operator delete (void *ptr, void *place) |
Define delete operator. | |
void * | operator new (size_t size, CObjectMemoryPool *place) |
Define new operator using memory pool. | |
void | operator delete (void *ptr, CObjectMemoryPool *place) |
Define delete operator. | |
virtual void | DebugDump (CDebugDumpContext ddc, unsigned int depth) const |
Define method for dumping debug information. | |
Public Member Functions inherited from CDebugDumpable | |
CDebugDumpable (void) | |
virtual | ~CDebugDumpable (void) |
void | DebugDumpText (ostream &out, const string &bundle, unsigned int depth) const |
void | DebugDumpFormat (CDebugDumpFormatter &ddf, const string &bundle, unsigned int depth) const |
void | DumpToConsole (void) const |
Protected Member Functions | |
virtual void | write_out (const params &p) const =0 |
Dump the unit counts data to the output stream according to the requested format. | |
Uint1 | UnitSize () const |
Get the unit size value in bases. | |
const vector< Uint4 > & | GetParams () const |
Get the values of masking parameters. | |
virtual void | doSetUnitSize (Uint4 us) |
Set the unit size value. | |
virtual void | doSetUnitCount (Uint4 unit, Uint4 count) |
Set count information for the given unit. | |
virtual void | doFinalize () |
Generate a hash function and dump the optimized unit counts data to the output stream. | |
Protected Member Functions inherited from CSeqMaskerOstat | |
string | FormatParameters () const |
Format algorithm parameters into a string. | |
string | FormatMetaData () const |
Combine version data and metadata into a single string. | |
void | WriteBinMetaData (std::ostream &os) const |
Write metadata in binary format. | |
virtual void | doSetComment (const string &) |
virtual void | doSetParam (const string &, Uint4) |
Protected Member Functions inherited from CObject | |
virtual void | DeleteThis (void) |
Virtual method "deleting" this object. | |
Private Member Functions | |
Uint1 | findBestRoff (Uint1 k, Uint1 &max_coll, Uint4 &M, Uint4 *ht) |
void | createCacheBitArray (Uint4 **cba) |
Private Attributes | |
Uint2 | size_requested |
Uint1 | unit_bit_size |
vector< Uint4 > | units |
vector< Uint2 > | counts |
Additional Inherited Members | |
Public Types inherited from CObject | |
enum | EAllocFillMode { eAllocFillNone = 1 , eAllocFillZero , eAllocFillPattern } |
Control filling of newly allocated memory. | |
typedef CObjectCounterLocker | TLockerType |
Default locker type for CRef. | |
typedef atomic< Uint8 > | TCounter |
Counter type is CAtomicCounter. | |
typedef Uint8 | TCount |
Alias for value type of counter. | |
Static Public Member Functions inherited from CObject | |
static NCBI_XNCBI_EXPORT void | ThrowNullPointerException (void) |
Define method to throw null pointer exception. | |
static NCBI_XNCBI_EXPORT void | ThrowNullPointerException (const type_info &type) |
static EAllocFillMode | GetAllocFillMode (void) |
static void | SetAllocFillMode (EAllocFillMode mode) |
static void | SetAllocFillMode (const string &value) |
Set mode from configuration parameter value. | |
Static Public Member Functions inherited from CDebugDumpable | |
static void | EnableDebugDump (bool on) |
Static Public Attributes inherited from CSeqMaskerOstat | |
static char const * | STAT_ALGO_COMPONENT_NAME |
static CSeqMaskerVersion | StatAlgoVersion |
Version of the statistics generation algorithm. | |
Static Public Attributes inherited from CObject | |
static const TCount | eCounterBitsCanBeDeleted = 1 << 0 |
Define possible object states. | |
static const TCount | eCounterBitsInPlainHeap = 1 << 1 |
Heap signature was found. | |
static const TCount | eCounterBitsPlaceMask |
Mask for 'in heap' state flags. | |
static const int | eCounterStep = 1 << 2 |
Skip over the "in heap" bits. | |
static const TCount | eCounterValid = TCount(1) << (sizeof(TCount) * 8 - 2) |
Minimal value for valid objects (reference counter is zero). Must be a single-bit value. | |
static const TCount | eCounterStateMask |
Valid object, and object in heap. | |
Protected Attributes inherited from CSeqMaskerOstat | |
CNcbiOstream & | out_stream |
Refers to the C++ stream that should be used to write out the unit counts data. | |
bool | alloc |
Flag indicating that the stream was allocated. | |
string | metadata |
Metadata string. | |
Uint1 | unit_size |
Unit size. | |
vector< Uint4 > | pvalues |
vector< pair< Uint4, Uint4 > > | counts |
Unit counts. | |
CSeqMaskerVersion | fmt_gen_algo_ver |
Version of the algorithm used to generate counts. | |
Uint4 | max_count = 0 |
Information about counts and corresponding pct cutoffs. | |
std::vector< double > | count_map |
Static Protected Attributes inherited from CSeqMaskerOstat | |
static const char * | PARAMS [] = { "t_low", "t_extend", "t_threshold", "t_high" } |
Algorithm parameter names. | |
Class responsible for collecting unit count statistics and representing them in an optimized hash-based format.
Definition at line 48 of file seq_masker_ostat_opt.hpp.
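CSeqMaskerOstatOpt::CSeqMaskerOstatOpt (CNcbiOstream &os, Uint2 sz, bool alloc, string const &metadata) | explicit |
Object constructor.
os | output stream object, forwarded to the CSeqMaskerOstat base |
sz | requested size of the unit counts file in megabytes |
alloc | flag to indicate that the stream was allocated |
metadata | metadata string |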
Definition at line 61 of file seq_masker_ostat_opt.cpp.
virtual CSeqMaskerOstatOpt::~CSeqMaskerOstatOpt () | inline virtual |
Object destructor.
Definition at line 86 of file seq_masker_ostat_opt.hpp.
void CSeqMaskerOstatOpt::createCacheBitArray (Uint4 **cba) | private |
Definition at line 120 of file seq_masker_ostat_opt.cpp.
References _TRACE, counts, ERR_POST, i, CSeqMaskerUtil::reverse_complement(), ncbi::grid::netcache::search::fields::size, unit_bit_size, units, and Warning().
Referenced by doFinalize().
void CSeqMaskerOstatOpt::doFinalize () | protected virtual |
Generate a hash function and dump the optimized unit counts data to the output stream.
Reimplemented from CSeqMaskerOstat.
Definition at line 212 of file seq_masker_ostat_opt.cpp.
References _TRACE, counts, createCacheBitArray(), findBestRoff(), AutoPtr< X, Del >::get(), CSeqMaskerUtil::hash_code(), i, LOG_POST, M, MB, NCBI_THROW, AutoPtr< X, Del >::reset(), size_requested, unit_bit_size, units, and write_out().
void CSeqMaskerOstatOpt::doSetUnitCount (Uint4 unit, Uint4 count) | protected virtual |
Set count information for the given unit.
unit | the unit |
count | the number of times the unit and its reverse complement appear in the genome |
Implements CSeqMaskerOstat.
Definition at line 82 of file seq_masker_ostat_opt.cpp.
References count, counts, GROW_CHUNK, max(), and units.
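The count passed to doSetUnitCount() covers a unit together with its reverse complement. As an illustration only, here is a minimal sketch of reverse-complementing a packed unit. It assumes two bits per base with A=0, C=1, G=2, T=3 and the first base stored in the highest-order bits; that encoding and the name `ReverseComplement` are assumptions of this sketch, and the function is not the toolkit's CSeqMaskerUtil::reverse_complement().

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not CSeqMaskerUtil::reverse_complement().
// Assumes 2 bits per base (A=0, C=1, G=2, T=3), so complement(b) == 3 - b.
// unit_size is the unit length in bases (1..16 for a 32-bit unit).
inline std::uint32_t ReverseComplement(std::uint32_t unit, unsigned unit_size)
{
    assert(unit_size >= 1 && unit_size <= 16);
    std::uint32_t result = 0;
    for (unsigned i = 0; i < unit_size; ++i) {
        std::uint32_t base = unit & 0x3u;      // take the last remaining base
        result = (result << 2) | (3u - base);  // append its complement
        unit >>= 2;                            // drop the consumed base
    }
    return result;
}
```

Under this encoding the 3-base unit ACG is the value 6 and its reverse complement CGT is 27, so a caller that wants one entry per unit pair could, for example, store the count under the smaller of the two values.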
void CSeqMaskerOstatOpt::doSetUnitSize (Uint4 us) | protected virtual |
Set the unit size value.
us | the unit size |
Reimplemented from CSeqMaskerOstat.
Definition at line 75 of file seq_masker_ostat_opt.cpp.
References CSeqMaskerOstat::doSetUnitSize(), and unit_bit_size.
Uint1 CSeqMaskerOstatOpt::findBestRoff (Uint1 k, Uint1 &max_coll, Uint4 &M, Uint4 *ht) | private |
Definition at line 163 of file seq_masker_ostat_opt.cpp.
References CSeqMaskerUtil::hash_code(), i, l(), M, t, unit_bit_size, and units.
Referenced by doFinalize().
const vector< Uint4 > & CSeqMaskerOstatOpt::GetParams () const | protected |
Get the values of masking parameters.
The masking parameters are a vector of four integers holding the values of T_low, T_extend, T_threshold, and T_high.
Definition at line 71 of file seq_masker_ostat_opt.cpp.
References CSeqMaskerOstat::pvalues.
Referenced by CSeqMaskerOstatOptBin::write_out(), and CSeqMaskerOstatOptAscii::write_out().
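To make the layout of the parameter vector concrete, the following sketch pairs each of the four values with the names listed in CSeqMaskerOstat::PARAMS. The function name `DescribeParams` and the `name=value` output layout are assumptions made for this example; they are not the output of CSeqMaskerOstat::FormatParameters().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical formatter for illustration. The "name=value" layout is an
// assumption of this sketch, not the toolkit's FormatParameters() output.
inline std::string DescribeParams(const std::vector<std::uint32_t>& pvalues)
{
    // Parameter names in the order given by CSeqMaskerOstat::PARAMS.
    static const char* const kNames[] = {
        "t_low", "t_extend", "t_threshold", "t_high"
    };
    const std::size_t kCount = sizeof(kNames) / sizeof(kNames[0]);
    assert(pvalues.size() == kCount);

    std::string result;
    for (std::size_t i = 0; i < kCount; ++i) {
        if (i != 0) result += ' ';
        result += kNames[i];
        result += '=';
        result += std::to_string(pvalues[i]);
    }
    return result;
}
```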
Uint1 CSeqMaskerOstatOpt::UnitSize () const | protected |
Get the unit size value in bases.
Definition at line 67 of file seq_masker_ostat_opt.cpp.
References unit_bit_size.
Referenced by CSeqMaskerOstatOptBin::write_out(), and CSeqMaskerOstatOptAscii::write_out().
virtual void CSeqMaskerOstatOpt::write_out (const params &p) const =0 | protected pure virtual |
Dump the unit counts data to the output stream according to the requested format.
Derived classes should override this function to format the data.
p | data structure parameters |
Implemented in CSeqMaskerOstatOptAscii, and CSeqMaskerOstatOptBin.
Referenced by doFinalize().
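The relationship between doFinalize() and write_out() follows the template method pattern: the base class prepares the data and each derived class decides how to serialize it. The sketch below shows that shape with standalone classes. All `Sketch*` names and the fields of `SketchParams` are hypothetical stand-ins, since the real `params` fields and the real ASCII and binary layouts are not described on this page.

```cpp
#include <cassert>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for CSeqMaskerOstatOpt::params.
struct SketchParams {
    std::uint8_t unit_size;
    std::vector<std::uint32_t> table;
};

class SketchOstatBase {
public:
    explicit SketchOstatBase(std::ostream& os) : out_(os) {}
    virtual ~SketchOstatBase() = default;

    // Plays the role of doFinalize(): validate the data, then delegate.
    void Finalize(const SketchParams& p) const
    {
        assert(p.unit_size > 0);
        write_out(p);
    }

protected:
    // Format-specific output, supplied by each derived class.
    virtual void write_out(const SketchParams& p) const = 0;
    std::ostream& out_;
};

// Text output: one decimal value per line (layout assumed for the sketch).
class SketchOstatAscii : public SketchOstatBase {
public:
    using SketchOstatBase::SketchOstatBase;
protected:
    void write_out(const SketchParams& p) const override
    {
        out_ << static_cast<unsigned>(p.unit_size) << '\n';
        for (std::uint32_t v : p.table) out_ << v << '\n';
    }
};

// Binary output: raw bytes in host byte order (layout assumed for the sketch).
class SketchOstatBin : public SketchOstatBase {
public:
    using SketchOstatBase::SketchOstatBase;
protected:
    void write_out(const SketchParams& p) const override
    {
        out_.write(reinterpret_cast<const char*>(&p.unit_size), sizeof p.unit_size);
        for (std::uint32_t v : p.table)
            out_.write(reinterpret_cast<const char*>(&v), sizeof v);
    }
};
```

Any std::ostream can serve as the sink, for example a std::ostringstream in a test. Keeping the hook protected and pure virtual means callers can only reach it through the finalization step, which matches how write_out() is referenced only by doFinalize() here.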
vector< Uint2 > CSeqMaskerOstatOpt::counts | private |
Definition at line 177 of file seq_masker_ostat_opt.hpp.
Referenced by createCacheBitArray(), doFinalize(), and doSetUnitCount().
Uint2 CSeqMaskerOstatOpt::size_requested | private |
Definition at line 173 of file seq_masker_ostat_opt.hpp.
Referenced by doFinalize().
Uint1 CSeqMaskerOstatOpt::unit_bit_size | private |
Definition at line 174 of file seq_masker_ostat_opt.hpp.
Referenced by createCacheBitArray(), doFinalize(), doSetUnitSize(), findBestRoff(), and UnitSize().
vector< Uint4 > CSeqMaskerOstatOpt::units | private |
Definition at line 176 of file seq_masker_ostat_opt.hpp.
Referenced by createCacheBitArray(), doFinalize(), doSetUnitCount(), and findBestRoff().