NCBI C++ ToolKit
#include <objtools/blast/gene_info_reader/gene_info_reader.hpp>
Public Member Functions
CGeneInfoFileReader (const string &strGi2GeneFile, const string &strGene2OffsetFile, const string &strGi2OffsetFile, const string &strAllGeneDataFile, const string &strGene2GiFile, bool bGiToOffsetLookup=true)
Construct using direct paths.
CGeneInfoFileReader (bool bGiToOffsetLookup=true)
Construct using paths read from an environment variable.
virtual ~CGeneInfoFileReader ()
Destructor.
virtual bool GetGeneIdsForGi (TGi gi, TGeneIdList &geneIdList)
GetGeneIdsForGi implementation, see IGeneInfoInput.
virtual bool GetRNAGisForGeneId (int geneId, TGiList &giList)
GetRNAGisForGeneId implementation, see IGeneInfoInput.
virtual bool GetProteinGisForGeneId (int geneId, TGiList &giList)
GetProteinGisForGeneId implementation, see IGeneInfoInput.
virtual bool GetGenomicGisForGeneId (int geneId, TGiList &giList)
GetGenomicGisForGeneId implementation, see IGeneInfoInput.
virtual bool GetGeneInfoForGi (TGi gi, TGeneInfoList &infoList)
GetGeneInfoForGi implementation, see IGeneInfoInput.
virtual bool GetGeneInfoForId (int geneId, TGeneInfoList &infoList)
GetGeneInfoForId implementation, see IGeneInfoInput.
Public Member Functions inherited from IGeneInfoInput
virtual ~IGeneInfoInput ()
Destructor.
Private Member Functions
void x_MapMemFiles ()
Memory-map all the files.
void x_UnmapMemFiles ()
Unmap all the memory-mapped files.
bool x_GiToGeneId (TGi gi, list< int > &listGeneIds)
Fill the Gene ID list given a Gi.
bool x_GeneIdToOffset (int geneId, int &nOffset)
Set the offset value given a Gene ID.
bool x_GiToOffset (TGi gi, list< int > &listOffsets)
Set the offset value given a Gi.
bool x_GeneIdToGi (int geneId, int iGiField, list< TGi > &listGis)
Fill the Gi list given a Gene ID, and the Gi field index, which represents the Gi type to be read from the file.
bool x_OffsetToInfo (int nOffset, CRef< CGeneInfo > &info)
Read Gene data at the given offset and create the info object.
Private Attributes
string m_strGi2GeneFile
Path to the Gi to Gene ID file.
string m_strGene2OffsetFile
Path to the Gene ID to Offset file.
string m_strGi2OffsetFile
Path to the Gi to Offset file.
string m_strGene2GiFile
Path to the Gene ID to Gi file.
string m_strAllGeneDataFile
Path to the file containing all the Gene data.
bool m_bGiToOffsetLookup
Perform Gi to Offset lookups directly.
unique_ptr< CMemoryFile > m_memGi2GeneFile
Memory-mapped Gi to Gene ID file.
unique_ptr< CMemoryFile > m_memGene2OffsetFile
Memory-mapped Gene ID to Offset file.
unique_ptr< CMemoryFile > m_memGi2OffsetFile
Memory-mapped Gi to Offset file.
unique_ptr< CMemoryFile > m_memGene2GiFile
Memory-mapped Gene ID to Gi file.
CNcbiIfstream m_inAllData
Input stream for the Gene data file.
TGeneIdToGeneInfoMap m_mapIdToInfo
Cached map of looked up Gene Info objects.
Additional Inherited Members
Public Types inherited from IGeneInfoInput
typedef list< TGi > TGiList
List of Gis.
typedef list< int > TGeneIdList
List of Gene IDs.
typedef map< int, CRef< CGeneInfo > > TGeneIdToGeneInfoMap
Gene ID to Gene Information map.
typedef vector< CRef< CGeneInfo > > TGeneInfoList
List of Gene Information objects.
Static Public Member Functions inherited from CGeneFileUtils
static bool CheckDirExistence (const string &strDir)
Check if a directory exists, given its name.
static bool CheckExistence (const string &strFile)
Check if a file exists, given its name.
static Int8 GetLength (const string &strFile)
Get the length of a file, given its name.
static bool OpenTextInputFile (const string &strFileName, CNcbiIfstream &in)
Open the given text file for reading.
static bool OpenBinaryInputFile (const string &strFileName, CNcbiIfstream &in)
Open the given binary file for reading.
static bool OpenTextOutputFile (const string &strFileName, CNcbiOfstream &out)
Open the given text file for writing.
static bool OpenBinaryOutputFile (const string &strFileName, CNcbiOfstream &out)
Open the given binary file for writing.
static void WriteRecord (CNcbiOfstream &out, STwoIntRecord &record)
Write a pair of integers to the file.
static void ReadRecord (CNcbiIfstream &in, STwoIntRecord &record)
Read a pair of integers from the file.
template<int k_nFields>
static void WriteRecord (CNcbiOfstream &out, SMultiIntRecord< k_nFields > &record)
Write an n-tuple of integers to the file.
template<int k_nFields>
static void ReadRecord (CNcbiIfstream &in, SMultiIntRecord< k_nFields > &record)
Read an n-tuple of integers from the file.
static void WriteGeneInfo (CNcbiOfstream &out, CRef< CGeneInfo > info, int &nCurrentOffset)
Write a Gene info object to the file.
static void ReadGeneInfo (CNcbiIfstream &in, int nOffset, CRef< CGeneInfo > &info)
Read a Gene info object from the file.
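The record helpers listed above write and read fixed-size tuples of integers. The following is a minimal sketch of that idea using only the C++ standard library. IntRecord, WriteIntRecord, and ReadIntRecord are hypothetical names introduced for illustration, and the 32-bit native-endian layout is an assumption; none of this is taken from the toolkit's STwoIntRecord or SMultiIntRecord implementation.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical stand-in for a fixed-size integer record
// (NOT the toolkit's STwoIntRecord / SMultiIntRecord).
template <std::size_t N>
struct IntRecord {
    std::array<std::int32_t, N> fields{};
};

// Write the raw bytes of one record to a binary stream.
// Assumption: native byte order, 32-bit fields.
template <std::size_t N>
void WriteIntRecord(std::ostream& out, const IntRecord<N>& record) {
    out.write(reinterpret_cast<const char*>(record.fields.data()),
              static_cast<std::streamsize>(sizeof(std::int32_t) * N));
}

// Read one record from a binary stream.
// Returns false if the stream did not contain a full record.
template <std::size_t N>
bool ReadIntRecord(std::istream& in, IntRecord<N>& record) {
    in.read(reinterpret_cast<char*>(record.fields.data()),
            static_cast<std::streamsize>(sizeof(std::int32_t) * N));
    return static_cast<bool>(in);
}
```

Because every record has the same size, record i in such a file starts at byte i * sizeof(record), which is what makes binary search over a sorted, memory-mapped file practical.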
Class implementing the IGeneInfoInput interface using binary files.
CGeneInfoFileReader reads and memory-maps sorted binary files for fast Gi to Gene ID, Gene ID to Gene Info, Gi to Gene Info, and Gene ID to Gi conversions. Each Gene Info lookup uses two files: one contains (Gi, Offset) or (Gene ID, Offset) pairs, and the other contains all the Gene data. The lookup is performed in two steps: first, the offset to the Gene data is obtained; then the Gene data line at that offset is read and parsed, and the corresponding CGeneInfo object is constructed. The paths to the pre-computed, sorted files are either passed directly to the constructor or read from a path stored in an environment variable (the preferred approach).
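The two-step lookup can be illustrated with a small standalone sketch. It assumes an in-memory index sorted by key and a text data stream with one record per line. KeyOffset, FindOffset, ReadLineAt, and Lookup are hypothetical names, and the sketch returns a single offset per key, whereas x_GiToOffset fills a list of offsets. It is not the class's actual implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <istream>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical (key, offset) pair; the index must be sorted by key.
struct KeyOffset {
    std::int32_t key;     // e.g. a Gi or a Gene ID
    std::int32_t offset;  // byte offset of the record in the data stream
};

// Step 1: binary-search the sorted index for the offset of the given key.
std::optional<std::int32_t> FindOffset(const std::vector<KeyOffset>& index,
                                       std::int32_t key) {
    auto it = std::lower_bound(
        index.begin(), index.end(), key,
        [](const KeyOffset& rec, std::int32_t k) { return rec.key < k; });
    if (it == index.end() || it->key != key) {
        return std::nullopt;  // key is not present in the index
    }
    return it->offset;
}

// Step 2: seek to the offset and read one line of data.
std::optional<std::string> ReadLineAt(std::istream& data, std::int32_t offset) {
    data.clear();  // reset any EOF/fail state left by an earlier read
    data.seekg(offset);
    std::string line;
    if (!std::getline(data, line)) {
        return std::nullopt;
    }
    return line;
}

// Both steps combined: key -> offset -> data line.
std::optional<std::string> Lookup(const std::vector<KeyOffset>& index,
                                  std::istream& data, std::int32_t key) {
    const auto offset = FindOffset(index, key);
    if (!offset) {
        return std::nullopt;
    }
    return ReadLineAt(data, *offset);
}
```

Keeping the index separate from the data lets the index stay small and fixed-width, while the variable-length Gene data is touched only for keys that are actually found.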
Definition at line 85 of file gene_info_reader.hpp.
CGeneInfoFileReader::CGeneInfoFileReader (const string &strGi2GeneFile, const string &strGene2OffsetFile, const string &strGi2OffsetFile, const string &strAllGeneDataFile, const string &strGene2GiFile, bool bGiToOffsetLookup=true)
Construct using direct paths.
This version of the constructor takes the paths to the pre-computed binary files and attempts to open and map the files.
strGi2GeneFile | Path to the Gi to Gene ID file.
strGene2OffsetFile | Path to the Gene ID to Offset file.
strGi2OffsetFile | Path to the Gi to Offset file.
strAllGeneDataFile | Path to the Gene data file.
strGene2GiFile | Path to the Gene ID to Gi file.
bGiToOffsetLookup | Perform Gi to Offset lookups directly.
CGeneInfoFileReader::CGeneInfoFileReader (bool bGiToOffsetLookup=true)
Construct using paths read from an environment variable.
This version of the constructor reads the paths to the pre-computed binary files from an environment variable and attempts to open and map the files.
bGiToOffsetLookup | Perform Gi to Offset lookups directly.
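As an illustration of the environment-variable approach, the sketch below resolves a directory from an environment variable and derives the five file paths from it. This page does not name the variable or the files, so the variable name is passed in as a parameter and the file names are placeholders. GeneInfoPaths, BuildPaths, and PathsFromEnv are hypothetical helpers, not toolkit functions.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical bundle of the five file paths the reader needs.
struct GeneInfoPaths {
    std::string gi2Gene;
    std::string gene2Offset;
    std::string gi2Offset;
    std::string allGeneData;
    std::string gene2Gi;
};

// Derive the file paths from a directory.
// The file names are placeholders (assumptions), not the toolkit's real names.
GeneInfoPaths BuildPaths(std::string dir) {
    if (!dir.empty() && dir.back() != '/') {
        dir += '/';
    }
    return GeneInfoPaths{
        dir + "gi2gene.bin",
        dir + "gene2offset.bin",
        dir + "gi2offset.bin",
        dir + "all_gene_data.bin",
        dir + "gene2gi.bin",
    };
}

// Read the directory from the named environment variable.
// Returns std::nullopt if the variable is unset or empty.
std::optional<GeneInfoPaths> PathsFromEnv(const char* varName) {
    const char* dir = std::getenv(varName);
    if (dir == nullptr || *dir == '\0') {
        return std::nullopt;
    }
    return BuildPaths(dir);
}
```

A caller would check the returned optional before attempting to open and map the files, so that a missing variable is reported as a configuration problem rather than as a file-open failure.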
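CGeneInfoFileReader::~CGeneInfoFileReader ()
virtual
Destructor.
bool CGeneInfoFileReader::GetGeneIdsForGi (TGi gi, TGeneIdList &geneIdList)
virtual
GetGeneIdsForGi implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::GetGeneInfoForGi (TGi gi, TGeneInfoList &infoList)
virtual
GetGeneInfoForGi implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::GetGeneInfoForId (int geneId, TGeneInfoList &infoList)
virtual
GetGeneInfoForId implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::GetGenomicGisForGeneId (int geneId, TGiList &giList)
virtual
GetGenomicGisForGeneId implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::GetProteinGisForGeneId (int geneId, TGiList &giList)
virtual
GetProteinGisForGeneId implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::GetRNAGisForGeneId (int geneId, TGiList &giList)
virtual
GetRNAGisForGeneId implementation, see IGeneInfoInput.
Implements IGeneInfoInput.
bool CGeneInfoFileReader::x_GeneIdToGi (int geneId, int iGiField, list< TGi > &listGis)
private
Fill the Gi list given a Gene ID, and the Gi field index, which represents the Gi type to be read from the file.
bool CGeneInfoFileReader::x_GeneIdToOffset (int geneId, int &nOffset)
private
Set the offset value given a Gene ID.
bool CGeneInfoFileReader::x_GiToGeneId (TGi gi, list< int > &listGeneIds)
private
Fill the Gene ID list given a Gi.
bool CGeneInfoFileReader::x_GiToOffset (TGi gi, list< int > &listOffsets)
private
Set the offset value given a Gi.
void CGeneInfoFileReader::x_MapMemFiles ()
private
Memory-map all the files.
bool CGeneInfoFileReader::x_OffsetToInfo (int nOffset, CRef< CGeneInfo > &info)
private
Read Gene data at the given offset and create the info object.
void CGeneInfoFileReader::x_UnmapMemFiles ()
private
Unmap all the memory-mapped files.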
bool CGeneInfoFileReader::m_bGiToOffsetLookup
private
Perform Gi to Offset lookups directly.
Definition at line 105 of file gene_info_reader.hpp.
CNcbiIfstream CGeneInfoFileReader::m_inAllData
private
Input stream for the Gene data file.
Definition at line 120 of file gene_info_reader.hpp.
TGeneIdToGeneInfoMap CGeneInfoFileReader::m_mapIdToInfo
private
Cached map of looked up Gene Info objects.
Definition at line 123 of file gene_info_reader.hpp.
unique_ptr< CMemoryFile > CGeneInfoFileReader::m_memGene2GiFile
private
Memory-mapped Gene ID to Gi file.
Definition at line 117 of file gene_info_reader.hpp.
unique_ptr< CMemoryFile > CGeneInfoFileReader::m_memGene2OffsetFile
private
Memory-mapped Gene ID to Offset file.
Definition at line 111 of file gene_info_reader.hpp.
unique_ptr< CMemoryFile > CGeneInfoFileReader::m_memGi2GeneFile
private
Memory-mapped Gi to Gene ID file.
Definition at line 108 of file gene_info_reader.hpp.
unique_ptr< CMemoryFile > CGeneInfoFileReader::m_memGi2OffsetFile
private
Memory-mapped Gi to Offset file.
Definition at line 114 of file gene_info_reader.hpp.
string CGeneInfoFileReader::m_strAllGeneDataFile
private
Path to the file containing all the Gene data.
Definition at line 102 of file gene_info_reader.hpp.
string CGeneInfoFileReader::m_strGene2GiFile
private
Path to the Gene ID to Gi file.
Definition at line 99 of file gene_info_reader.hpp.
string CGeneInfoFileReader::m_strGene2OffsetFile
private
Path to the Gene ID to Offset file.
Definition at line 93 of file gene_info_reader.hpp.
string CGeneInfoFileReader::m_strGi2GeneFile
private
Path to the Gi to Gene ID file.
Definition at line 90 of file gene_info_reader.hpp.
string CGeneInfoFileReader::m_strGi2OffsetFile
private
Path to the Gi to Offset file.
Definition at line 96 of file gene_info_reader.hpp.