NCBI C++ ToolKit
data_histogram.hpp
Go to the documentation of this file.

Go to the SVN repository for this file.

1 #ifndef DATA_HISTOGRAM__HPP
2 #define DATA_HISTOGRAM__HPP
3 
4 /* $Id: data_histogram.hpp 96034 2022-01-31 15:51:09Z ivanov $
5  * ===========================================================================
6  *
7  * PUBLIC DOMAIN NOTICE
8  * National Center for Biotechnology Information
9  *
10  * This software/database is a "United States Government Work" under the
11  * terms of the United States Copyright Act. It was written as part of
12  * the author's official duties as a United States Government employee and
13  * thus cannot be copyrighted. This software/database is freely available
14  * to the public for use. The National Library of Medicine and the U.S.
15  * Government have not placed any restriction on its use or reproduction.
16  *
17  * Although all reasonable efforts have been taken to ensure the accuracy
18  * and reliability of the software and data, the NLM and the U.S.
19  * Government do not and cannot warrant the performance or results that
20  * may be obtained by using this software or data. The NLM and the U.S.
21  * Government disclaim all warranties, express or implied, including
22  * warranties of performance, merchantability or fitness for any particular
23  * purpose.
24  *
25  * Please cite the author in any work or product based on this material.
26  *
27  * ===========================================================================
28  *
29  * Author: Vladimir Ivanov
30  *
31  */
32 
33 /// @file data_histogram.hpp
34 /// Frequency histogram for data distribution of the numerical samples.
35 
36 #include <corelib/ncbistd.hpp>
37 #include <corelib/ncbistl.hpp>
38 
39 #define _USE_MATH_DEFINES // to define math constants in math.h
40 #include <math.h> // log/pow functions
41 #include <cmath> // additional overloads for pow()
42 #include <memory>
43 #include <mutex>
44 
45 
46 /** @addtogroup Statistics
47  *
48  * @{
49  */
50 
52 
53 
54 
55 /////////////////////////////////////////////////////////////////////////////
56 ///
57 /// Helper types for CHistogram<>::GetSum() support
58 //
59 
60 // if T doesn't have 'foo' method (operator+= in our case) with the signature
61 // that allows to compile the bellow expression then instantiating this template
62 // is Substitution Failure (SF) which Is Not An Error (INAE) if this happens
63 // during overload resolution
64 template <typename T>
65 using T_HistogramValueTypeHavePlus = decltype((T&)(std::declval<T>().operator+=(std::declval<const T&>())));
66 
67 // Checks that T is arithmetic type or have operator+= implemented
68 template <typename T>
70 {
72 }
73 
74 
75 /////////////////////////////////////////////////////////////////////////////
76 ///
77 /// CHistogram -- collect the distribution of the numerical data samples.
78 ///
79 /// CHistogram <TValue, TScale, TCounter>
80 ///
81 /// TValue - type of values that distribution will be collected.
82 /// This can be any numerical or user defined type, if it allow
83 /// comparison and can be converted to scale type TScale, see note.
84 ///
85 /// TScale - numerical type used to calculate bins starting positions and sizes.
86 /// For integer types some scales, logarithmic for examples, will lead
87 /// to truncating values and creating unequal bins, so TScale
88 /// should be float/double based.
89 ///
90 /// TCounter - type to store counters. Usually this is a positive integer type:
91 /// int, unsigned int, long, size_t, Uint8 and etc.
92 ///
93 /// @note
94 /// TValue and TScale types should be equal, or allow comparison and conversion
95 /// between them. Any types can be used as TValue if it have:
96 /// = default constructor: TValue() -- for initialization;
97 /// - operator TScale() const -- to convert to scale type TScale;
98 /// - bool operator >(const TValue&) -- for comparison.
99 ///
100 /// @warning
101 /// CHistogram is not MT safe by default, it is intended to be created
102 /// and collect data withing the same thread. If you want to use the same
103 /// CHistogram class object across multiple threads, you may need to provide
104 /// some sort of MT protection to guard access to the histogram object,
105 /// or enable internal MT protection, and call CHistogram::EnableMT()
106 /// immediately after creating a histogram object. It is turned OFF by default.
107 /// But note, internal MT protection cannnot protect histogram's data if you
108 /// use methods that affects 2 histograms at the same time, like
109 /// AddCountersFrom()/StealCountersFrom(). You still may need to add some
110 /// additional MT ptotections, if both histograms can be accessed from
111 /// a different threads simultaneously, or this can lead to a deadlock.
112 
113 template <typename TValue = int, typename TScale = TValue, typename TCounter = Uint8>
115 {
116 public:
117  /// Scale type.
118  /// Most common is a linear scale, where each bin have the same size.
119  /// It is good if you have a limited range, but sometimes values are distributed
120  /// very wide, and if you want to count all of them, but concentrate on a
121  /// some range with most significant values, logarithmic scales helps a lot.
122  /// For logarithmic scales each next bin have a greater size.
123  /// Bins size increasing depends on a logarithmic base (used scale type).
124  ///
125  enum EScaleType {
126  eLinear = 1, ///< Arithmetic or linear scale
127  eLog, ///< Natural logarithmic scale with a base e ~ 2.72
128  eLog2, ///< Binary logarithmic scale with a base 2
129  eLog10 ///< Common logarithmic scale with a base 10
130  };
131 
132  /// Methods to build bins for a specified scale.
133  /// Applies to the main scale, defined in constructor, only.
134  enum EScaleView {
135  /// Use specified scale method to calculate bins sizes from a minimum
136  /// to a maximum value.
138  /// Determine a mean for a specified value range and calculates
139  /// bins sizes using specified scale to both sides from it, symmetrically.
141  };
142 
143 
144  /////////////////////////////////////////////////////////////////////////
145  // Initialization
146 
147  /// Constructor.
148  ///
149  /// @param min_value
150  /// Minimum allowed value in range [min_value, max_value].
151  /// @param max_value
152  /// Maximum allowed value in range [min_value, max_value].
153  /// @param n_bins
154  /// Number of bins for the histogram (main scale).
155  /// @param scale_type
156  /// Predefined scale type. Corresponding scale function will be used
157  /// to calculate scale's bin ranges.
158  /// @param scale_view
159  /// Scale view (symmetrical or monotonic).
160  /// @sa EScaleType, EScaleView, AddLeftScale, AddRightScale, Add
161  ///
162  CHistogram(TValue min_value, TValue max_value, unsigned n_bins,
163  EScaleType scale_type = eLinear,
164  EScaleView scale_view = eMonotonic);
165 
166  /// Add auxiliary left/right scales.
167  ///
168  /// The CHistogram constructor defines a main scale for some data range.
169  /// It can be very limited to get more precise results for the value distribution.
170  /// You can get the number of hits whose values fall outside the range of the main
171  /// scale using GetLowerAnomalyCount()/GetUpperAnomalyCount() methods,
172  /// but each returns a single number of hits outside of the range.
173  /// The other method is to use auxiliary scale(s) to count the number of hits
174  /// for the less significant range(s). You can add any number of scales from each side.
175  /// So, you can use linear scale with much greater bin sizes, or logarithmic scales,
176  /// that allow to cover a very wide range of data with a limited number of bins.
177  /// See EScaleType for details.
178  ///
179  /// @param min_value
180  /// Minimum allowed value for the auxiliary scale.
181  /// Its maximum value is the same as a minimum value for the main scale,
182  /// or previously added scale.
183  /// @param n_bins
184  /// Number of bins for the auxiliary scale.
185  /// @param scale_type
186  /// Predefined scale type. Corresponding scale function will be used
187  /// to calculate scale's bin ranges.
188  /// @note
189  /// It is not allowed to add left/right auxiliary scales after starting counting hits.
190  /// Please use these methods before Add(), or after Reset().
191  /// @sa CHistogram::CHistogram, AddRightScale, EScaleType, GetLowerAnomalyCount, GetUpperAnomalyCount
192  ///
193  void AddLeftScale (TValue min_value, unsigned n_bins, EScaleType scale_type);
194  void AddRightScale(TValue max_value, unsigned n_bins, EScaleType scale_type);
195 
196 
197  /////////////////////////////////////////////////////////////////////////
198  // Populating
199 
200  /// Sum type: double for all floating points TValue types, int64_t/uint64_t for integral, and TValue otherwise.
201  using TIntegral = typename std::conditional<std::numeric_limits <TValue>::is_signed, int64_t, uint64_t >::type;
204 
205  /// Add value to the data distribution.
206  /// Try to find an appropriate bin for a specified value and increase its counter on 1.
207  /// Also, calculates a sum of all added values if TValue type have addition support.
208  /// @sa GetStarts, GetCounters, GetSum
209  ///
210  template <typename V, typename S = TSum,
211  std::enable_if_t<g_HistogramValueTypeHavePlus<S>(), int> = 0>
212  void Add(const V& v) {
213  MT_Lock();
214  x_Add(v);
215  // Calculate sum, TSum have operator+=
216  m_Sum += v;
217  MT_Unlock();
218  }
219  template <typename V, typename S = TSum,
220  std::enable_if_t<!g_HistogramValueTypeHavePlus<S>(), int> = 0>
221  void Add(const V& v) {
222  MT_Lock();
223  x_Add(v);
224  MT_Unlock();
225  }
226 
227  // Return value for a class member, MT safe
228  #define RETURN_MT_SAFE(member) \
229  if (m_IsMT) { \
230  MT_Lock(); \
231  auto v = member; \
232  MT_Unlock(); \
233  return v; \
234  } \
235  return member
236 
237  /// Return the sum of all added values.
238  /// @return
239  /// Returned type depends on TValue type, and converts to:
240  /// - double -- for all floating point types;
241  /// - int64_t -- signed integral types;
242  /// - uint64_t -- unsigned integral types;
243  /// - TValue -- for all other types.
244  /// The sum can be calculated for floating, integral types and all other
245  /// TValue types that have 'operator +=' defined.
246  /// If TValue type doesn't have such operator defined, empty value {} is returned.
247  /// Note, compiler can convert this empty {} value to TScale type, because every TValue
248  /// should have 'operator TScale() const' by design.
249  /// @note
250  /// This method doesn't check calculated sum on overflow.
251  /// @sa Add, TSum
252  TSum GetSum(void) const { RETURN_MT_SAFE(m_Sum); }
253 
254  /// Reset all data counters.
255  void Reset();
256 
257  /// Add counters from 'other' histogram to this histogram, 'other' doesn't changes.
258  /// @note Both histograms should have the same structure.
259  /// @sa Clone, Reset, StealCountersFrom
260  void AddCountersFrom(const CHistogram& other);
261 
262  /// Add counters from 'other' histogram to this histogram,
263  /// then reset the counters of 'other' histogram.
264  /// @note Both histograms should have the same structure.
265  /// @sa Clone, Reset, AddCountersFrom
267 
268 
269  /////////////////////////////////////////////////////////////////////////
270  // Structure
271 
272  /// Get the lower bound of the combined scale.
273  TValue GetMin() const { return m_Min; }
274 
275  /// Get the upper bound of the combined scale.
276  TValue GetMax() const { return m_Max; }
277 
278  /// Return the number ot bins on the combined scale.
279  /// @sa GetBinStarts, GetBinCounters, GetBinCountersPtr
280  unsigned GetNumberOfBins() const { return m_NumBins; }
281 
282  /// Get starting positions for bins on the combined scale.
283  ///
284  /// Populate a vector with starting positions for bins on the combined scale.
285  /// The size of the vector changes to be equal the number of bins.
286  /// @sa GetNumberOfBins, GetBinCounters, GetBinStartsPtr
287  void GetBinStarts(vector<TScale>& positions) {
288  MT_Lock();
289  positions.resize(m_NumBins);
290  positions.assign(m_Starts.get(), m_Starts.get() + m_NumBins);
291  MT_Unlock();
292  }
293 
294  /// Get starting positions for bins on the combined scale (not MT safe).
295  ///
296  /// Returns a pointer to array. The number of bins can be obtained with GetNumberOfBins().
297  /// and a number of a counters in each bin -- with GetBinCounters().
298  /// @warning
299  /// Be aware that any change in the histogram's structure, adding additional scales,
300  /// invalidate all data and pointer itself. This method is not intended to use in MT
301  /// environment. Please use vector version GetBinStarts(vector<TScale>&).
302  /// @sa GetNumberOfBins, GetBinCountersPtr, GetBinStarts, EnableMT
303  const TScale* GetBinStartsPtr() const { return m_Starts.get(); }
304 
305 
306  /////////////////////////////////////////////////////////////////////////
307  // Results
308 
309  /// Get counters for the combined scale's bins.
310  ///
311  /// Populate a vector with counters for the combined scale's bins.
312  /// The size of the vector changes to be equal to the number of bins.
313  /// @sa GetNumberOfBins, GetBinStarts, GetBinCountersPtr, Add
314  void GetBinCounters(vector<TCounter>& counters) {
315  MT_Lock();
316  counters.resize(m_NumBins);
317  counters.assign(m_Counters.get(), m_Counters.get() + m_NumBins);
318  MT_Unlock();
319  }
320 
321  /// Get counters for the combined scale's bins (not MT safe).
322  ///
323  /// Returns a pointer to array. The number of bins can be obtained with GetNumberOfBins().
324  /// @warning
325  /// Be aware that any change in the histogram's structure invalidate all data
326  /// and pointer itself. This method is not intended to use in MT environment.
327  /// Please use vector version GetBinCounters(vector<TCounter>&).
328  /// @sa GetNumberOfBins, GetBinStartsPtr, GetBinCounters, Add, EnableMT
329  const TCounter* GetBinCountersPtr() const { return m_Counters.get(); }
330 
331  /// Get total number of hits whose value fell between GetMin() and GetMax().
332  /// The number of hits whose values fall outside that range can be obtained
333  /// using GetLowerAnomalyCount() and GetUpperAnomalyCount() methods.
334  /// @sa GetMin, GetMax, GetLowerAnomalyCount, GetUpperAnomalyCount, GetBinCounters
335  TCounter GetCount() const { RETURN_MT_SAFE(m_Count); }
336 
337  /// Get number of hits whose values were less than GetMin().
338  /// @sa GetUpperAnomalyCount, GetCount
340 
341  /// Get number of hits whose values were greater than GetMax().
342  /// @sa GetLowerAnomalyCount, GetCount
344 
345 
346  /////////////////////////////////////////////////////////////////////////
347  // Helpers
348 
349  /// Rules to calculate an estimated numbers of bins on the base of the expected
350  /// number of observations.
351  /// Note, that there is no "best" number of bins, and different bin sizes can reveal
352  /// different features of the data. All these methods generally make strong assumptions
353  /// about the shape of the distribution. Depending on the actual data distribution
354  /// and the goals of the analysis, different bin widths may be appropriate,
355  /// so experimentation is usually needed to determine an appropriate width.
356  /// @note
357  /// All methods applies to the linear scale only, that have a fixed bin sizes.
358  /// @sa
359  /// EstimateNumberOfBins
360  ///
362  /// Square root rule. Used by Excel histograms and many others.
363  /// Default value for EstimateNumberOfBins() as well.
365  /// Juran's "Quality Control Handbook" that provide guidelines
366  /// to select the number of bins for histograms.
368  /// Herbert Sturge's rule. It works best for continuous data that is
369  /// normally distributed and symmetrical. As long as your data is not skewed,
370  /// using Sturge's rule should give you a nice-looking, easy to read
371  /// histogram that represents the data well.
373  /// Rice's rule. Presented as a simple alternative to Sturge's rule.
374  eRice
375  };
376 
377  /// Estimate numbers of bins on the base of the expected number of observations 'n'.
378  /// @note
379  /// Applies to the linear scale only.
380  /// @sa
381  /// EEstimateNumberOfBinsRules
382  static unsigned EstimateNumberOfBins(size_t n, EEstimateNumberOfBinsRule rule = 0);
383 
384 
385  /////////////////////////////////////////////////////////////////////////
386  // Move semantics
387 
388  /// Default constructor.
389  ///
390  /// Creates empty object. The object itself is invalid without min/max values,
391  /// and any scale. Should be used with conjunction of Clone() only.
392  /// @sa Clone
393  ///
394  CHistogram(void);
395 
396  /// Move constructor.
397  ///
398  /// Move 'other' histogram data to the current object. 'other' became invalid.
399  /// @example
400  /// CHistogram<> h1(min, max, n, type);
401  /// CHistogram<> h2(h1.Clone(how));
402  /// @sa Clone
403  ///
405 
406  /// Move assignment operator.
407  ///
408  /// Move 'other' histogram data to the current object. 'other' became invalid.
409  /// @example
410  /// CHistogram<> h1(min, max, n, type);
411  /// CHistogram<> h2;
412  /// h2 = h1.Clone(how);
413  /// @sa Clone
414  ///
416 
417  enum EClone {
418  eCloneAll = 0, ///< Clone whole histogram, with scale and counters
419  eCloneStructureOnly ///< Clone structure only (the counters will be zeroed)
420  };
421 
422  /// Clone histogram structure.
423  ///
424  /// Creates a copy of the histogram structure depending on clone method.
425  /// By default it is a whole copy, but you can clone histogram's structure only, without counters.
426  ///
428 
429 
430  /////////////////////////////////////////////////////////////////////////
431  // Optional MT protection
432 
433  /// Add MT protection to histogram.
434  ///
435  /// By default CHistogram is not MT protected and intended to be created
436  /// and used withing the same thread. Calling this method add MT protection
437  /// on adding additional scales, adding and resetting values, cloning and
438  /// moving histograms. Please call this method immediately after creating
439  /// each histogram, before accessing it from any other thread, or deadlock can occurs.
440  ///
441  void EnableMT(void) {
442  m_IsMT = true;
443  };
444  /// MT locking
445  void MT_Lock() const {
446  if (m_IsMT) m_Mutex.lock();
447  }
448  // MT unlocking
449  void MT_Unlock() const {
450  if (m_IsMT) m_Mutex.unlock();
451  }
452 
453 protected:
454  // Note, none of x_*() methods have MT locking, it should be done in public methods only.
455 
456  /// Calculate bins starting positions.
457  /// Calculate positions for 'n' number of bins in [start,end] value range,
458  /// starting from 'pos' bin index.
459  /// If start_value > end_value, calculating going from the right to left.
460  /// This is applicable for symmetrical scales, or multi-scales,
461  /// with left scale added.
462  void x_CalculateBins(TScale start_value, TScale end_value,
463  unsigned pos, unsigned n,
464  EScaleType scale_type, EScaleView scale_view);
465 
466  /// Calculate bins starting positions for a linear scale.
467  /// Account for an integer TScale type and calculation truncation.
469  TScale start_value, TScale end_value, TScale step,
470  TScale* arr, unsigned pos, unsigned n,
471  EScaleView scale_view);
472 
473  /// Calculate bins starting positions for a logarithmic scale.
475  TScale start_value, TScale end_value,
476  TScale* arr, unsigned pos, unsigned n,
477  EScaleType scale_type, EScaleView scale_view);
478 
479  /// Scale function.
480  /// Calculates scale position on a base of data value.
481  TScale x_Func(EScaleType scale_type, TScale value);
482 
483  /// Inverse scale function.
484  /// Calculates a data value on a base of scale value/position.
485  TScale x_FuncInverse(EScaleType scale_type, TScale scale_value);
486 
487  /// Add value to the data distribution (internal version without locking).
488  void x_Add(TValue value);
489 
490  /// Add value to the data distribution using a linear search method.
491  /// Usually faster than bisection method on a small number of bins,
492  /// or if the values have a tendency to fall into the starting bins
493  /// for a used scale.
494  /// @sa Add, x_AddBisection
495  void x_AddLinear(TValue value);
496 
497  /// Add value to the data distribution using a bisection search method.
498  /// Usually faster on a long scales with a lot of bins.
499  /// @sa Add, x_AddLinear
500  void x_AddBisection(TValue value);
501 
502  /// Move data from 'other' histogram. 'other' became invalid.
503  void x_MoveFrom(CHistogram& other);
504 
505  /// Add counters from 'other' histogram.
507 
508  /// Check that 'a' and 'b' scale values are equal (or almost equal for floating scales).
509  bool x_IsEqual(TScale a, TScale b);
510 
511  /// Reset all data counters (internal version without locking).
512  void x_Reset();
513 
514  /// Prevent copying.
515  /// See move constructor and move assignment operator to use with move-semantics.
518 
519 protected:
520  TValue m_Min; ///< Minimum value (the lower bound of combined scale)
521  TValue m_Max; ///< Maximum value (the upper bound of combined scale)
522  unsigned m_NumBins; ///< Number of bins (m_Starts[]/m_Counts[] length)
523  TSum m_Sum = {}; ///< Sum of the all added values (if applicable for TValue)
524 
525  std::unique_ptr<TScale[]> m_Starts; ///< Combined scale: starting bins positions
526  std::unique_ptr<TCounter[]> m_Counters; ///< Combined scale: counters - the number of measurements for each bin
527 
528  TCounter m_Count; ///< Number of counted values (sum all m_Counters[])
529  TCounter m_LowerAnomalyCount; ///< Number of anomaly values < m_Min
530  TCounter m_UpperAnomalyCount; ///< Number of anomaly values > m_Max
531 
532  bool m_IsMT; ///< MT protection flag
533  mutable std::mutex m_Mutex; ///< MT protection mutex
534 };
535 
536 
537 
538 /// A series of same-structured histograms covering logarithmically (base 2)
539 /// increasing time periods... roughly
540 ///
541 template <typename TValue, typename TScale, typename TCounter>
543 {
544 public:
545  /// @param model_histogram
546  /// This histogram will be used as a template for all histogram objects
547  /// in this time series.
549  CHistogramTimeSeries(THistogram& model_histogram);
550 
551  /// Add value to the data distribution.
552  /// Try to find an appropriate bin for a specified value and increment its counter by 1.
553  /// @sa CHistogram::Add()
554  void Add(TValue value);
555 
556  /// Merge the most recent (now active) histogram data into the time series.
557  /// This method is generally supposed to be called periodically, thus
558  /// defining the unit of time (tick) as far as this API is concerned.
559  void Rotate();
560 
561  /// Reset to the initial state
562  void Reset();
563 
564  /// Type of the unit of time
565  using TTicks = unsigned int;
566 
567  /// A histograms which covers a certain number of ticks
568  struct STimeBin {
569  // Copy constructor and assignment operator should be passed with 'const'
570  // qualifier to operate on list<STimeBin>. But Clone() is not 'const'
571  // due optional protection, so we use const_cast<> here. It is safe,
572  // because Clone() change internal mutex only.
573  STimeBin(const STimeBin& other) :
574  histogram(other.histogram.Clone(THistogram::eCloneAll)),
575  n_ticks(other.n_ticks)
576  {}
577  STimeBin& operator=(const STimeBin& other)
578  {
580  n_ticks = other.n_ticks;
581  }
583  histogram(std::move(h)), n_ticks(t)
584  {}
585 
586  THistogram histogram; ///< Histogram for the ticks
587  TTicks n_ticks; ///< Number of ticks in this histogram
588  };
589  /// Type of the series of histograms
590  using TTimeBins = list<STimeBin>;
591 
592  /// Histograms -- in the order from the most recent to the least recent
593  TTimeBins GetHistograms() const;
594 
595  /// Number of ticks the histogram series has handled.
596  /// Initially the number of ticks is zero.
597  TTicks GetCurrentTick(void) const { return m_CurrentTick; }
598 
599 private:
600  void x_AppendBin(const THistogram& model_histogram, TTicks n_ticks);
601  void x_Shift(size_t index, typename TTimeBins::iterator current_it);
602 
603 private:
605  mutable std::mutex m_Mutex; // for Rotate() and GetHistograms()
607 };
608 
609 
610 /* @} */
611 
612 
613 //////////////////////////////////////////////////////////////////////////////
614 //
615 // Inline
616 //
617 
618 template <typename TValue, typename TScale, typename TCounter>
620 (
621  TValue min_value,
622  TValue max_value,
623  unsigned n_bins,
624  EScaleType scale_type,
625  EScaleView scale_view
626 )
627  : m_Min(min_value), m_Max(max_value), m_NumBins(n_bins), m_IsMT(false)
628 {
629  if ( m_Min > m_Max ) {
630  NCBI_THROW(CCoreException, eInvalidArg, "Minimum value cannot exceed maximum value");
631  }
632  if ( !m_NumBins ) {
633  NCBI_THROW(CCoreException, eInvalidArg, "Number of bins cannot be zero");
634  }
635  // Allocate memory
636  m_Starts.reset(new TScale[m_NumBins]);
637  m_Counters.reset(new TCounter[m_NumBins]);
638 
639  x_CalculateBins(m_Min, m_Max, 0, m_NumBins, scale_type, scale_view);
640  // Reset counters
641  x_Reset();
642 }
643 
644 
645 template <typename TValue, typename TScale, typename TCounter>
646 void
648  TValue min_value, unsigned n_bins, EScaleType scale_type)
649 {
650  MT_Lock();
651 
652  if ( min_value >= m_Min ) {
653  MT_Unlock();
654  NCBI_THROW(CCoreException, eInvalidArg, "New minimum value cannot exceed minimum value for the histogram");
655  }
656  if ( !n_bins ) {
657  MT_Unlock();
658  NCBI_THROW(CCoreException, eInvalidArg, "Number of bins cannot be zero");
659  }
661  MT_Unlock();
662  NCBI_THROW(CCoreException, eInvalidArg, "Please call AddLeftScale() before Add()");
663  }
664  unsigned n_prev = m_NumBins;
665  m_NumBins += n_bins;
666 
667  // Reallocate memory for starting positions and counters
668 
669  std::unique_ptr<TScale[]> tmp_starts(new TScale[m_NumBins]);
670  memcpy(tmp_starts.get() + n_bins, m_Starts.get(), sizeof(TScale) * n_prev);
671  m_Starts.swap(tmp_starts);
672  m_Counters.reset(new TCounter[m_NumBins]);
673  memset(m_Counters.get(), 0, m_NumBins * sizeof(TCounter));
674 
675  // Calculate scale for newly added bins: from right to left
676  x_CalculateBins(m_Min, min_value, n_bins-1, n_bins, scale_type, eMonotonic);
677  m_Min = min_value;
678 
679  MT_Unlock();
680 }
681 
682 
683 template <typename TValue, typename TScale, typename TCounter>
684 void
686  TValue max_value, unsigned n_bins, EScaleType scale_type)
687 {
688  MT_Lock();
689 
690  if ( max_value <= m_Max ) {
691  MT_Unlock();
692  NCBI_THROW(CCoreException, eInvalidArg, "New maximum value cannot be less than a maximum value for the histogram");
693  }
694  if ( !n_bins ) {
695  MT_Unlock();
696  NCBI_THROW(CCoreException, eInvalidArg, "Number of bins cannot be zero");
697  }
699  MT_Unlock();
700  NCBI_THROW(CCoreException, eInvalidArg, "Please call AddRightScale() before Add()");
701  }
702  unsigned n_prev = m_NumBins;
703  m_NumBins += n_bins;
704 
705  // Reallocate memory for starting positions and counters
706 
707  std::unique_ptr<TScale[]> tmp_starts(new TScale[m_NumBins]);
708  memcpy(tmp_starts.get(), m_Starts.get(), sizeof(TScale) * n_prev);
709  m_Starts.swap(tmp_starts);
710  m_Counters.reset(new TCounter[m_NumBins]);
711  memset(m_Counters.get(), 0, m_NumBins * sizeof(TCounter));
712 
713  // Calculate scale for newly added bins: from left to right
714  x_CalculateBins(m_Max, max_value, n_prev, n_bins, scale_type, eMonotonic);
715  m_Max = max_value;
716 
717  MT_Unlock();
718 }
719 
720 
721 template <typename TValue, typename TScale, typename TCounter>
722 unsigned
723 CHistogram<TValue, TScale, TCounter>::EstimateNumberOfBins(size_t n, EEstimateNumberOfBinsRule rule)
724 {
725  switch (rule) {
726  case eSquareRoot:
727  return (unsigned) ceil(sqrt(n));
728  case eJuran:
729  // It have no info for n < 20, but 5 is a reasonable minimum
730  if (n < 20) return 5;
731  if (n <= 50) return 6;
732  if (n <= 100) return 7;
733  if (n <= 200) return 8;
734  if (n <= 500) return 9;
735  if (n <= 1000) return 10;
736  return 11; // handbook: 11-20 for 1000+ observations
737  case eSturge:
738  return 1 + (unsigned) ceil(log2(n));
739  case eRice:
740  return (unsigned) ceil(pow(n, double(1)/3));
741  }
742  _TROUBLE;
743  // unreachable, just to avoid warning
744  return 0;
745 }
746 
747 
748 template <typename TValue, typename TScale, typename TCounter>
749 void
751 {
752  MT_Lock();
753  x_Reset();
754  MT_Unlock();
755 }
756 
757 
758 template <typename TValue, typename TScale, typename TCounter>
759 void
761 {
762  // Reset counters
764  // Reset bins
765  memset(m_Counters.get(), 0, m_NumBins * sizeof(TCounter));
766  // Reset sum (it can be any type, so use default constructor)
767  m_Sum = {};
768 }
769 
770 
771 template <typename TValue, typename TScale, typename TCounter>
772 void
774 {
775  if (value < m_Min) {
777  return;
778  }
779  if (value > m_Max) {
781  return;
782  }
783  if (m_NumBins < 50) {
785  return;
786  }
788 }
789 
790 
791 template <typename TValue, typename TScale, typename TCounter>
792 void
794 {
795  size_t i = 1;
796  while (i < m_NumBins) {
797  if (value < m_Starts[i]) {
798  break;
799  }
800  i++;
801  }
802  // The current bin with index 'i' have a greater value, so put value into the previous bin
803  m_Counters[i-1]++;
804  m_Count++;
805 }
806 
807 
808 template <typename TValue, typename TScale, typename TCounter>
809 void
811 {
812  // Bisection search
813 
814  size_t left = 0, right = m_NumBins;
815  size_t i = 0, d;
816 
817  while (d = (right - left), d > 1) {
818  i = left + d / 2;
819  if (value < m_Starts[i]) {
820  right = i;
821  } else {
822  left = i;
823  }
824  }
825  // Algorithm can finish on the current or next bin, depending on even/odd number of bins
826  if (value < m_Starts[i]) {
827  i--;
828  }
829  m_Counters[i]++;
830  m_Count++;
831 }
832 
833 
834 template <typename TValue, typename TScale, typename TCounter>
835 void
837 (
838  TScale start_value,
839  TScale end_value,
840  unsigned pos,
841  unsigned n,
842  EScaleType scale_type,
843  EScaleView scale_view
844 )
845 {
846  const char* errmsg_step = "Impossible to calculate scale step, please change TScale type, range, or number of bins";
847  const char* errmsg_dup = "Impossible to calculate scales bin starting position, please change TScale type, range, or number of bins";
848 
849  TScale* arr = m_Starts.get();
850 
851  if (scale_type == eLinear) {
852  // Special processing for linear scales to account for
853  // an integer TScale type and calculation truncation.
854  TScale step = (max(start_value, end_value) - min(start_value, end_value)) / n;
855  if ( step == 0 ) {
856  NCBI_THROW(CCoreException, eCore, errmsg_step);
857  }
858  x_CalculateBinsLinear(start_value, end_value, step, arr, pos, n, scale_view);
859  return;
860  }
861 
862  // Log scales only
863  _ASSERT( scale_type == eLog ||
864  scale_type == eLog2 ||
865  scale_type == eLog10 );
866 
867  x_CalculateBinsLog(start_value, end_value, arr, pos, n, scale_type, scale_view);
868 
869  // Check that we don't have bins with the same starting positions.
870  // This can happens mostly if TScale is an integer type due truncation.
871  // Linear scale doesn't need this, because we check 'step' instead (see above).
872  // Checking whole combined scale.
873  //
874  for (unsigned i = 1; i < m_NumBins; i++) {
875  if ( arr[i] <= arr[i-1] ) {
876  NCBI_THROW(CCoreException, eCore, errmsg_dup);
877  }
878  }
879  return;
880 }
881 
882 
883 template <typename TValue, typename TScale, typename TCounter>
884 void
886 (
887  TScale start_value,
888  TScale end_value,
889  TScale step,
890  TScale* arr,
891  unsigned pos,
892  unsigned n,
893  EScaleView scale_view
894 )
895 {
896  if (scale_view == eSymmetrical) {
897  // Account for an integer TScale type and calculation truncation.
898  // If TScale type doesn't allow integer division to the specified
899  // number of bins without residue, some bins will be larger than others.
900  // For monotonic view this will be always a very last bin at the end of
901  // the scale, but for the symmetrical view it depends on parity of 'n'.
902  // For even number of bins most left and right bins can be larger.
903  // For odd number of bins the center bin can be larger than other.
904 
905  if (n % 2 == 0) {
906  // Calculate from the center to both sides with step 'step'
907  TScale median = start_value + (end_value - start_value)/2;
908  x_CalculateBinsLinear(median, start_value, step, arr, pos + n/2 - 1, n/2, eMonotonic);
909  x_CalculateBinsLinear(median, end_value, step, arr, pos + n/2, n/2, eMonotonic);
910  // We already know value for the most left bin, assign it
911  // implicitly for such symmetrical scales to have bigger
912  // starting bin. We got bigger ending bin automatically.
913  arr[pos] = start_value;
914 
915  } else {
916  // Calculate from both sides to the center with step 'step'
917  // 2nd parameter doesn't matter, it just show direction.
918  x_CalculateBinsLinear(start_value, end_value, step, arr, pos, n/2 + 1, eMonotonic);
919  x_CalculateBinsLinear(end_value, start_value, step, arr, n-1, n/2, eMonotonic);
920  }
921  return;
922  }
923 
924  // Calculate from left to right
925  if (start_value <= end_value) {
926  for (unsigned i = 0; i < n; i++) {
927  arr[pos+i] = start_value + step*i;
928  }
929  }
930  // Calculate from right to left
931  else {
932  for (unsigned i = 1; i <= n; i++) {
933  arr[pos+1-i] = start_value - step*i;
934  }
935  }
936 }
937 
938 
939 template <typename TValue, typename TScale, typename TCounter>
940 void
942 (
943  TScale start_value,
944  TScale end_value,
945  TScale* arr,
946  unsigned pos,
947  unsigned n,
948  EScaleType scale_type,
949  EScaleView scale_view
950 )
951 {
952  if (scale_view == eSymmetrical) {
953  // Account for an integer TScale type and calculation truncation.
954  // If TScale type doesn't allow integer division to the specified
955  // number of bins without residue, some bins will be larger than others.
956  // For monotonic view this will be always a very last bin at the end of
957  // the scale, but for the symmetrical view it depends on parity of 'n'.
958  // For even number of bins most left and right bins can be larger.
959  // For odd number of bins the center bin can be larger than other.
960 
961  TScale median = start_value + (end_value - start_value)/2;
962  // Set most left bin directly, we already know its value
963  arr[pos] = start_value;
964  unsigned n2 = n/2;
965 
966  if (n % 2 == 0) {
967  // Calculate from the center to both sides
968  x_CalculateBinsLog(median, start_value, arr, pos + n2 - 1, n2, scale_type, eMonotonic);
969  x_CalculateBinsLog(median, end_value, arr, pos + n2 , n2, scale_type, eMonotonic);
970  } else {
971  if (n == 1) {
972  return;
973  }
974  _ASSERT(n > 1);
975 
976  // Median is located in the middle of the center bin.
977  // To calculate starting point for this center bin end its ending point,
978  // by the way also a starting point for the next bin, we double the number of bins
979  // on each side of the median, and calculate values for each half-mark on a scale.
980  // After that copy values for needed half-marks only.
981 
982  std::unique_ptr<TScale[]> tmp(new TScale[n*2]);
983  TScale* tmp_arr = tmp.get();
984  // Calculate from the center to both sides (n % 2 == 0 case)
985  x_CalculateBinsLog(median, start_value, tmp_arr, n - 1, n, scale_type, eMonotonic);
986  x_CalculateBinsLog(median, end_value, tmp_arr, n , n, scale_type, eMonotonic);
987 
988  // For central bin: copy values for 2 half-marks from both sides of the center
989  arr[pos + n2] = tmp_arr[n - 1];
990  arr[pos + n2 + 1] = tmp_arr[n + 1];
991 
992  // For other bins: copy values for every second half-mark from 2 half-marks above
993  for (unsigned i = 0; i < n2; i++) {
994  arr[pos + i] = tmp_arr[i*2];
995  }
996  for (unsigned i=2, j=n+3; i <= n2; i++, j+=2) {
997  arr[pos + n2 + i] = tmp_arr[j];
998  }
999  }
1000  return;
1001  }
1002 
1003  // Logarithm functions cannot operate on a negative values or 0.
1004  // so it will be necessary to transform data range. One of the methods
1005  // is too shift data to start it from 1. This method also helps to normalize
1006  // logarithmic scale, and we will use it for all data, even positive.
1007  // So, we calculate logarithms scale for a range of values [1:D+1],
1008  // where D is a distance between starting and ending points.
1009  // Because x^0 = 1, we need calculate only the maximum scale value log(D+1),
1010  // And every scale mark in between with the step log(D+1)/N.
1011  // Inverse function helps to convert every scale mark back to a bin
1012  // starting position.
1013 
1014  // Calculate from left to right
1015  if (start_value <= end_value) {
1016  TScale step = x_Func(scale_type, end_value - start_value + 1) / n;
1017  for (unsigned i = 1; i < n; i++) {
1018  // Calculate data value for a given scale mark
1019  arr[pos+i] = start_value + x_FuncInverse(scale_type, step*i) - 1;
1020  }
1021  // Set most left bin directly, we already know its value,
1022  // to avoid possible rounding errors.
1023  arr[pos] = start_value;
1024  }
1025  // Calculate from right to left
1026  else {
1027  TScale step = x_Func(scale_type, start_value - end_value + 1) / n;
1028  for (unsigned i = 1; i < n; i++) {
1029  // Calculate data value for a given scale mark
1030  arr[pos+1-i] = start_value - x_FuncInverse(scale_type, step*i) + 1;
1031  }
1032  // Set most left bin directly, we already know its value,
1033  // to avoid possible rounding errors.
1034  arr[pos+1-n] = end_value;
1035  }
1036 }
1037 
1038 
1039 template <typename TValue, typename TScale, typename TCounter>
1040 TScale
1042 {
1043  switch (scale_type) {
1044  case eLog:
1045  return (TScale)log(value);
1046  case eLog2:
1047  return (TScale)log2(value);
1048  case eLog10:
1049  return (TScale)log10(value);
1050  case eLinear:
1051  // It is last, because eLinear has served by x_CalculateBinsLinear() directly
1052  return value;
1053  }
1054  _TROUBLE;
1055  // unreachable, just to avoid warning
1056  return value;
1057 }
1058 
1059 
1060 template <typename TValue, typename TScale, typename TCounter>
1061 TScale
1063 {
1064  switch (scale_type) {
1065  case eLog:
1066  return (TScale)pow(M_E, scale_value);
1067  case eLog2:
1068  return (TScale)pow(2, scale_value);
1069  case eLog10:
1070  return (TScale)pow(10, scale_value);
1071  case eLinear:
1072  // It is last, because eLinear has served by x_CalculateBinsLinear() directly
1073  return scale_value;
1074  }
1075  _TROUBLE;
1076  // unreachable, just to avoid warning
1077  return scale_value;
1078 }
1079 
1080 
1081 // Move semantics
1082 
1083 template <typename TValue, typename TScale, typename TCounter>
1085  : m_NumBins(0), m_IsMT(false)
1086 {
1087  // Reset counters
1088  x_Reset();
1089 }
1090 
1091 
1092 template <typename TValue, typename TScale, typename TCounter>
1094 {
1095  if (this == &other) return;
1096  other.MT_Lock();
1097  x_MoveFrom(other);
1098  other.MT_Unlock();
1099 };
1100 
1101 
1102 template <typename TValue, typename TScale, typename TCounter>
1105 {
1106  if (this == &other) return *this;
1107  other.MT_Lock();
1108  x_MoveFrom(other);
1109  other.MT_Unlock();
1110  return *this;
1111  };
1112 
1113 
1114 template <typename TValue, typename TScale, typename TCounter>
1117 {
1118  MT_Lock();
1119 
1120  CHistogram h;
1121  h.m_Min = m_Min;
1122  h.m_Max = m_Max;
1123  h.m_NumBins = m_NumBins;
1124  h.m_IsMT = m_IsMT;
1125 
1126  h.m_Starts.reset(new TScale[m_NumBins]);
1127  h.m_Counters.reset(new TCounter[m_NumBins]);
1128  memcpy(h.m_Starts.get(), m_Starts.get(), sizeof(TScale) * m_NumBins);
1129 
1130  switch (how) {
1131  case eCloneStructureOnly:
1132  // Reset counters
1133  h.m_Sum = 0;
1134  h.m_Count = 0;
1135  h.m_LowerAnomalyCount = 0;
1136  h.m_UpperAnomalyCount = 0;
1137  memset(h.m_Counters.get(), 0, m_NumBins * sizeof(TCounter));
1138  break;
1139  case eCloneAll:
1140  // Copy counters
1141  h.m_Sum = m_Sum;
1142  h.m_Count = m_Count;
1145  memcpy(h.m_Counters.get(), m_Counters.get(), sizeof(TCounter) * m_NumBins);
1146  break;
1147  default:
1148  MT_Unlock();
1149  _TROUBLE;
1150  }
1151  MT_Unlock();
1152  return h;
1153 }
1154 
1155 
1156 template <typename TValue, typename TScale, typename TCounter>
1157 void
1159 {
1160  m_Min = other.m_Min;
1161  m_Max = other.m_Max;
1162  m_NumBins = other.m_NumBins;
1163  m_Sum = other.m_Sum;
1164  m_Count = other.m_Count;
1167  m_IsMT = other.m_IsMT;
1168 
1169  m_Starts.reset(other.m_Starts.release());
1170  m_Counters.reset(other.m_Counters.release());
1171  other.m_NumBins = 0;
1172  other.m_Sum = 0;
1173 
1174  return;
1175 }
1176 
1177 
1178 template <typename TValue, typename TScale, typename TCounter>
1179 bool
1181 {
1182  if (std::numeric_limits<TScale>::is_integer) {
1183  return a == b;
1184  }
1185  // Approximately check for floating numbers
1186  return std::fabs(a - b) < 0.0001;
1187 }
1188 
1189 
1190 template <typename TValue, typename TScale, typename TCounter>
1191 void
1193 {
1194  if (this == &other) return;
1195 
1196  MT_Lock();
1197  other.MT_Lock();
1198  try {
1199  x_AddCountersFrom(const_cast<CHistogram&>(other));
1200  }
1201  catch (...) {
1202  other.MT_Unlock();
1203  MT_Unlock();
1204  throw;
1205  }
1206  other.MT_Unlock();
1207  MT_Unlock();
1208 }
1209 
1210 
1211 template <typename TValue, typename TScale, typename TCounter>
1212 void
1214 {
1215  if (this == &other) return;
1216 
1217  MT_Lock();
1218  other.MT_Lock();
1219  try {
1220  x_AddCountersFrom(other);
1221  other.x_Reset();
1222  }
1223  catch (...) {
1224  other.MT_Unlock();
1225  MT_Unlock();
1226  throw;
1227  }
1228  other.MT_Unlock();
1229  MT_Unlock();
1230 }
1231 
1232 
1233 template <typename TValue, typename TScale, typename TCounter>
1234 void
1236 {
1237  // Check structure
1238  if ( m_NumBins != other.m_NumBins ||
1239  !x_IsEqual(m_Min, other.m_Min) ||
1240  !x_IsEqual(m_Max, other.m_Max)
1241  ) {
1242  NCBI_THROW(CCoreException, eInvalidArg, "Histograms have different structure");
1243  }
1244  // Check scale
1245  TScale* starts_cur = m_Starts.get();
1246  TScale* starts_other = other.m_Starts.get();
1247  for (unsigned i = 0; i < m_NumBins; i++) {
1248  if (!x_IsEqual(starts_cur[i], starts_other[i])) {
1249  NCBI_THROW(CCoreException, eInvalidArg, "Histograms have different starting positions");
1250  }
1251  }
1252  // Add counters
1253  TCounter* counters_cur = m_Counters.get();
1254  TCounter* counters_other = other.m_Counters.get();
1255  for (unsigned i = 0; i < m_NumBins; i++) {
1256  counters_cur[i] += counters_other[i];
1257  }
1258  m_Count += other.m_Count;
1261  m_Sum += other.m_Sum;
1262 }
1263 
1264 
1265 
1266 //============================================================================
1267 //
1268 // CHistogramTimeSeries
1269 //
1270 //============================================================================
1271 
1272 template <typename TValue, typename TScale, typename TCounter>
1274  : m_CurrentTick(0)
1275 {
1276  // Need to create the first item in the list
1277  x_AppendBin(model_histogram, 1);
1278 }
1279 
1280 
1281 template <typename TValue, typename TScale, typename TCounter>
1282 void
1284 {
1285  // The first bin always exists
1286  m_TimeBins.front().histogram.Add(value);
1287 }
1288 
1289 
1290 template <typename TValue, typename TScale, typename TCounter>
1291 void
1293 {
1294  ++m_CurrentTick;
1295 
1296  m_Mutex.lock();
1297  x_Shift(0, m_TimeBins.begin());
1298  m_Mutex.unlock();
1299 }
1300 
1301 
1302 template <typename TValue, typename TScale, typename TCounter>
1303 void
1305 {
1306  m_Mutex.lock();
1307  m_CurrentTick = 0;
1308  while (m_TimeBins.size() > 1)
1309  m_TimeBins.pop_back();
1310  m_TimeBins.front().histogram.Reset();
1311  m_TimeBins.front().n_ticks = 1; // Just in case
1312  m_Mutex.unlock();
1313 }
1314 
1315 
1316 template <typename TValue, typename TScale, typename TCounter>
1317 void
1319  typename ncbi::CHistogramTimeSeries<TValue, TScale, TCounter>::TTimeBins::iterator current_it)
1320 {
1321  if (m_TimeBins.size() <= index + 1) {
1322  // There is not enough bins. Need to add one
1323  x_AppendBin(m_TimeBins.front().histogram, TTicks(1) << index);
1324 
1325  // Move from index to index + 1
1326  auto next_it = current_it;
1327  ++next_it;
1328  next_it->histogram.StealCountersFrom(current_it->histogram);
1329 
1330  if (index != 0)
1331  current_it->n_ticks /= 2;
1332  return;
1333  }
1334 
1335  auto next_it = current_it;
1336  ++next_it;
1337 
1338  if (next_it->n_ticks == TTicks(1) << index) {
1339  // Half full bin; make it full and just accumulate
1340  next_it->n_ticks *= 2;
1341  next_it->histogram.StealCountersFrom(current_it->histogram);
1342  if (index != 0)
1343  current_it->n_ticks /= 2;
1344  return;
1345  }
1346 
1347  // The next bin is full; need to move further
1348  x_Shift(index + 1, next_it);
1349  // Move to the vacant next
1350  next_it->histogram.StealCountersFrom(current_it->histogram);
1351  // The current one is a half now
1352  if (index != 0)
1353  current_it->n_ticks /= 2;
1354 }
1355 
1356 
1357 template <typename TValue, typename TScale, typename TCounter>
1360 {
1361  m_Mutex.lock();
1362  TTimeBins ret = m_TimeBins;
1363  m_Mutex.unlock();
1364  return ret;
1365 }
1366 
1367 
1368 template <typename TValue, typename TScale, typename TCounter>
1369 void
1371  const THistogram& model_histogram, TTicks n_ticks)
1372 {
1373  m_TimeBins.push_back(
1374  STimeBin{model_histogram.Clone(THistogram::eCloneStructureOnly),
1375  n_ticks} );
1376 }
1377 
1378 
1380 
1381 #endif /* DATA_HISTOGRAM__HPP */
CCoreException –.
Definition: ncbiexpt.hpp:1476
A series of same-structured histograms covering logarithmically (base 2) increasing time periods....
CHistogram – collect the distribution of the numerical data samples.
void Add(const V &v)
Include a standard set of the NCBI C++ Toolkit most basic headers.
#define T(s)
Definition: common.h:230
#define false
Definition: bool.h:36
static int type
Definition: getdata.c:31
static char tmp[3200]
Definition: utf8.c:42
Uint8 uint64_t
Int8 int64_t
#define NCBI_THROW(exception_class, err_code, message)
Generic macro to throw an exception, given the exception class, error code and message string.
Definition: ncbiexpt.hpp:704
EScaleType
Definition: glpane.hpp:47
#define END_NCBI_SCOPE
End previously defined NCBI scope.
Definition: ncbistl.hpp:103
#define BEGIN_NCBI_SCOPE
Define ncbi namespace.
Definition: ncbistl.hpp:100
unsigned GetNumberOfBins() const
Return the number ot bins on the combined scale.
void x_AddCountersFrom(CHistogram &other)
Add counters from 'other' histogram.
TCounter m_UpperAnomalyCount
Number of anomaly values > m_Max.
EEstimateNumberOfBinsRule
Rules to calculate an estimated numbers of bins on the base of the expected number of observations.
std::unique_ptr< TCounter[]> m_Counters
Combined scale: counters - the number of measurements for each bin.
unsigned int TTicks
Type of the unit of time.
typename std::conditional< std::is_floating_point< TValue >::value, double, TIntegral >::type TArithmetic
void x_AddBisection(TValue value)
Add value to the data distribution using a bisection search method.
void x_AppendBin(const THistogram &model_histogram, TTicks n_ticks)
CHistogram(CHistogram &&other)
void MT_Lock() const
MT locking.
void x_CalculateBinsLog(TScale start_value, TScale end_value, TScale *arr, unsigned pos, unsigned n, EScaleType scale_type, EScaleView scale_view)
Calculate bins starting positions for a logarithmic scale.
TCounter m_LowerAnomalyCount
Number of anomaly values < m_Min.
CHistogram Clone(EClone how=eCloneAll) const
Clone histogram structure.
void x_Add(TValue value)
Add value to the data distribution (internal version without locking).
TSum m_Sum
Sum of the all added values (if applicable for TValue)
THistogram histogram
Histogram for the ticks.
TTicks GetCurrentTick(void) const
Number of ticks the histogram series has handled.
list< STimeBin > TTimeBins
Type of the series of histograms.
TSum GetSum(void) const
Return the sum of all added values.
CHistogram & operator=(const CHistogram &)
TCounter m_Count
Number of counted values (sum all m_Counters[])
void x_CalculateBinsLinear(TScale start_value, TScale end_value, TScale step, TScale *arr, unsigned pos, unsigned n, EScaleView scale_view)
Calculate bins starting positions for a linear scale.
size_t GetLowerAnomalyCount() const
Get number of hits whose values were less than GetMin().
TValue GetMin() const
Get the lower bound of the combined scale.
typename std::conditional< std::is_arithmetic< TValue >::value, TArithmetic, TValue >::type TSum
std::mutex m_Mutex
MT protection mutex.
TScale x_FuncInverse(EScaleType scale_type, TScale scale_value)
Inverse scale function.
void AddCountersFrom(const CHistogram &other)
Add counters from 'other' histogram to this histogram, 'other' doesn't changes.
void GetBinCounters(vector< TCounter > &counters)
Get counters for the combined scale's bins.
void x_CalculateBins(TScale start_value, TScale end_value, unsigned pos, unsigned n, EScaleType scale_type, EScaleView scale_view)
Calculate bins starting positions.
void EnableMT(void)
Add MT protection to histogram.
TValue m_Max
Maximum value (the upper bound of combined scale)
const TScale * GetBinStartsPtr() const
Get starting positions for bins on the combined scale (not MT safe).
bool x_IsEqual(TScale a, TScale b)
Check that 'a' and 'b' scale values are equal (or almost equal for floating scales).
TValue GetMax() const
Get the upper bound of the combined scale.
TValue m_Min
Minimum value (the lower bound of combined scale)
CHistogram & operator=(CHistogram &&other)
TScale x_Func(EScaleType scale_type, TScale value)
Scale function.
void x_AddLinear(TValue value)
Add value to the data distribution using a linear search method.
STimeBin & operator=(const STimeBin &other)
constexpr bool g_HistogramValueTypeHavePlus()
unsigned m_NumBins
Number of bins (m_Starts[]/m_Counts[] length)
CHistogram(const CHistogram &)
Prevent copying.
void Add(TValue value)
Add value to the data distribution.
STimeBin(THistogram &&h, TTicks t)
void x_MoveFrom(CHistogram &other)
Move data from 'other' histogram. 'other' became invalid.
void AddLeftScale(TValue min_value, unsigned n_bins, EScaleType scale_type)
Add auxiliary left/right scales.
typename std::conditional< std::numeric_limits< TValue >::is_signed, int64_t, uint64_t >::type TIntegral
Sum type: double for all floating points TValue types, int64_t/uint64_t for integral,...
const TCounter * GetBinCountersPtr() const
Get counters for the combined scale's bins (not MT safe).
void Add(const V &v)
Add value to the data distribution.
std::unique_ptr< TScale[]> m_Starts
Combined scale: starting bins positions.
TCounter GetCount() const
Get total number of hits whose value fell between GetMin() and GetMax().
CHistogram(void)
Default constructor.
size_t GetUpperAnomalyCount() const
Get number of hits whose values were greater than GetMax().
void Rotate()
Merge the most recent (now active) histogram data into the time series.
STimeBin(const STimeBin &other)
CHistogram< TValue, TScale, TCounter > THistogram
void x_Shift(size_t index, typename TTimeBins::iterator current_it)
CHistogram(TValue min_value, TValue max_value, unsigned n_bins, EScaleType scale_type=eLinear, EScaleView scale_view=eMonotonic)
Constructor.
static unsigned EstimateNumberOfBins(size_t n, EEstimateNumberOfBinsRule rule=0)
Estimate numbers of bins on the base of the expected number of observations 'n'.
void MT_Unlock() const
EScaleView
Methods to build bins for a specified scale.
#define RETURN_MT_SAFE(member)
void Reset()
Reset to the initial state.
CHistogramTimeSeries(THistogram &model_histogram)
void Reset()
Reset all data counters.
bool m_IsMT
MT protection flag.
decltype((T &)(std::declval< T >().operator+=(std::declval< const T & >()))) T_HistogramValueTypeHavePlus
Helper types for CHistogram<>::GetSum() support.
void GetBinStarts(vector< TScale > &positions)
Get starting positions for bins on the combined scale.
void StealCountersFrom(CHistogram &other)
Add counters from 'other' histogram to this histogram, then reset the counters of 'other' histogram.
void x_Reset()
Reset all data counters (internal version without locking).
TTicks n_ticks
Number of ticks in this histogram.
TTimeBins GetHistograms() const
Histograms – in the order from the most recent to the least recent.
void AddRightScale(TValue max_value, unsigned n_bins, EScaleType scale_type)
@ eJuran
Juran's "Quality Control Handbook" that provide guidelines to select the number of bins for histogram...
@ eRice
Rice's rule. Presented as a simple alternative to Sturge's rule.
@ eSquareRoot
Square root rule.
@ eSturge
Herbert Sturge's rule.
@ eLog2
Binary logarithmic scale with a base 2.
@ eLog10
Common logarithmic scale with a base 10.
@ eLog
Natural logarithmic scale with a base e ~ 2.72.
@ eLinear
Arithmetic or linear scale.
@ eCloneStructureOnly
Clone structure only (the counters will be zeroed)
@ eCloneAll
Clone whole histogram, with scale and counters.
@ eMonotonic
Use specified scale method to calculate bins sizes from a minimum to a maximum value.
@ eSymmetrical
Determine a mean for a specified value range and calculates bins sizes using specified scale to both ...
unsigned int
A callback function used to compare two keys in a database.
Definition: types.hpp:1210
int i
yy_size_t n
#define S(x, n)
const GenericPointer< typename T::ValueType > T2 value
Definition: pointer.h:1227
#define min_value(a, b)
Definition: ncbi_c_log.c:140
#define fabs(v)
Definition: ncbi_dispd.c:46
unsigned int a
Definition: ncbi_localip.c:102
EIPRangeType t
Definition: ncbi_localip.c:101
The NCBI C++/STL use hints.
T log2(T x_)
T max(T x_, T y_)
T log10(T x_)
T min(T x_, T y_)
Uint4 end_value
A histograms which covers a certain number of ticks.
Helper template to check that type Type have some method declared using TypeChecker<Type>.
Definition: ncbistl.hpp:263
#define _TROUBLE
#define _ASSERT
Modified on Mon Apr 22 04:07:24 2024 by modify_doxy.py rev. 669887