Data Modeling (skhep.modeling)

Module for algorithms and methods used to model distributions.

This module contains in particular:

  • Bayesian Blocks binning algorithm.

Bayesian Block implementation

Dynamic programming algorithm for finding the optimal adaptive-width histogram.

  • Based on Scargle et al 2012 [1]
  • Initial Python Implementation [2]
  • AstroML Implementation [3]

References

[1](1, 2) http://adsabs.harvard.edu/abs/2012arXiv1207.5578S
[2]http://jakevdp.github.com/blog/2012/09/12/dynamic-programming-in-python/
[3]https://github.com/astroML/astroML/blob/master/astroML/density_estimation/bayesian_blocks.py
skhep.modeling.bayesian_blocks.bayesian_blocks(data, weights=None, p0=0.05, gamma=None)

Bayesian Blocks Implementation.

This is a flexible implementation of the Bayesian Blocks algorithm described in Scargle 2012 [1]. It has been modified to natively accept weighted events, for ease of use in HEP applications.

Parameters:
  • data (array) – Input data values (one dimensional, length N). Repeat values are allowed.
  • weights (array_like, optional) –

    Weights for data (otherwise assume all data points have a weight of 1). Must be same length as data.

    Defaults to None.

  • p0 (float, optional) –

    False-positive rate, between 0 and 1. A lower number places a stricter penalty against creating more bin edges, thus reducing the potential for false-positive bin edges.

    Defaults to 0.05.

  • gamma (float, optional) –

    If specified, then use this gamma to compute the general prior form, p ~ gamma^N. If gamma is specified, p0 is ignored.

    Defaults to None.

Returns:

Array containing the (N+1) bin edges

Return type:

edges (ndarray)

Examples

Event data:

>>> d = np.random.normal(size=100)
>>> bins = bayesian_blocks(d, p0=0.01)

Event data with repeats:

>>> d = np.random.normal(size=100)
>>> d[80:] = d[:20]
>>> bins = bayesian_blocks(d, p0=0.01)

Event data with weights:

>>> d = np.random.normal(size=100)
>>> w = np.random.uniform(1,2, size=100)
>>> bins = bayesian_blocks(d, w, p0=0.01)

See also

skhep.visual.MplPlotter.hist:
Histogram plotting function which can natively make use of bayesian blocks.