Data Modeling (skhep.modeling)
Module for algorithms and methods used to model distributions.
This module contains in particular:
 Bayesian Blocks binning algorithm.
Bayesian Blocks implementation
Dynamic programming algorithm for finding the optimal adaptive-width histogram.
 Based on Scargle et al. 2012 [1]
 Initial Python Implementation [2]
 AstroML Implementation [3]
 Initial Examination in HEP context [4]
References
[1]  http://adsabs.harvard.edu/abs/2012arXiv1207.5578S
[2]  http://jakevdp.github.com/blog/2012/09/12/dynamic-programming-in-python/
[3]  https://github.com/astroML/astroML/blob/master/astroML/density_estimation/bayesian_blocks.py 
[4]  https://arxiv.org/abs/1708.00810 
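The dynamic programming idea described above can be sketched in a few lines of pure Python. This is a minimal illustration for unweighted data only, not the skhep implementation: the function name `bayesian_blocks_sketch` is hypothetical, the fitness is the event-data form from Scargle et al. 2012 [1], and the penalty uses the p0 calibration from the same paper evaluated at the number of distinct data values, which is an assumption here.

```python
import math
import random


def bayesian_blocks_sketch(data, p0=0.05):
    """Illustrative Bayesian Blocks for unweighted 1D data; returns bin edges."""
    # Collapse repeated values into sorted unique points with multiplicities.
    multiplicity = {}
    for value in data:
        multiplicity[value] = multiplicity.get(value, 0) + 1
    t = sorted(multiplicity)
    x = [multiplicity[v] for v in t]
    n = len(t)
    if n < 2:
        raise ValueError("need at least two distinct data values")

    # Candidate edges: both ends of the data plus midpoints between neighbours.
    edges = [t[0]] + [0.5 * (a + b) for a, b in zip(t, t[1:])] + [t[-1]]

    # Per-block penalty from the p0 calibration (assumed form, see text above).
    ncp_prior = 4.0 - math.log(73.53 * p0 * n ** -0.478)

    best = [0.0] * n  # best[k]: optimal total fitness for cells 0..k
    last = [0] * n    # last[k]: first cell of the final block in that optimum
    for k in range(n):
        best_fit, best_start, block_count = -math.inf, 0, 0
        # Try every possible start r for the final block ending at cell k.
        for r in range(k, -1, -1):
            block_count += x[r]
            width = edges[k + 1] - edges[r]
            fit = block_count * (math.log(block_count) - math.log(width)) - ncp_prior
            if r > 0:
                fit += best[r - 1]  # reuse the optimum for the cells before r
            if fit > best_fit:
                best_fit, best_start = fit, r
        best[k], last[k] = best_fit, best_start

    # Walk back through the stored block starts to recover the change points.
    change_points = [n]
    while change_points[-1] > 0:
        change_points.append(last[change_points[-1] - 1])
    change_points.reverse()
    return [edges[i] for i in change_points]


# Two well-separated populations, so more than one block is plausible.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(100)]
sample += [random.gauss(5.0, 0.5) for _ in range(100)]
edges = bayesian_blocks_sketch(sample, p0=0.01)
```

The nested loop makes the quadratic cost explicit: for each cell, every possible start of the final block is scored once, and earlier optima are reused rather than recomputed.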

class skhep.modeling.bayesian_blocks.Prior(p0=0.05, gamma=None)
Helper class for calculating the prior on the fitness function.
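As a rough illustration of what such a helper might compute, the sketch below returns the per-block penalty subtracted from the fitness. The class name `PriorSketch` and the method name `calc` are hypothetical. The p0 branch assumes the empirical calibration given in Scargle et al. 2012 [1], and the gamma branch assumes the penalty is the negative log of gamma, consistent with a prior of the form p ~ gamma^N.

```python
import math


class PriorSketch:
    """Hypothetical stand-in for a prior helper; not the skhep class itself."""

    def __init__(self, p0=0.05, gamma=None):
        self.p0 = p0
        self.gamma = gamma

    def calc(self, n):
        """Return the penalty applied per block for n data cells."""
        if self.gamma is not None:
            # A prior of gamma per block gives a log-penalty of -ln(gamma).
            return -math.log(self.gamma)
        # Assumed p0 calibration from Scargle et al. 2012.
        return 4.0 - math.log(73.53 * self.p0 * n ** -0.478)


strict = PriorSketch(p0=0.01).calc(100)
loose = PriorSketch(p0=0.05).calc(100)
from_gamma = PriorSketch(gamma=0.5).calc(100)
```

Under these assumptions a smaller p0 yields a larger penalty, which is why lowering p0 discourages additional bin edges.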

skhep.modeling.bayesian_blocks.bayesian_blocks(data, weights=None, p0=0.05, gamma=None)
Bayesian Blocks Implementation.
This is a flexible implementation of the Bayesian Blocks algorithm described in Scargle et al. 2012 [1]. It has been modified to natively accept weighted events, for ease of use in HEP applications.
Parameters:
 data (array) – Input data values (one dimensional, length N). Repeated values are allowed.
 weights (array_like, optional) – Weights for data (otherwise all data points are assumed to have a weight of 1). Must be the same length as data. Defaults to None.
 p0 (float, optional) – False-positive rate, between 0 and 1. A lower value places a stricter penalty on creating more bin edges, thus reducing the potential for false-positive bin edges. In general, the larger the number of bins, the smaller p0 should be to prevent the creation of spurious, jagged bins. Defaults to 0.05.
 gamma (float, optional) – If specified, this gamma is used to compute the general prior form, p ~ gamma^N, where N here is the number of blocks. If gamma is specified, p0 is ignored. Defaults to None.
Returns: Array containing the N+1 bin edges, where N here is the number of bins
Return type: edges (ndarray)
Examples
Unweighted data:
>>> import numpy as np
>>> from skhep.modeling.bayesian_blocks import bayesian_blocks
>>> d = np.random.normal(size=100)
>>> bins = bayesian_blocks(d, p0=0.01)
Unweighted data with repeats:
>>> d = np.random.normal(size=100)
>>> d[80:] = d[:20]
>>> bins = bayesian_blocks(d, p0=0.01)
Weighted data:
>>> d = np.random.normal(size=100)
>>> w = np.random.uniform(1,2, size=100)
>>> bins = bayesian_blocks(d, w, p0=0.01)
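Because the returned edges are generally unequal in width, bin contents usually need to be divided by the bin width before bins can be compared. The sketch below shows one way to do this with `numpy.histogram` for weighted data. The `edges` array is a placeholder built from quantiles so that the snippet runs without skhep; in practice it would be the array returned by `bayesian_blocks(d, w, p0=0.01)`.

```python
import numpy as np

rng = np.random.default_rng(0)
d = rng.normal(size=100)
w = rng.uniform(1, 2, size=100)

# Placeholder variable-width edges (assumption); substitute the
# output of bayesian_blocks(d, w, p0=0.01) when skhep is available.
edges = np.quantile(d, [0.0, 0.3, 0.7, 1.0])

# Sum of weights falling in each bin.
contents, _ = np.histogram(d, bins=edges, weights=w)

# Divide by the bin width and total weight to get a normalised density.
widths = np.diff(edges)
density = contents / (contents.sum() * widths)

# NumPy can apply the same normalisation directly.
density_np, _ = np.histogram(d, bins=edges, weights=w, density=True)
```

Both `density` and `density_np` integrate to one over the binned range, so the bar heights remain comparable even when neighbouring bins differ greatly in width.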
See also
 skhep.visual.MplPlotter.hist:
 Histogram plotting function which can natively make use of bayesian blocks.