# Data Modeling (skhep.modeling)

Module for algorithms and methods used to model distributions.

In particular, this module contains:

• Bayesian Blocks binning algorithm.

## Bayesian Blocks implementation

Dynamic programming algorithm for finding the optimal adaptive-width histogram.

• Based on Scargle et al 2012 [1]
• Initial Python Implementation [2]
• AstroML Implementation [3]
• Initial Examination in HEP context [4]
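The dynamic-programming structure can be illustrated with a minimal sketch of the unweighted, events-mode fitness from Scargle 2012: for each new data cell, the algorithm scores every candidate final block ending there, keeps the best partition, and traces back the change points at the end. This is an illustrative stand-alone implementation under the stated assumptions (distinct data values, empirical p0 prior), not the skhep code itself:

```python
import numpy as np

def bayesian_blocks_sketch(t, p0=0.05):
    """Minimal events-mode Bayesian Blocks (unweighted, distinct values)."""
    t = np.sort(np.asarray(t, dtype=float))
    N = len(t)
    # Candidate cell edges: data endpoints plus midpoints between points.
    edges = np.concatenate([[t[0]], 0.5 * (t[1:] + t[:-1]), [t[-1]]])
    block_length = t[-1] - edges
    # Empirical prior penalty per change point (Scargle 2012, eq. 21).
    ncp_prior = 4 - np.log(73.53 * p0 * N ** -0.478)
    best = np.zeros(N)             # best total fitness up to cell R
    last = np.zeros(N, dtype=int)  # start index of the optimal final block
    for R in range(N):
        # Width and event count of each candidate final block [k, R].
        width = block_length[: R + 1] - block_length[R + 1]
        count = np.arange(R + 1, 0, -1)
        # Events log-likelihood fitness: N_k * (log N_k - log T_k).
        fit = count * (np.log(count) - np.log(width))
        fit -= ncp_prior
        fit[1:] += best[:R]
        last[R] = np.argmax(fit)
        best[R] = fit[last[R]]
    # Trace back the change points to recover the final edges.
    cps = []
    ind = N
    while ind > 0:
        cps.append(ind)
        ind = last[ind - 1]
    cps.append(0)
    return edges[np.array(cps[::-1])]
```

On well-separated bimodal data this returns a handful of adaptive edges rather than the fixed grid of an ordinary histogram.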

References

class skhep.modeling.bayesian_blocks.Prior(p0=0.05, gamma=None)

Helper class for calculating the prior on the fitness function.
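The docstring does not show Prior's internals. As a hedged sketch of what such a helper conventionally computes (the function name `ncp_prior` is hypothetical), the penalty is either the explicit `-log(gamma)` prior or the empirical p0 calibration from Scargle 2012 (eq. 21):

```python
import numpy as np

def ncp_prior(N, p0=0.05, gamma=None):
    """Hypothetical stand-in for the Prior helper.

    Returns the per-change-point penalty subtracted from the block
    fitness: -log(gamma) when gamma is given (p0 is then ignored),
    otherwise the empirical calibration in terms of the false-positive
    rate p0 and the number of data points N.
    """
    if gamma is not None:
        return -np.log(gamma)
    return 4 - np.log(73.53 * p0 * N ** -0.478)
```

Note how a smaller p0 yields a larger penalty, matching the parameter description below: stricter p0 means fewer spurious bin edges.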

skhep.modeling.bayesian_blocks.bayesian_blocks(data, weights=None, p0=0.05, gamma=None)

Bayesian Blocks Implementation.

This is a flexible implementation of the Bayesian Blocks algorithm described in Scargle 2012 [1]. It has been modified to natively accept weighted events, for ease of use in HEP applications.

Parameters:

• data (array) – Input data values (one-dimensional, length N). Repeat values are allowed.
• weights (array_like, optional) – Weights for data (otherwise all data points are assumed to have a weight of 1). Must be the same length as data. Defaults to None.
• p0 (float, optional) – False-positive rate, between 0 and 1. A lower value places a stricter penalty against creating more bin edges, reducing the potential for false-positive bin edges. In general, the larger the number of bins, the smaller p0 should be to prevent the creation of spurious, jagged bins. Defaults to 0.05.
• gamma (float, optional) – If specified, use this gamma to compute the general prior form, p ~ gamma^N. If gamma is specified, p0 is ignored. Defaults to None.

Returns:

• edges (ndarray) – Array containing the (N+1) bin edges.

Examples

Unweighted data:

>>> import numpy as np
>>> from skhep.modeling.bayesian_blocks import bayesian_blocks
>>> d = np.random.normal(size=100)
>>> bins = bayesian_blocks(d, p0=0.01)


Unweighted data with repeats:

>>> d = np.random.normal(size=100)
>>> d[80:] = d[:20]
>>> bins = bayesian_blocks(d, p0=0.01)


Weighted data:

>>> d = np.random.normal(size=100)
>>> w = np.random.uniform(1, 2, size=100)
>>> bins = bayesian_blocks(d, w, p0=0.01)
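The returned edges are a plain ndarray, so they plug directly into NumPy's histogram routines. A sketch of the downstream usage, with a fixed edge array standing in for the `bayesian_blocks` output so the snippet is self-contained:

```python
import numpy as np

rng = np.random.default_rng(42)
d = rng.normal(size=100)
w = rng.uniform(1, 2, size=100)

# Stand-in for `bins = bayesian_blocks(d, w, p0=0.01)`; the function
# returns a monotone ndarray of edges, which is all np.histogram needs.
bins = np.linspace(d.min(), d.max(), 5)

# Weighted histogram over the (here fixed, normally adaptive) edges.
counts, edges = np.histogram(d, bins=bins, weights=w)
```

Since the edges span the full data range, the binned weights sum to the total event weight.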