# Data Modeling (skhep.modeling)¶

Module for algorithms and methods used to model distributions.

This module contains in particular:

• Bayesian Blocks binning algorithm.

## Bayesian Block implementation¶

Dynamic programming algorithm for finding the optimal adaptive-width histogram.

• Based on Scargle et al 2012 [1]
• Initial Python Implementation [2]
• AstroML Implementation [3]

References

skhep.modeling.bayesian_blocks.bayesian_blocks(data, weights=None, p0=0.05, gamma=None)

Bayesian Blocks Implementation.

This is a flexible implementation of the Bayesian Blocks algorithm described in Scargle 2012 [1]. It has been modified to natively accept weighted events, for ease of use in HEP applications.

Parameters: data (array) – Input data values (one dimensional, length N). Repeat values are allowed. weights (array_like, optional) – Weights for data (otherwise assume all data points have a weight of 1). Must be same length as data. Defaults to None. p0 (float, optional) – False-positive rate, between 0 and 1. A lower number places a stricter penalty against creating more bin edges, thus reducing the potential for false-positive bin edges. Defaults to 0.05. gamma (float, optional) – If specified, then use this gamma to compute the general prior form, p ~ gamma^N. If gamma is specified, p0 is ignored. Defaults to None. Array containing the (N+1) bin edges edges (ndarray)

Examples

Event data:

>>> d = np.random.normal(size=100)
>>> bins = bayesian_blocks(d, p0=0.01)

Event data with repeats:

>>> d = np.random.normal(size=100)
>>> d[80:] = d[:20]
>>> bins = bayesian_blocks(d, p0=0.01)

Event data with weights:

>>> d = np.random.normal(size=100)
>>> w = np.random.uniform(1,2, size=100)
>>> bins = bayesian_blocks(d, w, p0=0.01)