hepstats.modeling.bayesian_blocks module

Bayesian Blocks implementation

Dynamic programming algorithm for finding the optimal adaptive-width histogram. Modified from the Bayesian Blocks Python implementation found in astroML [VCIG12].

  • Based on Scargle et al. 2012 [SNJC13]

  • Initial Python Implementation [BB_]

  • Initial Examination in HEP context [PBS17]
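
The dynamic-programming recursion mentioned above can be sketched in a few lines of NumPy. This is a minimal illustration for unweighted data, written in the spirit of the astroML version and not taken from the hepstats source. The function name `bayesian_blocks_sketch` is hypothetical, the block fitness is assumed to be the event-data form from [SNJC13], and the `p0` penalty is assumed to follow the empirical calibration given there. It expects at least two distinct values.

```python
import numpy as np


def bayesian_blocks_sketch(data, p0=0.05):
    """Illustrative, unweighted Bayesian Blocks (hypothetical helper)."""
    # Collapse repeated values into cells that carry their multiplicity.
    t, counts = np.unique(np.asarray(data, dtype=float), return_counts=True)
    n = t.size

    # Candidate edges: the data limits plus midpoints between neighbours.
    edges = np.concatenate([t[:1], 0.5 * (t[1:] + t[:-1]), t[-1:]])
    dist_to_end = t[-1] - edges

    # Assumed penalty per change point (calibration from Scargle et al.).
    ncp_prior = 4 - np.log(73.53 * p0 * n ** -0.478)

    best = np.zeros(n)             # best total fitness up to cell k
    last = np.zeros(n, dtype=int)  # start index of the final block at cell k

    for k in range(n):
        # Width and event count of every block that could end at cell k.
        width = dist_to_end[: k + 1] - dist_to_end[k + 1]
        count = np.cumsum(counts[: k + 1][::-1])[::-1]

        # Fitness of the final block, penalised once per block.
        fit = count * (np.log(count) - np.log(width)) - ncp_prior
        # Add the best fitness of everything before the final block.
        fit[1:] += best[:k]

        last[k] = np.argmax(fit)
        best[k] = fit[last[k]]

    # Walk backwards through the stored block starts to recover the edges.
    change_points = []
    ind = n
    while ind > 0:
        change_points.append(ind)
        ind = last[ind - 1]
    change_points.append(0)

    return edges[change_points[::-1]]
```

The returned array can be passed directly as the `bins` argument of `np.histogram`. The loop is quadratic in the number of distinct values, which is the cost of checking every possible start for the final block at each step.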

class hepstats.modeling.bayesian_blocks.Prior(p0=0.05, gamma=None)

Bases: object

Helper class for calculating the prior on the fitness function.

Parameters:
  • p0 (float) – False-positive rate, between 0 and 1. A lower value places a stricter penalty on creating bin edges, reducing the chance of false-positive edges. In general, the larger the number of bins, the smaller p0 should be to prevent the creation of spurious, jagged bins. Defaults to 0.05.

  • gamma (float | None) – If specified, then use this gamma to compute the general prior form, \(p \sim \gamma^N\). If gamma is specified, p0 is ignored. Defaults to None.

calc(N)

Computes the prior.

Parameters:

N (int) – N-th change point.

Return type:

float

Returns:

the prior.
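
As a rough illustration of how the two parameters interact, the sketch below computes a per-change-point penalty. It is an assumption about the form of the calculation, not the hepstats source: the `gamma` branch follows from \(p \sim \gamma^N\) (a penalty of \(-\log\gamma\) per change point), and the `p0` branch assumes the empirical calibration from [SNJC13]. The name `prior_sketch` is hypothetical.

```python
import math


def prior_sketch(N, p0=0.05, gamma=None):
    """Hypothetical stand-in for Prior.calc; returns a log-scale penalty."""
    if gamma is not None:
        # General form p ~ gamma**N: each change point costs -log(gamma),
        # and p0 is ignored.
        return -math.log(gamma)
    # Assumed calibration from Scargle et al. in terms of p0 and N.
    return 4 - math.log(73.53 * p0 * N ** -0.478)
```

Under these assumptions a smaller `p0` gives a larger penalty, so fewer bin edges are accepted, which matches the parameter description above.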

hepstats.modeling.bayesian_blocks.bayesian_blocks(data, weights=None, p0=0.05, gamma=None)

Bayesian Blocks Implementation.

This is a flexible implementation of the Bayesian Blocks algorithm described in [SNJC13]. It has been modified to natively accept weighted events, for ease of use in HEP applications.

Parameters:
  • data (Iterable | ndarray) – Input data values (one dimensional, length N). Repeat values are allowed.

  • weights (Iterable | ndarray | None) – Weights for data (otherwise assume all data points have a weight of 1). Must be same length as data. Defaults to None.

  • p0 (float) – False-positive rate, between 0 and 1. A lower value places a stricter penalty on creating bin edges, reducing the chance of false-positive edges. In general, the larger the number of bins, the smaller p0 should be to prevent the creation of spurious, jagged bins. Defaults to 0.05.

  • gamma (float | None) – If specified, then use this gamma to compute the general prior form, \(p \sim \gamma^N\). If gamma is specified, p0 is ignored. Defaults to None.

Return type:

ndarray

Returns:

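Array containing the bin edges. A result with K bins has K+1 edges; the number of bins is chosen by the algorithm and is generally much smaller than the length of data.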

Examples

Unweighted data:

>>> import numpy as np
>>> from hepstats.modeling.bayesian_blocks import bayesian_blocks
>>> d = np.random.normal(size=100)
>>> bins = bayesian_blocks(d, p0=0.01)

Unweighted data with repeats:

>>> d = np.random.normal(size=100)
>>> d[80:] = d[:20]
>>> bins = bayesian_blocks(d, p0=0.01)

Weighted data:

>>> d = np.random.normal(size=100)
>>> w = np.random.uniform(1, 2, size=100)
>>> bins = bayesian_blocks(d, w, p0=0.01)