hepstats.hypotests.calculators.asymptotic_calculator module#

hepstats.hypotests.calculators.asymptotic_calculator.generate_asimov_hist(model, params, nbins=None)[source]#

Generate the Asimov histogram using a model and dictionary of parameters.

Parameters:
  • model – model used to generate the dataset.

  • params (dict[Any, dict[str, Any]]) – values of the parameters of the model.

  • nbins (int | None) – number of bins.

Return type:

tuple[ndarray, ndarray]

Returns:

Tuple of hist and bin_edges.

Example with zfit:
>>> obs = zfit.Space('x', limits=(0.1, 2.0))
>>> mean = zfit.Parameter("mu", 1.2)
>>> sigma = zfit.Parameter("sigma", 0.1)
>>> model = zfit.pdf.Gauss(obs=obs, mu=mean, sigma=sigma)
>>> hist, bin_edges = generate_asimov_hist(model, {mean: {"value": 1.2}, sigma: {"value": 0.1}})
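To make the idea of an Asimov histogram concrete, here is a minimal pure-Python sketch. It assumes the expected bin content is the pdf evaluated at the bin centre multiplied by the bin width; the helper name `asimov_hist_sketch` is hypothetical, and the Gaussian is not renormalised to the observable range, so this is an illustration of the concept rather than the library's implementation.

```python
from statistics import NormalDist


def asimov_hist_sketch(pdf, lower, upper, nbins=100):
    """Expected bin contents: pdf at each bin centre times the bin width."""
    width = (upper - lower) / nbins
    # nbins + 1 equally spaced edges between lower and upper
    bin_edges = [lower + i * width for i in range(nbins + 1)]
    centers = [edge + width / 2 for edge in bin_edges[:-1]]
    hist = [pdf(c) * width for c in centers]
    return hist, bin_edges


# Gaussian with the same parameter values as in the example above
gauss = NormalDist(mu=1.2, sigma=0.1)
hist, bin_edges = asimov_hist_sketch(gauss.pdf, 0.1, 2.0, nbins=100)
```

Because no random sampling is involved, the resulting histogram has no statistical fluctuations, which is the defining property of an Asimov dataset.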
hepstats.hypotests.calculators.asymptotic_calculator.generate_asimov_dataset(data, model, is_binned, nbins, values)[source]#

Generate the Asimov dataset using a model and dictionary of parameters.

Parameters:
  • data – Data; the generated dataset is built with the same class.

  • model – Model to use for the generation. Can be binned or unbinned.

  • is_binned – If the model is binned.

  • nbins – Number of bins for the Asimov dataset.

  • values – Dictionary of parameter values.

Returns:

The Asimov dataset.

class hepstats.hypotests.calculators.asymptotic_calculator.AsymptoticCalculator(input, minimizer, asimov_bins=None)[source]#

Bases: BaseCalculator

Class for asymptotic calculators, using asymptotic formulae of the likelihood ratio described in [CCGV11]. Can be used only with one parameter of interest.

Asymptotic calculator class using Wilks' and Wald's asymptotic formulae.

The asymptotic calculator is significantly faster than the frequentist calculator, as it does not require computing the frequentist p-value, which involves generating and fitting toys (sample-and-fit).

Parameters:
  • input – loss or fit result.

  • minimizer – minimizer to use to find the minimum of the loss function.

  • asimov_bins (int | list[int] | None) – number of bins of the Asimov dataset.

Example with zfit:
>>> import numpy as np
>>> import zfit
>>> from zfit.loss import UnbinnedNLL
>>> from zfit.minimize import Minuit
>>> from hepstats.hypotests.calculators import AsymptoticCalculator
>>>
>>> obs = zfit.Space('x', limits=(0.1, 2.0))
>>> data = zfit.data.Data.from_numpy(obs=obs, array=np.random.normal(1.2, 0.1, 10000))
>>> mean = zfit.Parameter("mu", 1.2)
>>> sigma = zfit.Parameter("sigma", 0.1)
>>> model = zfit.pdf.Gauss(obs=obs, mu=mean, sigma=sigma)
>>> loss = UnbinnedNLL(model=model, data=data)
>>>
>>> calc = AsymptoticCalculator(input=loss, minimizer=Minuit(), asimov_bins=100)
UNBINNED_TO_BINNED_LOSS = {}#
static check_pois(pois)[source]#

Checks if the parameter of interest is a hepstats.parameters.POIarray instance.

Parameters:

pois (POI | POIarray) – the parameter of interest to check.

Raises:

TypeError – if pois is not an instance of hepstats.parameters.POIarray.

asimov_dataset(poi, ntrials_fit=None)[source]#

Gets the Asimov dataset for a given alternative hypothesis.

Parameters:
  • poi (POI) – parameter of interest of the alternative hypothesis.

  • ntrials_fit (int | None) – maximum number of fits to perform (default: 5).

Returns:

The Asimov dataset.

Example with zfit:
>>> poialt = POI(mean, 1.2)
>>> dataset = calc.asimov_dataset(poialt)
asimov_loss(poi)[source]#

Constructs a loss function using the Asimov dataset for a given alternative hypothesis.

Parameters:

poi (POI) – parameter of interest of the alternative hypothesis.

Returns:

Loss function.

Example with zfit:
>>> poialt = POI(mean, 1.2)
>>> loss = calc.asimov_loss(poialt)
asimov_nll(pois, poialt)[source]#

Computes negative log-likelihood values for given parameters of interest using the Asimov dataset generated with a given alternative hypothesis.

Parameters:
  • pois (POIarray) – parameters of interest.

  • poialt (POI) – parameter of interest of the alternative hypothesis.

Return type:

ndarray

Returns:

Array of nll values for the alternative hypothesis.

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poinull = POIarray(mean, [1.1, 1.2, 1.0])
>>> poialt = POI(mean, 1.2)
>>> nll = calc.asimov_nll(poinull, poialt)
pnull(qobs, qalt=None, onesided=True, onesideddiscovery=False, qtilde=False, nsigma=0)[source]#

Computes the pvalue for the null hypothesis.

Parameters:
  • qobs (ndarray) – observed values of the test-statistic q.

  • qalt (ndarray | None) – alternative values of the test-statistic q using the Asimov dataset.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery (default: False).

  • qtilde (bool) – if True use the \(\widetilde{q}\) test statistics else (default) use the \(q\) test statistic.

  • nsigma (int) – significance shift.

Return type:

ndarray

Returns:

Array of the pvalues for the null hypothesis.
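As an illustration of how a p-value follows from the test statistic in the asymptotic regime, the sketch below uses the standard result that, under the null hypothesis, the one-sided p-value is 1 − Φ(√q), with Φ the standard normal CDF. The function name `pnull_sketch` is hypothetical, the doubling for the two-sided case and the way `nsigma` shifts the argument are assumptions, and the \(\widetilde{q}\) correction involving qalt is deliberately left out.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal cumulative distribution function
_PHI = NormalDist().cdf


def pnull_sketch(qobs, onesided=True, onesideddiscovery=False, nsigma=0):
    """Asymptotic p-values of the null hypothesis for a list of q values."""
    pvalues = []
    for q in qobs:
        # Upper tail probability of a standard normal at sqrt(q), shifted by nsigma
        tail = 1.0 - _PHI(sqrt(max(q, 0.0)) - nsigma)
        if not (onesided or onesideddiscovery):
            # Two-sided test: both tails contribute
            tail *= 2.0
        pvalues.append(tail)
    return pvalues


p_five_sigma = pnull_sketch([25.0])[0]  # q = 25 corresponds to sqrt(q) = 5
```

With q = 25 the one-sided p-value is the familiar five-sigma tail probability, about 2.9e-7.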

qalt(poinull, poialt, onesided, onesideddiscovery, qtilde=False)[source]#

Computes alternative hypothesis values of the \(\Delta\) log-likelihood test statistic using the Asimov dataset.

Parameters:
  • poinull (POIarray) – parameters of interest for the null hypothesis.

  • poialt (POI) – parameters of interest for the alternative hypothesis.

  • onesided (bool) – if True computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery test.

  • qtilde (bool) – if True use the \(\widetilde{q}\) test statistics else (default) use the \(q\) test statistic.

Return type:

ndarray

Returns:

Q values for the alternative hypothesis.

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poinull = POIarray(mean, [1.1, 1.2, 1.0])
>>> poialt = POI(mean, 1.2)
>>> q = calc.qalt(poinull, poialt, onesided=True, onesideddiscovery=False)
palt(qobs, qalt, onesided=True, onesideddiscovery=False, qtilde=False)[source]#

Computes the pvalue for the alternative hypothesis.

Parameters:
  • qobs (ndarray) – observed values of the test-statistic q.

  • qalt (ndarray) – alternative values of the test-statistic q using the Asimov dataset.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery (default: False).

  • qtilde (bool) – if True use the \(\widetilde{q}\) test statistics else (default) use the \(q\) test statistic.

Return type:

ndarray

Returns:

Array of the pvalues for the alternative hypothesis.
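For the alternative hypothesis, the asymptotic formulae shift the normal distribution by √q evaluated on the Asimov dataset. The sketch below covers only the one-sided case and assumes the p-value is 1 − Φ(√qobs − √qalt); the name `palt_sketch` is hypothetical and the two-sided and \(\widetilde{q}\) variants are not covered.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal cumulative distribution function
_PHI = NormalDist().cdf


def palt_sketch(qobs, qalt):
    """One-sided asymptotic p-values of the alternative hypothesis."""
    pvalues = []
    for q_observed, q_asimov in zip(qobs, qalt):
        # sqrt(q_asimov) plays the role of the expected separation in sigma units
        shift = sqrt(max(q_observed, 0.0)) - sqrt(max(q_asimov, 0.0))
        pvalues.append(1.0 - _PHI(shift))
    return pvalues


p_median = palt_sketch([4.0], [4.0])[0]  # observation equal to the Asimov expectation
```

When the observed statistic equals its Asimov value, the observation sits at the median of the alternative distribution and the p-value is 0.5.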

property bestfit#

Returns the best fit values of the model parameters.

static check_pois_compatibility(poi1, poi2)#

Checks that two parameters of interest (hepstats.parameters.POI or hepstats.parameters.POIarray instances) are compatible.

Parameters:
  • poi1 (POI | POIarray) – the first parameter of interest.

  • poi2 (POI | POIarray) – the second parameter of interest.

Raises:

ValueError – if the two parameters of interest do not refer to the same parameters, compared by name.

property constraints#

Returns the constraints on the loss / likelihood function.

property data#

Returns the data.

expected_pvalue(poinull, poialt, nsigma, CLs=False, qtilde=False, onesided=True, onesideddiscovery=False)#

Computes the expected pvalues and error bands for different values of \(\sigma\) (0=expected/median)

Parameters:
  • poinull (POI | POIarray) – parameters of interest for the null hypothesis.

  • poialt (POI | POIarray) – parameters of interest for the alternative hypothesis.

  • nsigma (list[int]) – list of values of \(\sigma\) to compute the expected pvalue.

  • CLs (bool) – if True computes pvalues as \(p_{cls}=p_{null}/p_{alt}=p_{clsb}/p_{clb}\) else as \(p_{clsb} = p_{null}\).

  • qtilde (bool) – if True use the \(\widetilde{q}\) test statistics else (default) use the \(q\) test statistic.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery (default: False).

Return type:

list[array]

Returns:

Array of expected pvalues for each \(\sigma\) value

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poinull = POIarray(mean, [1.1, 1.2, 1.0])
>>> poialt = POI(mean, 1.2)
>>> pvalues = calc.expected_pvalue(poinull, poialt, nsigma=[-2, -1, 0, 1, 2])
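The relation between the \(\sigma\) bands and the CLs ratio described in the parameter list can be sketched in plain Python. The sketch assumes that the expected \(p_{clsb}\) at a given band is 1 − Φ(√qalt − nsigma) and that \(p_{clb}\) is Φ(nsigma); the function name `expected_pvalue_sketch` is hypothetical and the code is an illustration of the formulae, not the library's implementation.

```python
from math import sqrt
from statistics import NormalDist

# Standard normal cumulative distribution function
_PHI = NormalDist().cdf


def expected_pvalue_sketch(qalt, nsigma, CLs=False):
    """Expected p-values for one Asimov q value, one entry per sigma band."""
    sqrt_qalt = sqrt(max(qalt, 0.0))
    expected = []
    for ns in nsigma:
        p_clsb = 1.0 - _PHI(sqrt_qalt - ns)
        if CLs:
            # CLs divides by the p-value of the background-only hypothesis
            expected.append(p_clsb / _PHI(ns))
        else:
            expected.append(p_clsb)
    return expected


bands = expected_pvalue_sketch(4.0, nsigma=[-1, 0, 1])
median_cls = expected_pvalue_sketch(4.0, nsigma=[0], CLs=True)[0]
```

The entry for nsigma = 0 is the median expected p-value; at the median \(p_{clb}\) is 0.5, so the CLs value is twice \(p_{clsb}\).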
get_parameter(name)#

Returns the parameter of the loss function with the given name.

Parameters:

name (str) – name of the parameter to return

property loss#

Returns the loss / likelihood function.

lossbuilder(model, data, weights=None, oldloss=None)#

Method to build a new loss function.

Parameters:
  • model (*) – The model or models to evaluate the data on

  • data (*) – Data to use

  • weights (*) – the data weights

  • oldloss (*) – Previous loss that has data, models, type

Returns:

Loss function

Example with zfit:
>>> data = zfit.data.Data.from_numpy(obs=obs, array=np.random.normal(1.2, 0.1, 10000))
>>> mean = zfit.Parameter("mu", 1.2)
>>> sigma = zfit.Parameter("sigma", 0.1)
>>> model = zfit.pdf.Gauss(obs=obs, mu=mean, sigma=sigma)
>>> loss = calc.lossbuilder(model, data)

property minimizer#

Returns the minimizer.

property model#

Returns the model.

obs_nll(pois)#

Compute observed negative log-likelihood values for given parameters of interest.

Parameters:

pois (POIarray) – parameters of interest.

Return type:

ndarray

Returns:

Observed nll values.

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poi = POIarray(mean, [1.1, 1.2, 1.0])
>>> nll = calc.obs_nll(poi)
property parameters#

Returns the list of free parameters in loss / likelihood function.

pvalue(poinull, poialt=None, qtilde=False, onesided=True, onesideddiscovery=False)#

Computes pvalues for the null and alternative hypothesis.

Parameters:
  • poinull (POI | POIarray) – parameters of interest for the null hypothesis.

  • poialt (POI | None) – parameters of interest for the alternative hypothesis.

  • qtilde (bool) – if True use the \(\widetilde{q}\) test statistics else (default) use the \(q\) test statistic.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery test (default: False).

Return type:

tuple[ndarray, ndarray]

Returns:

Tuple of arrays for pnull, palt

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poinull = POIarray(mean, [1.1, 1.2, 1.0])
>>> poialt = POI(mean, 1.2)
>>> pvalues = calc.pvalue(poinull, poialt)
q(nll1, nll2, poi1, poi2, onesided=True, onesideddiscovery=False)#

Compute values of the test statistic q defined as the difference between negative log-likelihood values \(q = nll1 - nll2\).

Parameters:
  • nll1 (array) – array of nll values #1, evaluated with poi1.

  • nll2 (array) – array of nll values #2, evaluated with poi2.

  • poi1 (POIarray) – POI’s #1.

  • poi2 (POIarray) – POI’s #2.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True computes onesided pvalues for a discovery (default: False).

Return type:

ndarray

Returns:

Array of \(q\) values.
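The onesided and onesideddiscovery flags change when the test statistic is set to zero. A minimal sketch of the usual conventions, applied to a single already computed q value: for an upper-limit test, a best fit above the tested value is not counted as evidence against it; for a discovery test, a best fit below the tested value is not counted. The names `clip_q_sketch`, `poi_tested` and `poi_bestfit` are hypothetical, and the mapping onto poi1 and poi2 in the method above is an assumption.

```python
def clip_q_sketch(q, poi_tested, poi_bestfit, onesided=True, onesideddiscovery=False):
    """Apply one-sided conventions to a single test-statistic value."""
    if onesideddiscovery:
        # Discovery: only an excess above the tested value counts
        ignored = poi_bestfit < poi_tested
    elif onesided:
        # Upper limit: only a deficit below the tested value counts
        ignored = poi_bestfit > poi_tested
    else:
        # Two-sided: the value is returned unchanged
        return q
    return 0.0 if (ignored or q < 0) else q


q_limit = clip_q_sketch(3.0, poi_tested=1.0, poi_bestfit=1.5)
q_discovery = clip_q_sketch(3.0, poi_tested=0.0, poi_bestfit=0.4, onesideddiscovery=True)
```

In the first call the best fit lies above the tested value, so the upper-limit statistic is set to zero; in the second the excess counts and the value is kept.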

qobs(poinull, onesided=True, onesideddiscovery=True, qtilde=False)#

Computes observed values of the \(\Delta\) log-likelihood test statistic.

Parameters:
  • poinull (POI) – parameters of interest for the null hypothesis.

  • qtilde (bool) – if True use the \(\tilde{q}\) test statistics else (default) use the \(q\) test statistic.

  • onesided (bool) – if True (default) computes onesided pvalues.

  • onesideddiscovery (bool) – if True (default) computes onesided pvalues for a discovery test.

Returns:

Observed values of q.

Example with zfit:
>>> mean = zfit.Parameter("mu", 1.2)
>>> poi = POIarray(mean, [1.1, 1.2, 1.0])
>>> q = calc.qobs(poi)
set_params_to_bestfit()#

Sets the values of the parameters in the models to the best fit values.