Fitting a mixture of templates and parametric models
The class iminuit.cost.Template
supports fitting a mixture of templates and parametric models. This is useful if some components have a simple shape like a Gaussian peak, while other components are complicated and need to be estimated from simulation or a control measurement.
In this notebook, we demonstrate this usage. Our data consists of a Gaussian peak and exponential background. We fit the Gaussian peak with a parametric model and use a template to describe the exponential background.
[1]:
%config InlineBackend.figure_formats = ['svg']
from iminuit import Minuit
from iminuit.cost import Template, ExtendedBinnedNLL
from numba_stats import norm, truncexpon
import numpy as np
from matplotlib import pyplot as plt
We generate the toy data.
[2]:
rng = np.random.default_rng(1)
s = rng.normal(0.5, 0.05, size=1000)
b = rng.exponential(1, size=1000)
b = b[b < 1]
ns, xe = np.histogram(s, bins=100, range=(0, 1))
nb, _ = np.histogram(b, bins=xe)
n = ns + nb
plt.stairs(nb, xe, color="C1", fill=True, label="background")
plt.stairs(n, xe, baseline=nb, color="C0", fill=True, label="signal")
plt.legend();
Fitting a parametric component and a template
We now model the peaking component parametrically with a Gaussian. A template fit is an extended binned fit, so we need to provide a scaled cumulative density function like for iminuit.cost.ExtendedBinnedNLL
. To obtain a background template, we generate more samples from the exponential distribution and make a histogram.
[3]:
# signal model: scaled cdf of a normal distribution
def signal(xe, n, mu, sigma):
return n * norm.cdf(xe, mu, sigma)
# background template: histogram of MC simulation
rng = np.random.default_rng(2)
b2 = rng.exponential(1, size=1000)
b2 = b2[b2 < 1]
template = np.histogram(b2, bins=xe)[0]
# fit
c = Template(n, xe, (signal, template))
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1=100)
m.limits["x0_n", "x1", "x0_sigma"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[3]:
Migrad | |
---|---|
FCN = 87.01 (χ²/ndof = 0.9) | Nfcn = 148 |
EDM = 3.62e-05 (Goal: 0.0002) | |
Valid Minimum | Below EDM threshold (goal x 10) |
No parameters at limit | Below call limit |
Hesse ok | Covariance accurate |
Name | Value | Hesse Error | Minos Error- | Minos Error+ | Limit- | Limit+ | Fixed | |
---|---|---|---|---|---|---|---|---|
0 | x0_n | 990 | 40 | 0 | ||||
1 | x0_mu | 0.4951 | 0.0020 | 0 | 1 | |||
2 | x0_sigma | 0.0484 | 0.0018 | 0 | ||||
3 | x1 | 630 | 40 | 0 |
x0_n | x0_mu | x0_sigma | x1 | |
---|---|---|---|---|
x0_n | 1.43e+03 | 323e-6 (0.004) | 18.5304e-3 (0.274) | -0.4e3 (-0.284) |
x0_mu | 323e-6 (0.004) | 3.87e-06 | -0.1e-6 (-0.015) | -346e-6 (-0.004) |
x0_sigma | 18.5304e-3 (0.274) | -0.1e-6 (-0.015) | 3.21e-06 | -18.4785e-3 (-0.251) |
x1 | -0.4e3 (-0.284) | -346e-6 (-0.004) | -18.4785e-3 (-0.251) | 1.69e+03 |
The fit succeeded and the statistical uncertainty in the template is propagated into the result. You can play with this demo and see what happens if you increase the statistic of the template.
Note: the parameters of a parametric components are prefixed with xn_
where n
is the index of the component. This is to avoid name clashes between the parameter names of individual components and for clarity which parameter belongs to which component.
Extreme case: Fitting two parametric components
Although this is not recommended, you can also use the Template
class with two parametric components and no template. If you are in that situation, however, it is simpler and more efficient to use iminuit.cost.ExtendedBinnedNLL
. The following snipped is therefore just a proof that iminuit.cost.Template
handles this limiting case as well.
[4]:
# signal model: scaled cdf of a normal distribution
def signal(xe, n, mu, sigma):
return n * norm.cdf(xe, mu, sigma)
# background model: scaled cdf a an exponential distribution
def background(xe, n, mu):
return n * truncexpon.cdf(xe, xe[0], xe[-1], 0, mu)
# fit
c = Template(n, xe, (signal, background))
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1_n=100, x1_mu=1)
m.limits["x0_n", "x1_n", "x0_sigma", "x1_mu"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[4]:
Migrad | |
---|---|
FCN = 92.87 (χ²/ndof = 1.0) | Nfcn = 190 |
EDM = 1.67e-05 (Goal: 0.0002) | |
Valid Minimum | Below EDM threshold (goal x 10) |
No parameters at limit | Below call limit |
Hesse ok | Covariance accurate |
Name | Value | Hesse Error | Minos Error- | Minos Error+ | Limit- | Limit+ | Fixed | |
---|---|---|---|---|---|---|---|---|
0 | x0_n | 990 | 40 | 0 | ||||
1 | x0_mu | 0.4964 | 0.0019 | 0 | 1 | |||
2 | x0_sigma | 0.0487 | 0.0016 | 0 | ||||
3 | x1_n | 629 | 29 | 0 | ||||
4 | x1_mu | 0.97 | 0.14 | 0 |
x0_n | x0_mu | x0_sigma | x1_n | x1_mu | |
---|---|---|---|---|---|
x0_n | 1.3e+03 | 424.3e-6 (0.006) | 12.2549e-3 (0.209) | -0.3e3 (-0.252) | -0.369 (-0.076) |
x0_mu | 424.3e-6 (0.006) | 3.44e-06 | -0.1e-6 (-0.032) | 292.7e-6 (0.005) | -16.4e-6 (-0.065) |
x0_sigma | 12.2549e-3 (0.209) | -0.1e-6 (-0.032) | 2.64e-06 | -12.8693e-3 (-0.271) | -16.4e-6 (-0.075) |
x1_n | -0.3e3 (-0.252) | 292.7e-6 (0.005) | -12.8693e-3 (-0.271) | 851 | 0.337 (0.085) |
x1_mu | -0.369 (-0.076) | -16.4e-6 (-0.065) | -16.4e-6 (-0.075) | 0.337 (0.085) | 0.0183 |
The result is identical when we fit with iminuit.cost.ExtendedBinnedNLL
.
[5]:
def total(xe, x0_n, x0_mu, x0_sigma, x1_n, x1_mu):
return signal(xe, x0_n, x0_mu, x0_sigma) + background(xe, x1_n, x1_mu)
c = ExtendedBinnedNLL(n, xe, total)
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1_n=100, x1_mu=1)
m.limits["x0_n", "x1_n", "x0_sigma", "x1_mu"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[5]:
Migrad | |
---|---|
FCN = 92.87 (χ²/ndof = 1.0) | Nfcn = 190 |
EDM = 1.67e-05 (Goal: 0.0002) | |
Valid Minimum | Below EDM threshold (goal x 10) |
No parameters at limit | Below call limit |
Hesse ok | Covariance accurate |
Name | Value | Hesse Error | Minos Error- | Minos Error+ | Limit- | Limit+ | Fixed | |
---|---|---|---|---|---|---|---|---|
0 | x0_n | 990 | 40 | 0 | ||||
1 | x0_mu | 0.4964 | 0.0019 | 0 | 1 | |||
2 | x0_sigma | 0.0487 | 0.0016 | 0 | ||||
3 | x1_n | 629 | 29 | 0 | ||||
4 | x1_mu | 0.97 | 0.14 | 0 |
x0_n | x0_mu | x0_sigma | x1_n | x1_mu | |
---|---|---|---|---|---|
x0_n | 1.3e+03 | 424.3e-6 (0.006) | 12.2550e-3 (0.209) | -0.3e3 (-0.252) | -0.369 (-0.076) |
x0_mu | 424.3e-6 (0.006) | 3.44e-06 | -0.1e-6 (-0.032) | 292.7e-6 (0.005) | -16.4e-6 (-0.065) |
x0_sigma | 12.2550e-3 (0.209) | -0.1e-6 (-0.032) | 2.64e-06 | -12.8693e-3 (-0.271) | -16.4e-6 (-0.075) |
x1_n | -0.3e3 (-0.252) | 292.7e-6 (0.005) | -12.8693e-3 (-0.271) | 851 | 0.337 (0.085) |
x1_mu | -0.369 (-0.076) | -16.4e-6 (-0.065) | -16.4e-6 (-0.075) | 0.337 (0.085) | 0.0183 |