Fitting a mixture of templates and parametric models

The class iminuit.cost.Template supports fitting a mixture of templates and parametric models. This is useful if some components have a simple shape like a Gaussian peak, while other components are complicated and need to be estimated from simulation or a control measurement.

In this notebook, we demonstrate this usage. Our data consists of a Gaussian peak and exponential background. We fit the Gaussian peak with a parametric model and use a template to describe the exponential background.

[1]:
%config InlineBackend.figure_formats = ['svg']
from iminuit import Minuit
from iminuit.cost import Template, ExtendedBinnedNLL
from numba_stats import norm, truncexpon
import numpy as np
from matplotlib import pyplot as plt

We generate the toy data.

[2]:
rng = np.random.default_rng(1)

s = rng.normal(0.5, 0.05, size=1000)
b = rng.exponential(1, size=1000)
b = b[b < 1]

ns, xe = np.histogram(s, bins=100, range=(0, 1))
nb, _ = np.histogram(b, bins=xe)
n = ns + nb

plt.stairs(nb, xe, color="C1", fill=True, label="background")
plt.stairs(n, xe, baseline=nb, color="C0", fill=True, label="signal")
plt.legend();
../_images/notebooks_template_model_mix_3_0.svg

Fitting a parametric component and a template

We now model the peaking component parametrically with a Gaussian. A template fit is an extended binned fit, so we need to provide a scaled cumulative density function like for iminuit.cost.ExtendedBinnedNLL. To obtain a background template, we generate more samples from the exponential distribution and make a histogram.

[3]:
# signal model: scaled cdf of a normal distribution
def signal(xe, n, mu, sigma):
    return n * norm.cdf(xe, mu, sigma)


# background template: histogram of MC simulation
rng = np.random.default_rng(2)
b2 = rng.exponential(1, size=1000)
b2 = b2[b2 < 1]
template = np.histogram(b2, bins=xe)[0]

# fit
c = Template(n, xe, (signal, template))
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1=100)
m.limits["x0_n", "x1", "x0_sigma"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[3]:
Migrad
FCN = 87.01 (χ²/ndof = 0.9) Nfcn = 148
EDM = 3.62e-05 (Goal: 0.0002)
Valid Minimum Below EDM threshold (goal x 10)
No parameters at limit Below call limit
Hesse ok Covariance accurate
Name Value Hesse Error Minos Error- Minos Error+ Limit- Limit+ Fixed
0 x0_n 990 40 0
1 x0_mu 0.4951 0.0020 0 1
2 x0_sigma 0.0484 0.0018 0
3 x1 630 40 0
x0_n x0_mu x0_sigma x1
x0_n 1.43e+03 323e-6 (0.004) 18.5304e-3 (0.274) -0.4e3 (-0.284)
x0_mu 323e-6 (0.004) 3.87e-06 -0.1e-6 (-0.015) -346e-6 (-0.004)
x0_sigma 18.5304e-3 (0.274) -0.1e-6 (-0.015) 3.21e-06 -18.4785e-3 (-0.251)
x1 -0.4e3 (-0.284) -346e-6 (-0.004) -18.4785e-3 (-0.251) 1.69e+03
2024-10-12T13:50:46.774589 image/svg+xml Matplotlib v3.9.2, https://matplotlib.org/

The fit succeeded and the statistical uncertainty in the template is propagated into the result. You can play with this demo and see what happens if you increase the statistic of the template.

Note: the parameters of a parametric components are prefixed with xn_ where n is the index of the component. This is to avoid name clashes between the parameter names of individual components and for clarity which parameter belongs to which component.

Extreme case: Fitting two parametric components

Although this is not recommended, you can also use the Template class with two parametric components and no template. If you are in that situation, however, it is simpler and more efficient to use iminuit.cost.ExtendedBinnedNLL. The following snipped is therefore just a proof that iminuit.cost.Template handles this limiting case as well.

[4]:
# signal model: scaled cdf of a normal distribution
def signal(xe, n, mu, sigma):
    return n * norm.cdf(xe, mu, sigma)


# background model: scaled cdf a an exponential distribution
def background(xe, n, mu):
    return n * truncexpon.cdf(xe, xe[0], xe[-1], 0, mu)


# fit
c = Template(n, xe, (signal, background))
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1_n=100, x1_mu=1)
m.limits["x0_n", "x1_n", "x0_sigma", "x1_mu"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[4]:
Migrad
FCN = 92.87 (χ²/ndof = 1.0) Nfcn = 190
EDM = 1.67e-05 (Goal: 0.0002)
Valid Minimum Below EDM threshold (goal x 10)
No parameters at limit Below call limit
Hesse ok Covariance accurate
Name Value Hesse Error Minos Error- Minos Error+ Limit- Limit+ Fixed
0 x0_n 990 40 0
1 x0_mu 0.4964 0.0019 0 1
2 x0_sigma 0.0487 0.0016 0
3 x1_n 629 29 0
4 x1_mu 0.97 0.14 0
x0_n x0_mu x0_sigma x1_n x1_mu
x0_n 1.3e+03 424.3e-6 (0.006) 12.2549e-3 (0.209) -0.3e3 (-0.252) -0.369 (-0.076)
x0_mu 424.3e-6 (0.006) 3.44e-06 -0.1e-6 (-0.032) 292.7e-6 (0.005) -16.4e-6 (-0.065)
x0_sigma 12.2549e-3 (0.209) -0.1e-6 (-0.032) 2.64e-06 -12.8693e-3 (-0.271) -16.4e-6 (-0.075)
x1_n -0.3e3 (-0.252) 292.7e-6 (0.005) -12.8693e-3 (-0.271) 851 0.337 (0.085)
x1_mu -0.369 (-0.076) -16.4e-6 (-0.065) -16.4e-6 (-0.075) 0.337 (0.085) 0.0183
2024-10-12T13:50:46.928390 image/svg+xml Matplotlib v3.9.2, https://matplotlib.org/

The result is identical when we fit with iminuit.cost.ExtendedBinnedNLL.

[5]:
def total(xe, x0_n, x0_mu, x0_sigma, x1_n, x1_mu):
    return signal(xe, x0_n, x0_mu, x0_sigma) + background(xe, x1_n, x1_mu)


c = ExtendedBinnedNLL(n, xe, total)
m = Minuit(c, x0_n=500, x0_mu=0.5, x0_sigma=0.05, x1_n=100, x1_mu=1)
m.limits["x0_n", "x1_n", "x0_sigma", "x1_mu"] = (0, None)
m.limits["x0_mu"] = (0, 1)
m.migrad()
[5]:
Migrad
FCN = 92.87 (χ²/ndof = 1.0) Nfcn = 190
EDM = 1.67e-05 (Goal: 0.0002)
Valid Minimum Below EDM threshold (goal x 10)
No parameters at limit Below call limit
Hesse ok Covariance accurate
Name Value Hesse Error Minos Error- Minos Error+ Limit- Limit+ Fixed
0 x0_n 990 40 0
1 x0_mu 0.4964 0.0019 0 1
2 x0_sigma 0.0487 0.0016 0
3 x1_n 629 29 0
4 x1_mu 0.97 0.14 0
x0_n x0_mu x0_sigma x1_n x1_mu
x0_n 1.3e+03 424.3e-6 (0.006) 12.2550e-3 (0.209) -0.3e3 (-0.252) -0.369 (-0.076)
x0_mu 424.3e-6 (0.006) 3.44e-06 -0.1e-6 (-0.032) 292.7e-6 (0.005) -16.4e-6 (-0.065)
x0_sigma 12.2550e-3 (0.209) -0.1e-6 (-0.032) 2.64e-06 -12.8693e-3 (-0.271) -16.4e-6 (-0.075)
x1_n -0.3e3 (-0.252) 292.7e-6 (0.005) -12.8693e-3 (-0.271) 851 0.337 (0.085)
x1_mu -0.369 (-0.076) -16.4e-6 (-0.065) -16.4e-6 (-0.075) 0.337 (0.085) 0.0183
2024-10-12T13:50:47.060721 image/svg+xml Matplotlib v3.9.2, https://matplotlib.org/