Reference

bootstrap

Bootstrap resampling tools.

Compute estimator bias, variance, confidence intervals with bootstrap resampling.

Several forms of bootstrapping on N-dimensional data are supported: ordinary, balanced, extended, parametric, and stratified sampling; see resample() for details. Parametric bootstrapping fits a user-specified distribution to the data and samples from the fitted distribution. The distributions are taken from scipy.stats.

Confidence intervals can be computed with the ordinary percentile method and with the more efficient BCa method; see confidence_interval() for details.

resample.bootstrap.bootstrap(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike, **kwargs: Any) → ndarray

Calculate function values from bootstrap samples.

This is equivalent to numpy.array([fn(b) for b in resample(sample)]) and is implemented for convenience.

Parameters:
  • fn (Callable) – Bootstrap samples are passed to this function.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • **kwargs – Keywords are forwarded to resample().

Returns:

Results of fn applied to each bootstrap sample.

Return type:

ndarray

Examples

>>> from resample.bootstrap import bootstrap
>>> import numpy as np
>>> x = np.arange(10)
>>> fx = np.mean(x)
>>> fb = bootstrap(np.mean, x, size=10000, random_state=1)
>>> print(f"f(x) = {fx:.1f} +/- {np.std(fb):.1f}")
f(x) = 4.5 +/- 0.9
resample.bootstrap.confidence_interval(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike, cl: float = 0.95, ci_method: str = 'bca', **kwargs: Any) → Tuple[float, float]

Calculate bootstrap confidence intervals.

Parameters:
  • fn (callable) – Function to be bootstrapped.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • cl (float, default : 0.95) – Confidence level. Asymptotically, this is the probability that the interval contains the true value.

  • ci_method (str, {'bca', 'percentile'}, optional) – Confidence interval method. Default is ‘bca’. See notes for details.

  • **kwargs – Keyword arguments forwarded to resample().

Returns:

Lower and upper confidence limits.

Return type:

(float, float)

Examples

Compute confidence interval for arithmetic mean.

>>> from resample.bootstrap import confidence_interval
>>> import numpy as np
>>> x = np.arange(10)
>>> a, b = confidence_interval(np.mean, x, size=10000, random_state=1)
>>> f"{a:.1f} to {b:.1f}"
'2.6 to 6.2'

Notes

Both the ‘percentile’ and ‘bca’ methods produce intervals that are invariant to monotonic transformations of the data values, which is a desirable property.

The ‘percentile’ method is straightforward and useful as a fallback. The ‘bca’ method is second-order accurate (to O(1/n), where n is the sample size) and generally preferred. It computes a jackknife estimate in addition to the bootstrap, which increases the number of function evaluations compared to ‘percentile’. However, the increase in accuracy should compensate for this, so that fewer bootstrap replicas are needed overall to achieve the same accuracy.
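
For comparison, a minimal sketch (not from the original examples) of requesting the ‘percentile’ method explicitly; the resulting interval will differ slightly from the ‘bca’ interval shown above:

>>> from resample.bootstrap import confidence_interval
>>> import numpy as np
>>> x = np.arange(10)
>>> a, b = confidence_interval(np.mean, x, ci_method="percentile",
...                            size=10000, random_state=1)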

resample.bootstrap.covariance(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike, **kwargs: Any) → ndarray

Calculate bootstrap estimate of covariance.

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • **kwargs – Keyword arguments forwarded to resample().

Returns:

Bootstrap estimate of covariance. In general, this is a matrix, but if the function maps to a scalar, the result is a scalar as well.

Return type:

ndarray

Examples

Compute covariance of sample mean and sample variance.

>>> from resample.bootstrap import covariance
>>> import numpy as np
>>> x = np.arange(10)
>>> def fn(x):
...     return np.mean(x), np.var(x)
>>> np.round(covariance(fn, x, size=10000, random_state=1), 1)
array([[0.8, 0. ],
       [0. , 5.5]])
resample.bootstrap.resample(sample: ArrayLike, *args: ArrayLike, size: int = 100, method: str = 'balanced', strata: ArrayLike | None = None, random_state: int | Generator | None = None) → Generator[ndarray, None, None]

Return generator of bootstrap samples.

Parameters:
  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • size (int, optional) – Number of bootstrap samples to generate. Default is 100.

  • method (str, optional) – How to generate bootstrap samples. Supported are ‘ordinary’, ‘balanced’, ‘extended’, or a distribution name for a parametric bootstrap. Default is ‘balanced’. Supported distribution names: ‘normal’ (also: ‘gaussian’, ‘norm’), ‘student’ (also: ‘t’), ‘laplace’, ‘logistic’, ‘F’ (also: ‘f’), ‘beta’, ‘gamma’, ‘log-normal’ (also: ‘lognorm’, ‘log-gaussian’), ‘inverse-gaussian’ (also: ‘invgauss’), ‘pareto’, ‘poisson’.

  • strata (array-like, optional) – Stratification labels. Must have the same shape as sample. Default is None.

  • random_state (numpy.random.Generator or int, optional) – Random number generator instance. If an integer is passed, seed the numpy default generator with it. Default is to use numpy.random.default_rng().

Yields:

ndarray – Bootstrap sample.

Examples

Compute uncertainty of arithmetic mean.

>>> from resample.bootstrap import resample
>>> import numpy as np
>>> x = np.arange(10)
>>> fx = np.mean(x)
>>> fb = []
>>> for b in resample(x, size=10000, random_state=1):
...     fb.append(np.mean(b))
>>> print(f"f(x) = {fx:.1f} +/- {np.std(fb):.1f}")
f(x) = 4.5 +/- 0.9

Compute uncertainty of function applied to multivariate data.

>>> from resample.bootstrap import resample
>>> import numpy as np
>>> x = np.arange(10)
>>> y = np.arange(10, 20)
>>> fx = np.mean((x, y))
>>> fb = []
>>> for bx, by in resample(x, y, size=10000, random_state=1):
...     fb.append(np.mean((bx, by)))
>>> print(f"f(x, y) = {fx:.1f} +/- {np.std(fb):.1f}")
f(x, y) = 9.5 +/- 0.9

Notes

Balanced vs. ordinary bootstrap:

The balanced bootstrap produces more accurate results for the same number of bootstrap samples than the ordinary bootstrap, but needs to allocate memory for B integers, where B is the number of bootstrap samples. Since values of B larger than 10000 are rarely needed, this is usually not an issue.

Non-parametric vs. parametric bootstrap:

If you know that the data follow a particular parametric distribution, it is better to sample from this parametric distribution, but in most cases it is sufficient and more convenient to do a non-parametric bootstrap (using “balanced”, “ordinary”, “extended”). The parametric bootstrap is essential for estimators sensitive to the tails of a distribution (for example, a quantile close to 0 or 1). In this case, only a parametric bootstrap will give reasonable answers, since the non-parametric bootstrap cannot include rare events in the tail if the original sample did not have them.
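
For illustration, a sketch of a parametric bootstrap of a high quantile, assuming the data are approximately normal (‘normal’ is one of the distribution names listed above; the toy data are hypothetical):

>>> from resample.bootstrap import resample
>>> import numpy as np
>>> rng = np.random.default_rng(1)
>>> x = rng.normal(size=100)
>>> fb = [np.quantile(b, 0.99)
...       for b in resample(x, size=1000, method="normal", random_state=1)]
>>> err = np.std(fb)  # bootstrap uncertainty of the 99 % quantile estimate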

Extended bootstrap:

In particle physics, and perhaps in other fields, estimators are used which are a function of both the size and shape of a sample (for example, a fit of a peak over a smooth background to the mass distribution of decay candidates). In this case, the normal bootstrap (parametric or non-parametric) is not correct, since the sample size is kept constant. For such cases, one needs the “extended” bootstrap. The name alludes to the so-called extended maximum-likelihood (EML) method in particle physics. Estimates obtained with the EML method need to be bootstrapped with the “extended” bootstrap.
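
A small sketch of the extended bootstrap with toy data (no output shown); the size of each replicate fluctuates around the original sample size instead of being fixed:

>>> from resample.bootstrap import resample
>>> import numpy as np
>>> x = np.arange(10)
>>> sizes = [len(b) for b in resample(x, size=5, method="extended", random_state=1)]
>>> # the entries of sizes scatter around len(x) == 10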

Stratification:

If the sample consists of several distinct classes, stratification ensures that the relative proportions of each class are maintained in each replicated sample. This is a stricter constraint than that offered by the balanced bootstrap, which only guarantees that the classes have the original proportions over all replicates, but not within each replicate.
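
For illustration, a stratified resampling sketch with hypothetical class labels; each replicate keeps the original proportion of the two classes:

>>> from resample.bootstrap import resample
>>> import numpy as np
>>> x = np.array([1.0, 1.5, 2.0, 10.0, 11.0, 12.0])
>>> labels = np.array([0, 0, 0, 1, 1, 1])  # hypothetical stratum labels, same shape as x
>>> for b in resample(x, size=3, strata=labels, random_state=1):
...     pass  # each b contains exactly three values from each stratum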

resample.bootstrap.variance(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike, **kwargs: Any) → ndarray

Calculate bootstrap estimate of variance.

If the function returns a vector, the variance is computed elementwise.

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • **kwargs – Keyword arguments forwarded to resample().

Returns:

Bootstrap estimate of variance.

Return type:

ndarray

Examples

Compute variance of arithmetic mean.

>>> from resample.bootstrap import variance
>>> import numpy as np
>>> x = np.arange(10)
>>> v = variance(np.mean, x, size=10000, random_state=1)
>>> f"{v:.1f}"
'0.8'

jackknife

Jackknife resampling tools.

Compute estimator bias and variance with jackknife resampling. The implementation supports resampling of N-dimensional data. The interface of this module mimics that of the bootstrap module, so that you can easily switch between bootstrapping and jackknifing bias and variance of an estimator.

The jackknife is an approximation to the bootstrap, so in general bootstrapping is preferred, especially when the sample is small. The computational cost of the jackknife increases quadratically with the sample size, but only linearly for the bootstrap. An advantage of the jackknife is its deterministic outcome, since no random sampling is involved, but the bootstrap can be made deterministic as well by fixing the seed.

The jackknife should be used to estimate the bias, since the bootstrap cannot (easily) estimate bias. The bootstrap should be preferred when computing the variance.

resample.jackknife.bias(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike) → ndarray

Calculate jackknife estimate of bias.

The bias estimate is accurate to O(1/n), where n is the number of samples. If the bias is exactly O(1/n), then the estimate is exact.

Wikipedia: https://en.wikipedia.org/wiki/Jackknife_resampling

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

Returns:

Jackknife estimate of bias (= expectation of estimator - true value).

Return type:

ndarray

Examples

Compute bias of numpy.var with and without bias-correction.

>>> from resample.jackknife import bias
>>> import numpy as np
>>> x = np.arange(10)
>>> b1 = bias(np.var, x)
>>> b2 = bias(lambda x: np.var(x, ddof=1), x)
>>> f"bias of naive sample variance {b1:.1f}, bias of corrected variance {b2:.1f}"
'bias of naive sample variance -0.9, bias of corrected variance 0.0'
resample.jackknife.bias_corrected(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike) → ndarray

Calculate bias-corrected estimate of the function with the jackknife.

Removes a bias of O(1/n), where n is the sample size, leaving bias of order O(1/n²). If the original function has a bias of exactly O(1/n), the corrected result is now unbiased.

Wikipedia: https://en.wikipedia.org/wiki/Jackknife_resampling

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

Returns:

Estimate with O(1/n) bias removed.

Return type:

ndarray

Examples

Compute bias-corrected estimate of numpy.var.

>>> from resample.jackknife import bias_corrected
>>> import numpy as np
>>> x = np.arange(10)
>>> v1 = np.var(x)
>>> v2 = bias_corrected(np.var, x)
>>> f"naive variance {v1:.1f}, bias-corrected variance {v2:.1f}"
'naive variance 8.2, bias-corrected variance 9.2'
resample.jackknife.cross_validation(predict: Callable[[...], float], x: ArrayLike, y: ArrayLike, *args: ArrayLike) → float

Calculate mean-squared error of model with leave-one-out-cross-validation.

Wikipedia: https://en.wikipedia.org/wiki/Cross-validation_(statistics)

Parameters:
  • predict (callable) – Function with the signature (x_in, y_in, x_out, *args). It takes x_in, y_in, which are arrays with the same length. x_out should be one element of the x array. *args are further optional arguments for the function. The function should return the prediction y(x_out).

  • x (array-like) – Explanatory variable. Must be an array of shape (N, …), where N is the number of samples.

  • y (array-like) – Observations. Must be an array of shape (N, …).

  • *args – Optional arguments which are passed unmodified to predict.

Returns:

Variance of the difference (y[i] - predict(…, x[i], *args)).

Return type:

float
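
Examples

A minimal sketch (not part of the original examples) of leave-one-out cross-validation of a straight-line fit; the toy data and the predict function are hypothetical, and the polynomial degree is forwarded via *args:

>>> from resample.jackknife import cross_validation
>>> import numpy as np
>>> rng = np.random.default_rng(1)
>>> x = np.linspace(0, 1, 20)
>>> y = 2 * x + rng.normal(0, 0.1, size=20)
>>> def predict(x_in, y_in, x_out, deg):
...     coef = np.polyfit(x_in, y_in, deg)
...     return np.polyval(coef, x_out)
>>> mse = cross_validation(predict, x, y, 1)  # 1 is the degree, passed on to predict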

resample.jackknife.jackknife(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike) → ndarray

Calculate jackknife estimates for a given sample and estimator.

The jackknife is a linear approximation to the bootstrap. In contrast to the bootstrap it is deterministic and does not use random numbers. The caveat is the computational cost of the jackknife, which is O(n²) for n observations, compared to O(n x k) for k bootstrap replicates. For large samples, the bootstrap is more efficient.

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

Returns:

Jackknife samples.

Return type:

ndarray

Examples

>>> from resample.jackknife import jackknife
>>> import numpy as np
>>> x = np.arange(10)
>>> fx = np.mean(x)
>>> fb = jackknife(np.mean, x)
>>> print(f"f(x) = {fx:.1f} +/- {np.std(fb):.1f}")
f(x) = 4.5 +/- 0.3
resample.jackknife.resample(sample: ArrayLike, *args: ArrayLike, copy: bool = True) → Generator[Any, None, None]

Generate jackknifed samples.

Parameters:
  • sample (array-like) – Sample. If the sequence is multi-dimensional, the first dimension must walk over i.i.d. observations.

  • *args (array-like) – Optional additional arrays of the same length to resample.

  • copy (bool, optional) – If True, return the replicated sample as a copy, otherwise return a view into the internal array buffer of the generator. Setting this to False avoids len(sample) copies, which is more efficient, but see notes for caveats.

Yields:

ndarray – Array with same shape and type as input, but with the size of the first dimension reduced by one. Replicates are missing one value of the original in ascending order, e.g. for a sample (1, 2, 3), one gets (2, 3), (1, 3), (1, 2).

See also

resample.bootstrap.resample

Generate bootstrap samples.

resample.jackknife.jackknife

Generate jackknife estimates.

Notes

On performance:

The generator internally keeps a single array for the replicates, which is updated on each iteration. The safe default is to return copies of this internal state. To increase performance, it is also possible to return a view into the generator state by setting copy=False. However, this will only produce correct results if the generator is called strictly sequentially in a single-threaded program and the loop body consumes the view and does not try to store it. The following program shows what happens otherwise:

>>> from resample.jackknife import resample
>>> r1 = []
>>> for x in resample((1, 2, 3)): # works as expected
...     r1.append(x)
>>> print(r1)
[array([2, 3]), array([1, 3]), array([1, 2])]
>>>
>>> r2 = []
>>> for x in resample((1, 2, 3), copy=False):
...     r2.append(x) # x is now a view into the same array in memory
>>> print(r2)
[array([1, 2]), array([1, 2]), array([1, 2])]
resample.jackknife.variance(fn: Callable[[...], ndarray], sample: ArrayLike, *args: ArrayLike) → ndarray

Calculate jackknife estimate of variance.

Wikipedia: https://en.wikipedia.org/wiki/Jackknife_resampling

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Original sample.

  • *args (array-like) – Optional additional arrays of the same length to resample.

Returns:

Jackknife estimate of variance.

Return type:

ndarray

Examples

Compute variance of arithmetic mean.

>>> from resample.jackknife import variance
>>> import numpy as np
>>> x = np.arange(10)
>>> v = variance(np.mean, x)
>>> f"{v:.1f}"
'0.9'

permutation

Permutation-based tests.

A collection of statistical tests that use permuted samples. Permutations are used to compute the distribution of a test statistic under some null hypothesis to obtain p-values without relying on approximate asymptotic formulas.

The permutation method is generic: it can be used with any test statistic. We therefore also provide a generic test function that accepts a user-defined function to compute the test statistic and then automatically computes the p-value for that statistic. The other tests internally call this generic test function as well.

All tests return a TestResult object, which mimics the interface of the result objects returned by tests in scipy.stats, but has a third field to return the estimated distribution of the test statistic under the null hypothesis.

class resample.permutation.TestResult(statistic: float, pvalue: float, samples: ndarray[Any, dtype[_ScalarType_co]])

Holder of the result of the permutation test.

This class acts like a tuple, which means it can be unpacked and the fields can be accessed by name or by index.

statistic

Value of the test statistic computed on the original data.

Type:

float

pvalue

Estimated chance probability (aka Type I error) for rejecting the null hypothesis. See https://en.wikipedia.org/wiki/P-value for details.

Type:

float

samples

Values of the test statistic from the permuted samples.

Type:

array
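
Examples

An illustrative sketch with hypothetical toy data; the result of any test in this module unpacks like a tuple:

>>> from resample.permutation import ttest
>>> import numpy as np
>>> rng = np.random.default_rng(1)
>>> x = rng.normal(0, 1, size=100)
>>> y = rng.normal(0.5, 1, size=100)
>>> t, p, dist = ttest(x, y, random_state=1)  # statistic, pvalue, samples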

resample.permutation.anova(x: ArrayLike, y: ArrayLike, *args: ArrayLike, **kwargs: Any) → TestResult

Test whether the means of two or more samples are compatible.

This test uses one-way analysis of variance (one-way ANOVA), which tests whether the samples have the same mean. It is typically used when there are three or more groups. For two groups, Welch’s t-test is preferred, because ANOVA assumes equal variances for the samples.

Parameters:
  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • *args (array-like) – Further samples.

  • **kwargs – Keyword arguments are forwarded to same_population().

Return type:

TestResult
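
Examples

A small sketch with hypothetical toy samples (no output shown):

>>> from resample.permutation import anova
>>> import numpy as np
>>> rng = np.random.default_rng(1)
>>> x = rng.normal(0, 1, size=30)
>>> y = rng.normal(0, 1, size=30)
>>> z = rng.normal(1, 1, size=30)
>>> result = anova(x, y, z, random_state=1)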

Notes

https://en.wikipedia.org/wiki/One-way_analysis_of_variance https://en.wikipedia.org/wiki/F-test

resample.permutation.kruskal(x: ArrayLike, y: ArrayLike, *args: ArrayLike, **kwargs: Any) → TestResult

Test whether two or more samples have the same mean rank.

This performs a permutation-based Kruskal-Wallis test. In a sense, it extends the Mann-Whitney U test, which also uses ranks, to more than two groups. It does so by comparing the means of the rank distributions.

Parameters:
  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • *args (array-like) – Further samples.

  • **kwargs – Keyword arguments are forwarded to same_population().

Return type:

TestResult

Notes

https://en.wikipedia.org/wiki/Kruskal%E2%80%93Wallis_one-way_analysis_of_variance

resample.permutation.pearsonr(x: ArrayLike, y: ArrayLike, **kwargs: Any) → TestResult

Test whether two samples are drawn from same population using correlation.

The test statistic is the Pearson correlation coefficient. The test is very sensitive to a linear relationship between x and y. If the relationship is strongly non-linear but monotonic, spearmanr() may be more sensitive.

https://en.wikipedia.org/wiki/Pearson_correlation_coefficient

Parameters:
  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • **kwargs – Keyword arguments are forwarded to same_population().

Return type:

TestResult

resample.permutation.same_population(fn: Callable[[...], float], x: ArrayLike, y: ArrayLike, *args: ArrayLike, transform: Callable[[ndarray[Any, dtype[_ScalarType_co]]], ndarray[Any, dtype[_ScalarType_co]]] | None = None, size: int = 9999, random_state: int | Generator | None = None) → TestResult

Compute p-value for hypothesis that samples originate from same population.

The computation is based on a user-defined test statistic. The distribution of the test statistic under the null hypothesis is obtained by forming random permutations of the inputs, which simulates drawing them from the same population. The test statistic is recomputed on these permutations, and the p-value is the fraction of the resampled test statistics that are larger than the original value.

Some test statistics need to be transformed to fulfill the condition above, for example if they are signed. A transform can be passed to this function for those cases.

Parameters:
  • fn (Callable) – Function with signature f(x, …), where the number of arguments corresponds to the number of data samples passed to the test.

  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • *args (array-like) – Further samples, if the test allows comparing more than two.

  • transform (Callable, optional) – Function with signature f(x) applied to the test statistic to turn it into a measure of deviation. Must be vectorised.

  • size (int, optional) – Number of permutations. Default 9999.

  • random_state (numpy.random.Generator or int, optional) – Random number generator instance. If an integer is passed, seed the numpy default generator with it. Default is to use numpy.random.default_rng().

Return type:

TestResult
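
Examples

An illustrative sketch with a user-defined signed statistic (the toy data and the mean_difference helper are hypothetical); transform=np.abs turns it into a measure of deviation:

>>> from resample.permutation import same_population
>>> import numpy as np
>>> def mean_difference(x, y):
...     return np.mean(x) - np.mean(y)  # signed statistic
>>> rng = np.random.default_rng(1)
>>> x = rng.normal(0, 1, size=50)
>>> y = rng.normal(0, 1, size=50)
>>> result = same_population(mean_difference, x, y, transform=np.abs, random_state=1)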

resample.permutation.spearmanr(x: ArrayLike, y: ArrayLike, **kwargs: Any) → TestResult

Test whether two samples are drawn from same population using rank correlation.

The test statistic is Spearman’s rank correlation coefficient. The test is very sensitive to monotonic relationships between x and y, even if the relationship is very non-linear.

Parameters:
  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • **kwargs – Keyword arguments are forwarded to same_population().

Return type:

TestResult

resample.permutation.ttest(x: ArrayLike, y: ArrayLike, **kwargs: Any) → TestResult

Test whether the means of two samples are compatible with Welch’s t-test.

See https://en.wikipedia.org/wiki/Welch%27s_t-test for details on this test. The p-value computed is for the null hypothesis that the two population means are equal. The test is two-sided, which means that swapping x and y gives the same p-value. Welch’s t-test does not require the sample sizes to be equal and it does not require the samples to have the same variance.

Parameters:
  • x (array-like) – First sample.

  • y (array-like) – Second sample.

  • **kwargs – Keyword arguments are forwarded to same_population().

Return type:

TestResult

resample.permutation.usp(w: ArrayLike, *, size: int = 9999, method: str = 'auto', random_state: int | Generator | None = None) → TestResult

Test independence of two discrete data sets with the U-statistic.

The USP test is described in this paper: https://doi.org/10.1098/rspa.2021.0549. According to the paper, it outperforms Pearson’s χ² test and the G-test in both stability and power.

It requires that the input is a contingency table (a 2D histogram of value pairs). Whether the original values were discrete or continuous does not matter for the test. In the case of continuous values, using a large number of bins is safe, since the test is not negatively affected by bins with zero entries.

Parameters:
  • w (array-like) – Two-dimensional array which represents the counts in a histogram. The counts can be of floating point type, but must have integral values.

  • size (int, optional) – Number of permutations. Default 9999.

  • method (str, optional) – Method used to generate random tables under the null hypothesis. ‘auto’: use a heuristic to select the fastest algorithm for the given table. ‘boyett’: Boyett’s algorithm, which requires extra space to store N + 1 integers for N entries in total and has O(N) time complexity. It performs poorly when N is large, but does not depend on the number K of table cells. ‘patefield’: Patefield’s algorithm, which does not require extra space and has O(K log(N)) time complexity. It performs well even if N is huge. For small N and large K, Boyett’s shuffling algorithm is faster. Default is ‘auto’.

  • random_state (numpy.random.Generator or int, optional) – Random number generator instance. If an integer is passed, seed the numpy default generator with it. Default is to use numpy.random.default_rng().

Return type:

TestResult
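
Examples

A minimal sketch with a hypothetical 2 x 2 contingency table of counts:

>>> from resample.permutation import usp
>>> import numpy as np
>>> w = np.array([[10, 5], [4, 12]])  # hypothetical table of counts
>>> result = usp(w, size=999, random_state=1)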

empirical

Empirical functions.

Empirical functions based on a data sample instead of a parametric density function, like the empirical CDF. Implemented here are mostly tools used internally.

resample.empirical.cdf_gen(sample: ArrayLike) → Callable[[ndarray], ndarray]

Return the empirical distribution function for the given sample.

Parameters:

sample (array-like) – Sample.

Returns:

Empirical distribution function.

Return type:

callable
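
Examples

A short sketch with toy data of evaluating the empirical distribution function on a few points:

>>> from resample.empirical import cdf_gen
>>> import numpy as np
>>> x = np.arange(10)
>>> F = cdf_gen(x)
>>> p = F(np.array([0.5, 4.5, 9.5]))  # fraction of sample values up to each point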

resample.empirical.influence(fn: Callable[[ArrayLike], ndarray], sample: ArrayLike) → ndarray

Calculate the empirical influence function for a given sample and estimator.

Parameters:
  • fn (callable) – Estimator. Can be any mapping ℝⁿ → ℝᵏ, where n is the sample size and k is the length of the output array.

  • sample (array-like) – Sample. Must be one-dimensional.

Returns:

Empirical influence values.

Return type:

ndarray
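
Examples

A minimal sketch with toy data; one influence value is returned per observation:

>>> from resample.empirical import influence
>>> import numpy as np
>>> x = np.arange(10)
>>> infl = influence(np.mean, x)  # array with one entry per element of x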

resample.empirical.quantile_function_gen(sample: ArrayLike) → Callable[[float | ArrayLike], float | ndarray]

Return the empirical quantile function for the given sample.

Parameters:

sample (array-like) – Sample.

Returns:

Empirical quantile function.

Return type:

callable
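
Examples

A short sketch with toy data of evaluating the empirical quantile function at the median:

>>> from resample.empirical import quantile_function_gen
>>> import numpy as np
>>> x = np.arange(10)
>>> q = quantile_function_gen(x)
>>> median = q(0.5)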