Example 03 – Detector uncertainties

Aims

  • Generate random detector response variations
  • Include detector uncertainties in the response matrix by using a ResponseMatrixArrayBuilder
  • Include statistical uncertainties by generating randomly varied matrices
  • Use an array of response matrices in a LikelihoodMachine

Instructions

The properties of experimental setups are usually only known to a finite precision. The remaining uncertainty on these properties, and thus the uncertainty on the detector response to true events, must be reflected in the response matrix.

We can create an event-by-event variation of the detector response with the provided script:

$ ../simple_experiment/vary_detector.py ../00/modelA_data.txt modelA_data.txt
$ ../simple_experiment/vary_detector.py ../00/modelB_data.txt modelB_data.txt

It creates 100 randomly generated variations of the simulated detector and varies the simulation that was created in example 00 accordingly. Each variation describes one possible real detector and is often called a “toy variation” or a “universe”.
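
If you are curious what exactly the script added to the files, a quick peek at the header line of one of the varied files shows the new columns (a minimal check, not required for the analysis; the column names are described below):

# Print the header of a varied file to see the added toy columns,
# e.g. reco_x_0 ... reco_x_99 and weight_0 ... weight_99.
with open("modelA_data.txt") as f:
    print(f.readline().strip())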

Events are varied in two ways: the values of reconstructed variables can be different for each toy, and each event can be assigned a weight. The former can be used to cover uncertainties in the detector precision and accuracy, while the latter is a simple way to deal with uncertainties in the overall probability of reconstructing an event.

In this case, the values of the reconstructed variable reco_x are varied and saved as reco_x_0 to reco_x_99. The detection efficiency is also varied, and the ratio of each event's nominal efficiency and its efficiency assuming the toy detector is stored as weight_0 up to weight_99. Let us plot the different reconstructed distributions we get from the toy variations:

from remu import binning
import numpy as np

with open("../01/reco-binning.yml", 'r') as f:
    reco_binning = binning.yaml.load(f)

# Get real data
reco_binning.fill_from_csv_file("../00/real_data.txt")
data = reco_binning.get_values_as_ndarray()

# Get toys
n_toys = 100
toyA = []
toyB = []
for i in range(n_toys):
    reco_binning.reset()
    reco_binning.fill_from_csv_file("modelA_data.txt",
        weightfield='weight_%i'%(i,), rename={'reco_x_%i'%(i,): 'reco_x'},
        buffer_csv_files=True)
    toyA.append(reco_binning.get_values_as_ndarray())

    reco_binning.reset()
    reco_binning.fill_from_csv_file("modelB_data.txt",
        weightfield='weight_%i'%(i,), rename={'reco_x_%i'%(i,): 'reco_x'},
        buffer_csv_files=True)
    toyB.append(reco_binning.get_values_as_ndarray())

# Plot
figax = None
for A, B in zip(toyA, toyB):
    figax = reco_binning.plot_ndarray(None, A/10.,
        kwargs1d={'alpha':0.05, 'color':'b'}, figax=figax)
    figax = reco_binning.plot_ndarray(None, B/10.,
        kwargs1d={'alpha':0.05, 'color':'r'}, figax=figax)

A = np.mean(toyA, axis=0)
B = np.mean(toyB, axis=0)
figax = reco_binning.plot_ndarray(None, A/10.,
    kwargs1d={'color':'b', 'label':"Model A"}, figax=figax)
figax = reco_binning.plot_ndarray(None, B/10.,
    kwargs1d={'color':'r', 'label':"Model B"}, figax=figax)
figax = reco_binning.plot_ndarray("data.png", data,
    kwargs1d={'color':'k', 'label':"Data"}, sqrt_errors=True, figax=figax)
[Image: data.png]

Here we plot the data with sqrt(N) “error bars” just to give an indication of the expected statistical fluctuations. We also scaled the model predictions down by a factor of 10, simply because that makes them correspond roughly to the data. In this example, the systematic uncertainties seem to be of a similar size to the expected statistical variations.
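
If one prefers not to hard-code the factor of 10, the normalisation could also be derived from the event totals. This is a hypothetical refinement and not part of the original script:

# Scale the model predictions to the total number of data events instead
# of dividing by a hard-coded factor of 10 (optional refinement).
scale = np.sum(data) / np.mean([np.sum(t) for t in toyA])
# ... and plot A * scale and B * scale instead of A / 10. and B / 10.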

We are using a few specific arguments to the fill_from_csv_file method here:

weightfield
Tells the object which column to use as the event weights. The column name does not need to follow any naming convention; here we tell it to use the weight field corresponding to each toy.
rename
The binning of the response matrix expects a variable called reco_x, which is not present in the varied data sets. Here we specify a dictionary that maps column names in the data to the names they should be renamed to before the binning is filled.
buffer_csv_files
Reading a CSV file from disk and turning it into an array of floating point values takes quite a bit of time. We can speed up the process by buffering the intermediate array on disk, so the time-consuming parsing of a text file only has to happen once per CSV file.
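
For reference, a single call combining all three arguments looks like this (it simply repeats the pattern used in the plotting script above, for toy number 0):

# Fill the binning with toy 0: use the column "weight_0" as the event
# weight and rename the column "reco_x_0" to "reco_x", the variable name
# the binning expects. buffer_csv_files=True caches the parsed array on
# disk so later fills of the same file are faster.
reco_binning.fill_from_csv_file("modelA_data.txt",
    weightfield='weight_0', rename={'reco_x_0': 'reco_x'},
    buffer_csv_files=True)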

ReMU handles the detector uncertainties by building a response matrix for each toy detector separately. These matrices are then used in parallel to calculate likelihoods. In the end the marginal (i.e. average) likelihood is used as the final answer. Some advanced features of ReMU (like sparse matrices, or nuisance bins) require quite a bit of book-keeping to make the toy matrices behave consistently and as expected. This is handled by the ResponseMatrixArrayBuilder:

import matplotlib.pyplot as plt
from scipy import stats
from remu import migration
from copy import deepcopy

builder = migration.ResponseMatrixArrayBuilder(1)

Its only argument is the number of randomly drawn statistical variations per toy response matrix. The number of simulated events that are used to build the response matrix influences the statistical uncertainty of the response matrix elements. This can be seen as an additional “systematic” uncertainty of the detector response. By generating randomly fluctuated response matrices from the toy matrices, it can be handled organically with the other systematics. In this case we generate 1 randomly varied matrix per toy matrix, yielding a total of 100 matrices in the end.
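
To make the “marginal likelihood” idea above a bit more concrete, here is a small stand-alone sketch of how a likelihood could be averaged over an array of toy response matrices using plain NumPy and SciPy. This only illustrates the concept and is not ReMU's actual implementation:

import numpy as np
from scipy import stats
from scipy.special import logsumexp

def marginal_log_likelihood(data, matrices, truth):
    """Average the Poisson likelihood of the data over a set of matrices.

    data:     (X,) observed reco-bin counts
    matrices: (N, X, Y) array of toy response matrices
    truth:    (Y,) hypothesised truth-bin values
    """
    reco_expectation = matrices @ truth  # shape (N, X)
    log_l = stats.poisson.logpmf(data, reco_expectation).sum(axis=-1)  # (N,)
    # log of the mean of the N likelihoods, computed in a stable way
    return logsumexp(log_l) - np.log(len(log_l))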

The toy matrices themselves are built like the nominal matrix in the previous examples and then added to the builder:

with open("../01/reco-binning.yml", 'rt') as f:
    reco_binning = binning.yaml.load(f)
with open("../01/optimised-truth-binning.yml", 'rt') as f:
    truth_binning = binning.yaml.load(f)

resp = migration.ResponseMatrix(reco_binning, truth_binning)

n_toys = 100
for i in range(n_toys):
    resp.reset()
    resp.fill_from_csv_file(["modelA_data.txt", "modelB_data.txt"],
        weightfield='weight_%i'%(i,), rename={'reco_x_%i'%(i,): 'reco_x'},
        buffer_csv_files=True)
    resp.fill_up_truth_from_csv_file(["../00/modelA_truth.txt",
        "../00/modelB_truth.txt"], buffer_csv_files=True)
    builder.add_matrix(resp)

The builder converts the ResponseMatrix into the actual array of floating point numbers as soon as it is added. It retains no reference to the object itself, so the same ResponseMatrix can safely be reset and re-filled for the next toy.

Now we just need to save the set of response matrices for later use in the likelihood fits:

M = builder.get_random_response_matrices_as_ndarray()
np.save("response_matrix.npy", M)

entries = builder.get_truth_entries_as_ndarray()
np.save("generator_truth.npy", entries)

The LikelihoodMachine is created just like in the previous example:

from six import print_
from matplotlib import pyplot as plt
from remu import likelihood

with open("../01/reco-binning.yml", 'rt') as f:
    reco_binning = binning.yaml.load(f)
with open("../01/optimised-truth-binning.yml", 'rt') as f:
    truth_binning = binning.yaml.load(f)

reco_binning.fill_from_csv_file("../00/real_data.txt")
data = reco_binning.get_entries_as_ndarray()

# No systematics LikelihoodMachine
response_matrix = np.load("../01/response_matrix.npy")
generator_truth = np.load("../01/generator_truth.npy")
lm = likelihood.LikelihoodMachine(data, response_matrix,
    truth_limits=generator_truth, limit_method='prohibit')

# Systematics LikelihoodMachine
response_matrix_syst = np.load("response_matrix.npy")
generator_truth_syst = np.load("generator_truth.npy")
response_matrix_syst.shape = (np.prod(response_matrix_syst.shape[:-2]),) \
    + response_matrix_syst.shape[-2:]
lm_syst = likelihood.LikelihoodMachine(data, response_matrix_syst,
    truth_limits=generator_truth_syst, limit_method='prohibit')

To show the influence of the systematic uncertainties, we create two LikelihoodMachine objects here: one with the nominal detector response from example 01 (no systematic variations) and one with the set of systematically varied responses. The set of response matrices is saved as an A x B x X x Y shaped NumPy array, with the number of systematic toy variations A, the number of randomly drawn matrices per toy B, the number of reco bins X, and the number of truth bins Y. To make handling the matrices easier, we reshape them into an N x X x Y shaped array with N = A*B.
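
As a quick sanity check, the reshape can be tried on a dummy array with the numbers of this example, 100 toys with 1 random matrix each; the reco and truth bin counts below are just placeholders:

import numpy as np

# Dummy array with the (A, B, X, Y) layout described above.
dummy = np.zeros((100, 1, 4, 6))                # placeholder bin counts
flat = dummy.reshape((-1,) + dummy.shape[-2:])  # (A*B, X, Y)
print(flat.shape)                               # -> (100, 4, 6)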

Now we can test some models against the data, just like in the previous example:

truth_binning.fill_from_csv_file("../00/modelA_truth.txt")
modelA = truth_binning.get_values_as_ndarray()
modelA /= np.sum(modelA)
truth_binning.reset()
truth_binning.fill_from_csv_file("../00/modelB_truth.txt")
modelB = truth_binning.get_values_as_ndarray()
modelB /= np.sum(modelB)

modelA_shape = likelihood.TemplateHypothesis([modelA])
retA = lm.max_log_likelihood(modelA_shape)
print_(retA)
                          L: -28.188288113161803
                        fun: 28.188288113161803
 lowest_optimization_result:       fun: 28.188288113161803
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([ 0.])
  message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 22
      nit: 9
   status: 0
  success: True
        x: array([ 935.84655028])
                    message: ['requested number of basinhopping iterations completed successfully']
      minimization_failures: 3
                       nfev: 1758
                        nit: 100
                          x: array([ 935.84655028])
retA_syst = lm_syst.max_log_likelihood(modelA_shape)
print_(retA_syst)
                          L: -28.676388171194702
                        fun: 28.676388171194702
 lowest_optimization_result:       fun: 28.676388171194702
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([ 0.])
  message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 10
      nit: 4
   status: 0
  success: True
        x: array([ 941.15273763])
                    message: ['requested number of basinhopping iterations completed successfully']
      minimization_failures: 1
                       nfev: 1462
                        nit: 100
                          x: array([ 941.15273763])
modelB_shape = likelihood.TemplateHypothesis([modelB])
retB = lm.max_log_likelihood(modelB_shape)
print_(retB)
                          L: -24.840381628756134
                        fun: 24.840381628756134
 lowest_optimization_result:       fun: 24.840381628756134
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([  9.23705556e-06])
  message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 38
      nit: 11
   status: 0
  success: True
        x: array([ 1042.03782852])
                    message: ['requested number of basinhopping iterations completed successfully']
      minimization_failures: 3
                       nfev: 2156
                        nit: 100
                          x: array([ 1042.03782852])
retB_syst = lm_syst.max_log_likelihood(modelB_shape)
print_(retB_syst)
                          L: -25.701169780679152
                        fun: 25.701169780679152
 lowest_optimization_result:       fun: 25.701169780679152
 hess_inv: <1x1 LbfgsInvHessProduct with dtype=float64>
      jac: array([ -1.42108547e-06])
  message: 'CONVERGENCE: NORM_OF_PROJECTED_GRADIENT_<=_PGTOL'
     nfev: 14
      nit: 6
   status: 0
  success: True
        x: array([ 1039.7300549])
                    message: ['requested number of basinhopping iterations completed successfully']
      minimization_failures: 1
                       nfev: 1644
                        nit: 100
                          x: array([ 1039.7300549])

Let us take another look at how the fitted templates and the data compare in reco space. This time we are going to use ReMU's built-in ability to plot projections of binnings when given a whole set of bin contents rather than a single one, showing the central value and the spread of the set:

figax = reco_binning.plot_values(None,
    kwargs1d={'color': 'k', 'label': 'data'}, sqrt_errors=True)
modelA_reco = response_matrix.dot(modelA_shape.translate(retA.x))
modelB_reco = response_matrix.dot(modelB_shape.translate(retB.x))
modelA_reco_syst = response_matrix_syst.dot(
    modelA_shape.translate(retA_syst.x))
modelB_reco_syst = response_matrix_syst.dot(
    modelB_shape.translate(retB_syst.x))
reco_binning.plot_ndarray(None, modelA_reco,
    kwargs1d={'color': 'b', 'label': 'model A'},
    error_xoffset=-.1, figax=figax)
reco_binning.plot_ndarray(None, modelA_reco_syst,
    kwargs1d={'color': 'b', 'label': 'model A syst'},
    error_xoffset=-.1, figax=figax)
reco_binning.plot_ndarray(None, modelB_reco,
    kwargs1d={'color': 'r', 'label': 'model B'},
    error_xoffset=+.1, figax=figax)
reco_binning.plot_ndarray("reco-comparison.png", modelB_reco_syst,
    kwargs1d={'color': 'r', 'label': 'model B syst'},
    error_xoffset=+.1, figax=figax)
[Image: reco-comparison.png]

And of course we can compare the p-values of the two fitted models assuming the nominal detector response:

print_(lm.max_likelihood_p_value(modelA_shape, nproc=4))
print_(lm.max_likelihood_p_value(modelB_shape, nproc=4))
0.084
0.508

And compare them with the p-values we get when taking the systematic uncertainties into account:

print_(lm_syst.max_likelihood_p_value(modelA_shape, nproc=4))
print_(lm_syst.max_likelihood_p_value(modelB_shape, nproc=4))
0.104
0.516

We can construct confidence intervals on the template weights of the models using the maximum likelihood p-value:

p_values_A = []
p_values_B = []
p_values_A_syst = []
p_values_B_syst = []
values = np.linspace(600, 1600, 21)
for v in values:
    fixed_model_A = modelA_shape.fix_parameters((v,))
    fixed_model_B = modelB_shape.fix_parameters((v,))
    A = lm.max_likelihood_p_value(fixed_model_A, nproc=4)
    A_syst = lm_syst.max_likelihood_p_value(fixed_model_A, nproc=4)
    B = lm.max_likelihood_p_value(fixed_model_B, nproc=4)
    B_syst = lm_syst.max_likelihood_p_value(fixed_model_B, nproc=4)
    print_(v, A, A_syst, B, B_syst)
    p_values_A.append(A)
    p_values_B.append(B)
    p_values_A_syst.append(A_syst)
    p_values_B_syst.append(B_syst)

fig, ax = plt.subplots()
ax.set_xlabel("Model weight")
ax.set_ylabel("p-value")
ax.plot(values, p_values_A, label="Model A", color='b', linestyle='dotted')
ax.plot(values, p_values_A_syst, label="Model A syst", color='b',
    linestyle='solid')
ax.plot(values, p_values_B, label="Model B", color='r', linestyle='dotted')
ax.plot(values, p_values_B_syst, label="Model B syst", color='r',
    linestyle='solid')
ax.axvline(retA.x[0], color='b', linestyle='dotted')
ax.axvline(retA_syst.x[0], color='b', linestyle='solid')
ax.axvline(retB.x[0], color='r', linestyle='dotted')
ax.axvline(retB_syst.x[0], color='r', linestyle='solid')
ax.axhline(0.32, color='k', linestyle='dashed')
ax.axhline(0.05, color='k', linestyle='dashed')
ax.legend(loc='best')
fig.savefig("p-values.png")
[Image: p-values.png]

The “fixed” models have no free parameters here, making them degenerate CompositeHypothesis objects that are equivalent to simple hypotheses. This means that using max_likelihood_p_value() is equivalent to testing the corresponding simple hypotheses with likelihood_p_value(); no actual maximisation takes place. Note that the p-values for model A are generally smaller than those for model B. This is consistent with the previous results showing that the data is better described by model B.
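
For a one-template hypothesis like the ones used here, fixing the single weight parameter pins the truth vector down completely, which is why the fixed hypothesis behaves like a simple one. A small illustration using the translate() method seen earlier (the weight value is arbitrary):

# Fixing the only parameter leaves nothing to maximise over: the
# hypothesis corresponds to a single truth vector.
v = 1000.
truth_vector = modelA_shape.translate([v])         # v * modelA for a single template
fixed_model_A = modelA_shape.fix_parameters((v,))  # no free parameters left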

Finally, let us construct confidence intervals of the template weights, assuming that the corresponding model is correct:

p_values_A = []
p_values_B = []
p_values_A_syst = []
p_values_B_syst = []
values = np.linspace(600, 1600, 21)
for v in values:
    fixed_model_A = modelA_shape.fix_parameters((v,))
    fixed_model_B = modelB_shape.fix_parameters((v,))
    A = lm.max_likelihood_ratio_p_value(fixed_model_A, modelA_shape,
        nproc=4)
    A_syst = lm_syst.max_likelihood_ratio_p_value(fixed_model_A,
        modelA_shape, nproc=4)
    B = lm.max_likelihood_ratio_p_value(fixed_model_B, modelB_shape,
        nproc=4)
    B_syst = lm_syst.max_likelihood_ratio_p_value(fixed_model_B,
        modelB_shape, nproc=4)
    print_(v, A, A_syst, B, B_syst)
    p_values_A.append(A)
    p_values_B.append(B)
    p_values_A_syst.append(A_syst)
    p_values_B_syst.append(B_syst)

fig, ax = plt.subplots()
ax.set_xlabel("Model weight")
ax.set_ylabel("p-value")
ax.plot(values, p_values_A, label="Model A", color='b', linestyle='dotted')
ax.plot(values, p_values_A_syst, label="Model A syst", color='b',
    linestyle='solid')
ax.plot(values, p_values_B, label="Model B", color='r', linestyle='dotted')
ax.plot(values, p_values_B_syst, label="Model B syst", color='r',
    linestyle='solid')
ax.axvline(retA.x[0], color='b', linestyle='dotted')
ax.axvline(retA_syst.x[0], color='b', linestyle='solid')
ax.axvline(retB.x[0], color='r', linestyle='dotted')
ax.axvline(retB_syst.x[0], color='r', linestyle='solid')
ax.axhline(0.32, color='k', linestyle='dashed')
ax.axhline(0.05, color='k', linestyle='dashed')
ax.legend(loc='best')
fig.savefig("ratio-p-values.png")
[Image: ratio-p-values.png]

Note that the p-values for both models go up to 1.0 here (within the granularity of the parameter scan) and that the constructed confidence intervals are very different from the ones before. This is because we are asking a different question now. Before, we asked “What is the probability of getting a worse likelihood, assuming that the tested model parameter is true?”. Now we ask “What is the probability of getting a worse best-fit likelihood ratio, assuming the tested model parameter is true?”. Since the likelihood ratio of nested hypotheses is less than or equal to 1.0 by construction, and the observed ratio at the best-fit point is exactly 1.0, no simulated data set can yield a larger ratio there, so the p-value is 100% at the best-fit point.
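
The nesting argument can also be checked numerically with a toy calculation that is independent of ReMU (all numbers here are made up and not taken from this analysis):

import numpy as np
from scipy import stats

# For nested hypotheses the ratio L(fixed) / L(best fit) can never exceed 1,
# and it equals 1 when the fixed parameter value is the best-fit value.
data = np.array([12, 7, 3])
template = np.array([0.6, 0.3, 0.1])

def log_l(weight):
    return stats.poisson.logpmf(data, weight * template).sum()

weights = np.linspace(1., 100., 1000)
best = weights[np.argmax([log_l(w) for w in weights])]
for w in (10., best, 40.):
    print(w, np.exp(log_l(w) - log_l(best)))  # ratio <= 1, == 1 at the best fit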