.. _example03: ==================================== Example 03 -- Detector uncertainties ==================================== Aims ==== * Generate random detector response variations * Include detector uncertainties in the response matrix by using a :class:`.ResponseMatrixArrayBuilder` * Include statistical uncertainties by generating randomly varied matrices * Use varied response matrices for fits and hypothesis tests Instructions ============ The properties of experimental setups are usually only known to a finite precision. The remaining uncertainty on these properties, and thus the uncertainty on the detector response to true events, must be reflected in the response matrix. We can create an event-by-event variation of the detector response with the provided script:: $ ../simple_experiment/vary_detector.py ../00/modelA_data.txt modelA_data.txt $ ../simple_experiment/vary_detector.py ../00/modelB_data.txt modelB_data.txt It creates 100 randomly generated variations of the simulated detector and varies the simulation that was created in example 00 accordingly. Each variation describes one possible real detector and is often called a "toy variation" or a "universe". Events are varied in two ways: The values of reconstructed variables can be different for each toy, and each event can get assigned a weight. The former can be used to cover uncertainties in the detector precision and accuracy, while the latter is a simple way to deal with uncertainties in the overall probability of reconstructing an event. In this case, the values of the reconstructed ``reco_x`` are varied and saved as ``reco_x_0`` to ``reco_x_99``. The detection efficiency is also varied and the ratio of the events nominal efficiency and the efficiency assuming the toy detector stored as ``weight_0`` up to ``weight_99``. Let us plot the different reconstructed distributions we get from the toy variations: .. include:: plot_data.py :literal: .. image:: data.png Here we simply plot one histogram for each toy data set with a low alpha value (i.e with high transparency), as well as a solid histogram for the mean values. Also we scaled the model predictions by a factor 10, simply because that makes "experiment running time" of the model predictions correspond to the data. We are using a few specific arguments to the ``fill_from_csv_file`` method: ``weightfield`` Tells the object which field to use as the weight of the events. Does not need to follow any naming conventions. Here we tell it to use the field corresponding to each toy. ``rename`` The binning of the response matrix expects a variable called ``reco_x``. This variable is not present in the varied data sets. Here we can specify a dictionary with columns in the data that will be renamed before filling the matrix. ``buffer_csv_files`` Reading a CSV file from disk and turning it into an array of floating point values takes quite a bit of time. We can speed up the process by buffering the intermediate array on disk. This means the time consuming parsing of a text file only has to happen once per CSV file. ReMU handles the detector uncertainties by building a response matrix for each toy detector separately. These matrices are then used in parallel to calculate likelihoods. In the end, the marginal (i.e. average) likelihood is used as the final answer. Some advanced features of ReMU (like sparse matrices, or nuisance bins) require quite a bit of book-keeping to make the toy matrices behave consistently and as expected. This is handled by the :class:`.ResponseMatrixArrayBuilder`:: from remu import binning from remu import migration builder = migration.ResponseMatrixArrayBuilder(1) Its only argument is the number of randomly drawn statistical variations per toy response matrix. The number of simulated events that are used to build the response matrix influences the statistical uncertainty of the response matrix elements. This can be seen as an additional "systematic" uncertainty of the detector response. By generating randomly fluctuated response matrices from the toy matrices, it can be handled organically with the other systematics. In this case we generate 1 randomly varied matrix per toy matrix, yielding a total of 100 matrices in the end. The toy matrices themselves are built like the nominal matrix in the previous examples and then added to the builder:: with open("../01/reco-binning.yml", 'rt') as f: reco_binning = binning.yaml.full_load(f) with open("../01/optimised-truth-binning.yml", 'rt') as f: truth_binning = binning.yaml.full_load(f) resp = migration.ResponseMatrix(reco_binning, truth_binning) n_toys = 100 for i in range(n_toys): resp.reset() resp.fill_from_csv_file(["modelA_data.txt", "modelB_data.txt"], weightfield='weight_%i'%(i,), rename={'reco_x_%i'%(i,): 'reco_x'}, buffer_csv_files=True) resp.fill_up_truth_from_csv_file( ["../00/modelA_truth.txt", "../00/modelB_truth.txt"], buffer_csv_files=True) builder.add_matrix(resp) .. note:: The toy variations in ``modelA_data.txt`` and ``modelB_data.txt`` must be identical! Otherwise it is not possible to fill the toy matrices with events from both files. It would mix events reconstructed with different detectors. This might create some warnings like this:: UserWarning: Filled-up values are less than the original filling in 1 bins. This should not happen! return self._replace_smaller_truth(new_truth_binning) This occurs when events in a bin with already high efficiency are re-weighted to give the impression of an efficiency greater than 100%. If it does not happen too often, it can be ignored. The builder generates the actual array of floating point numbers as soon as the :class:`.ResponseMatrix` object is added. It retains no connection to the object itself. Now we just need to save the set of response matrices for later use in the likelihood fits:: builder.export("response_matrix.npz") The :class:`.LikelihoodCalculator` is created just like in the previous example:: import numpy as np from matplotlib import pyplot as plt from remu import binning from remu import plotting from remu import likelihood from multiprocess import Pool pool = Pool(8) likelihood.mapper = pool.map with open("../01/reco-binning.yml", 'rt') as f: reco_binning = binning.yaml.full_load(f) with open("../01/optimised-truth-binning.yml", 'rt') as f: truth_binning = binning.yaml.full_load(f) reco_binning.fill_from_csv_file("../00/real_data.txt") data = reco_binning.get_entries_as_ndarray() data_model = likelihood.PoissonData(data) # No systematics LikelihoodCalculator response_matrix = "../01/response_matrix.npz" matrix_predictor = likelihood.ResponseMatrixPredictor(response_matrix) calc = likelihood.LikelihoodCalculator(data_model, matrix_predictor) # Systematics LikelihoodCalculator response_matrix_syst = "response_matrix.npz" matrix_predictor_syst = likelihood.ResponseMatrixPredictor(response_matrix_syst) calc_syst = likelihood.LikelihoodCalculator(data_model, matrix_predictor_syst) To show the influence of the systematic uncertainties, we create two :class:`.LikelihoodCalculator` objects here. One with the non-varied detector response and one with the set of systematically varied responses. Now we can test some models against the data, just like in the previous example:: truth_binning.fill_from_csv_file("../00/modelA_truth.txt") modelA = truth_binning.get_values_as_ndarray() modelA /= np.sum(modelA) truth_binning.reset() truth_binning.fill_from_csv_file("../00/modelB_truth.txt") modelB = truth_binning.get_values_as_ndarray() modelB /= np.sum(modelB) maxi = likelihood.BasinHoppingMaximizer() modelA_shape = likelihood.TemplatePredictor([modelA]) calcA = calc.compose(modelA_shape) retA = maxi(calcA) print(retA) .. include:: modelA_fit.txt :literal: :: calcA_syst = calc_syst.compose(modelA_shape) retA_syst = maxi(calcA_syst) print(retA_syst) .. include:: modelA_fit_syst.txt :literal: :: modelB_shape = likelihood.TemplatePredictor([modelB]) calcB = calc.compose(modelB_shape) retB = maxi(calcB) print(retB) .. include:: modelB_fit.txt :literal: :: calcB_syst = calc_syst.compose(modelB_shape) retB_syst = maxi(calcB_syst) print(retB_syst) .. include:: modelB_fit_syst.txt :literal: Let us take another look at how the fitted templates and the data compare in reco space. This time we are going to use `ReMU`'s built in functions to plot a set of bin contents, i.e. the set of model predictions varied by the detector systematics:: pltr = plotting.get_plotter(reco_binning) pltr.plot_values(edgecolor='C0', label='data', hatch=None, linewidth=2.) modelA_reco, modelA_weights = calcA.predictor(retA.x) modelB_reco, modelB_weights = calcB.predictor(retB.x) modelA_syst_reco, modelA_syst_weights = calcA_syst.predictor(retA.x) modelB_syst_reco, modelB_syst_weights = calcB_syst.predictor(retB.x) pltr.plot_array(modelA_reco, label='model A', edgecolor='C1', hatch=None) pltr.plot_array(modelA_syst_reco, label='model A syst', edgecolor='C1', hatch=r'//', stack_function=0.68) pltr.plot_array(modelB_reco, label='model B', edgecolor='C2', hatch=None) pltr.plot_array(modelB_syst_reco, label='model B syst', edgecolor='C2', hatch=r'\\', stack_function=0.68) pltr.legend() pltr.savefig('reco-comparison.png') .. image:: reco-comparison.png The ``modelX_syst_reco`` arrays now contain 100 different predictions, corresponding to the 100 detector variations. The parameter ``stac_function == 0.68`` tells the plotter to draw the area of the central 68% of the range of predictions. And of course we can compare the p-values of the two fitted models assuming the nominal detector response:: testA = likelihood.HypothesisTester(calcA) testB = likelihood.HypothesisTester(calcB) print(testA.max_likelihood_p_value()) print(testB.max_likelihood_p_value()) .. include:: fit_p-values.txt :literal: As well as the p-values yielded when taking the systematic uncertainties into account:: testA_syst = likelihood.HypothesisTester(calcA_syst) testB_syst = likelihood.HypothesisTester(calcB_syst) print(testA_syst.max_likelihood_p_value()) print(testB_syst.max_likelihood_p_value()) .. include:: fit_p-values_syst.txt :literal: We can construct confidence intervals on the template weights of the models using the maximum likelihood p-value just like before:: p_values_A = [] p_values_B = [] p_values_A_syst = [] p_values_B_syst = [] values = np.linspace(600, 1600, 21) for v in values: A = testA.max_likelihood_p_value([v]) A_syst = testA_syst.max_likelihood_p_value([v]) B = testB.max_likelihood_p_value([v]) B_syst = testB_syst.max_likelihood_p_value([v]) print(v, A, A_syst, B, B_syst) p_values_A.append(A) p_values_B.append(B) p_values_A_syst.append(A_syst) p_values_B_syst.append(B_syst) fig, ax = plt.subplots() ax.set_xlabel("Model weight") ax.set_ylabel("p-value") ax.plot(values, p_values_A, label="Model A", color='C1', linestyle='dotted') ax.plot(values, p_values_A_syst, label="Model A syst", color='C1', linestyle='solid') ax.plot(values, p_values_B, label="Model B", color='C2', linestyle='dotted') ax.plot(values, p_values_B_syst, label="Model B syst", color='C2', linestyle='solid') ax.axvline(retA.x[0], color='C1', linestyle='dotted') ax.axvline(retA_syst.x[0], color='C1', linestyle='solid') ax.axvline(retB.x[0], color='C2', linestyle='dotted') ax.axvline(retB_syst.x[0], color='C2', linestyle='solid') ax.axhline(0.32, color='k', linestyle='dashed') ax.axhline(0.05, color='k', linestyle='dashed') ax.legend(loc='best') fig.savefig("p-values.png") .. image:: p-values.png The "fixed" models have no free parameters here, making them degenerate. This means using the :meth:`.max_likelihood_p_value` is equivalent to testing the corresponding simple hypotheses using the :meth:`.likelihood_p_value`. No actual maximisation is taking place. Note that the p-values for model A are generally smaller than those for model B. This is consistent with previous results showing that the data is better described by model B. Finally, let us construct confidence intervals of the template weights, *assuming that the corresponding model is correct*:: p_values_A = [] p_values_B = [] p_values_A_syst = [] p_values_B_syst = [] values = np.linspace(600, 1600, 21) for v in values: A = testA.max_likelihood_ratio_p_value([v]) A_syst = testA_syst.max_likelihood_ratio_p_value([v]) B = testB.max_likelihood_ratio_p_value([v]) B_syst = testB_syst.max_likelihood_ratio_p_value([v]) print(v, A, A_syst, B, B_syst) p_values_A.append(A) p_values_B.append(B) p_values_A_syst.append(A_syst) p_values_B_syst.append(B_syst) p_values_A_wilks = [] p_values_B_wilks = [] fine_values = np.linspace(600, 1600, 100) for v in fine_values: A = testA_syst.wilks_max_likelihood_ratio_p_value([v]) B = testB_syst.wilks_max_likelihood_ratio_p_value([v]) print(v, A, B) p_values_A_wilks.append(A) p_values_B_wilks.append(B) fig, ax = plt.subplots() ax.set_xlabel("Model weight") ax.set_ylabel("p-value") ax.plot(values, p_values_A, label="Model A", color='C1', linestyle='dotted') ax.plot(values, p_values_A_syst, label="Model A syst", color='C1', linestyle='solid') ax.plot(fine_values, p_values_A_wilks, label="Model A Wilks", color='C1', linestyle='dashed') ax.plot(values, p_values_B, label="Model B", color='C2', linestyle='dotted') ax.plot(values, p_values_B_syst, label="Model B syst", color='C2', linestyle='solid') ax.plot(fine_values, p_values_B_wilks, label="Model B Wilks", color='C2', linestyle='dashed') ax.axvline(retA.x[0], color='C1', linestyle='dotted') ax.axvline(retA_syst.x[0], color='C1', linestyle='solid') ax.axvline(retB.x[0], color='C2', linestyle='dotted') ax.axvline(retB_syst.x[0], color='C2', linestyle='solid') ax.axhline(0.32, color='k', linestyle='dashed') ax.axhline(0.05, color='k', linestyle='dashed') ax.legend(loc='best') fig.savefig("ratio-p-values.png") .. image:: ratio-p-values.png Note that the p-values for both models go up to 1.0 here (within the granularity of the parameter scan) and that the constructed confidence intervals are very different from the ones before. This is because we are asking a different question now. Before we asked the question "What is the probability of getting a worse likelihood assuming that the tested model-parameter is true?". Now we ask the question "What is the probability of getting a worse best-fit likelihood ratio, assuming the tested model-parameter is true?". Since the likelihood ratio is 1.0 at the best fit point and the likelihood ratio of nested hypotheses is less than or equal to 1.0 by construction, the p-value is 100% there. The method :meth:`.wilks_max_likelihood_ratio_p_value` calculates the same p-value as :meth:`.max_likelihood_ratio_p_value`, but does so assuming Wilks' theorem holds. This does not require the generation of random data and subsequent likelihood maximisations, so it is much faster.