Example 00 – Basic usage of binnings

Aims

Create “real” and simulated data of the mock experiemnt
Load data into histograms and plot it

Instructions

The folder ../simple_experiment/ contains two scripts to create “real” and simulated data. The script ‘simulate_experiment.py’ simulates the mock experiment and creates two files: one file with the truth information of all simulated events, and another file with the truth and reconstructed information of all reconstructed events. The command line parameters determine the properties of the simulation, e.g. whether to simulate background or signal and what signal model to use.

The script run_experiment.py creates a single file with only reconstructed information. Of course, this file is also the result of simulations, but since it is supposed to represent the real results of a real experiment, no truth information is saved.

Create “real” data corresponding to ten years of running the experiment:

$ ../simple_experiment/run_experiment.py 10 real_data.txt

Create simulated data corresponding to ten times the real data:

$ ../simple_experiment/simulate_experiment.py 100 modelA modelA_data.txt modelA_truth.txt
$ ../simple_experiment/simulate_experiment.py 100 modelB modelB_data.txt modelB_truth.txt

The file reco-binning.yml contains a RectilinearBinning object for the reconstructed information:

!RectilinearBinning
variables:
- reco_x
- reco_y
bin_edges:
- [-.inf,
  ...
  .inf]
- [-.inf,
  ...
  .inf]
include_upper: false

A RectilinearBinning object defines bin edges in multiple variables. These variables are orthogonal to each other. The total number of bins is thus the product of the number of bins per variable.

Let’s create a binning object, load the data into it, and plot the distributions:

from remu import binning
from remu import plotting

with open("reco-binning.yml", 'r') as f:
    reco_binning = binning.yaml.full_load(f)

reco_binning.fill_from_csv_file("real_data.txt")

pltr = plotting.get_plotter(reco_binning)
pltr.plot_values()
pltr.savefig("real_data.png")

reco_binning.reset()
reco_binning.fill_from_csv_file("modelA_data.txt")

pltr = plotting.get_plotter(reco_binning)
pltr.plot_values()
pltr.savefig("modelA_data.png")

reco_binning.reset()
reco_binning.fill_from_csv_file("modelB_data.txt")

pltr = plotting.get_plotter(reco_binning)
pltr.plot_values()
pltr.savefig("modelB_data.png")

Plotting the different kinds of Binning objects is handled by their respective BinningPlotter classes. The function plotting.get_plotter() will return an instance of the appropriate plotting class for the provided binning, in this case a RectilinearBinningPlotter.

The RectilinearBinningPlotter supports the scatter parameter, which makes it draw pseudo scatter plots instead of 2D histograms. This is useful to compare multiple distributions in the same plot:

pltr = plotting.get_plotter(reco_binning)
reco_binning.reset()
reco_binning.fill_from_csv_file("real_data.txt")
pltr.plot_values(label="data", scatter=500)
reco_binning.reset()
reco_binning.fill_from_csv_file("modelA_data.txt")
pltr.plot_values(label="model A", scatter=500)
reco_binning.reset()
reco_binning.fill_from_csv_file("modelB_data.txt")
pltr.plot_values(label="model B", scatter=500)
pltr.legend()
pltr.savefig("compare_data.png")

We can do the same with the true information and its respective binning in ‘truth-binning.yml’:

with open("truth-binning.yml", 'r') as f:
    truth_binning = binning.yaml.full_load(f)

truth_binning.fill_from_csv_file("modelA_truth.txt")

pltr = plotting.get_plotter(truth_binning)
pltr.plot_values()
pltr.savefig("modelA_truth.png")

truth_binning.reset()
truth_binning.fill_from_csv_file("modelB_truth.txt")

pltr = plotting.get_plotter(truth_binning)
pltr.plot_values()
pltr.savefig("modelB_truth.png")

pltr = plotting.get_plotter(truth_binning)
truth_binning.reset()
truth_binning.fill_from_csv_file("modelA_truth.txt")
pltr.plot_values(label="model A", scatter=500)
truth_binning.reset()
truth_binning.fill_from_csv_file("modelB_truth.txt")
pltr.plot_values(label="model B", scatter=500)
pltr.legend()
pltr.savefig("compare_truth.png")