Binning
-
class remu.binning.Binning(**kwargs)
A Binning is a set of disjoint Bins.
Parameters: - phasespace : PhaseSpace
The PhaseSpace the Binning resides in.
- bins : list of Bin objects
The list of disjoint bins on that PhaseSpace.
-
event_in_binning(event)
Check whether an event fits into any of the bins.
-
fill(event, weight=1, raise_error=False, rename={})
Fill the events into their respective bins.
Parameters: - event : [iterable of] dict like or Numpy structured array or Pandas DataFrame
The event(s) to be filled into the binning.
- weight : float or iterable of floats, optional
The weight of the event(s). Can be either a scalar, which is then used for all events, or an iterable with one weight per event. Default: 1.
- raise_error : bool, optional
Raise a ValueError if an event is not in the binning. Otherwise ignore the event. Default: False
- rename : dict, optional
Dict for translating event variable names to binning variable names. Default: {}, i.e. no translation
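The parameter semantics described above can be sketched with a toy stand-in. This is a minimal illustration, not remu's implementation: the class ToyBinning, its half-open one-dimensional bins, and the error message are all assumptions made for the example.

```python
# Toy stand-in for illustration only; this is NOT remu's implementation.
class ToyBinning:
    """One-dimensional binning with half-open bins [low, high) in one variable."""

    def __init__(self, variable, edges):
        self.variable = variable
        self.edges = list(edges)
        self.values = [0.0] * (len(self.edges) - 1)
        self.entries = [0] * (len(self.edges) - 1)

    def _bin_number(self, event):
        x = event[self.variable]
        for i in range(len(self.values)):
            if self.edges[i] <= x < self.edges[i + 1]:
                return i
        return None

    def fill(self, event, weight=1, raise_error=False, rename=None):
        rename = rename or {}
        # Accept a single dict event or an iterable of them.
        events = [event] if isinstance(event, dict) else list(event)
        # A scalar weight applies to every event; otherwise one weight per event.
        try:
            weights = list(weight)
        except TypeError:
            weights = [weight] * len(events)
        for ev, w in zip(events, weights):
            # Translate event variable names to binning variable names.
            ev = {rename.get(name, name): value for name, value in ev.items()}
            i = self._bin_number(ev)
            if i is None:
                if raise_error:
                    raise ValueError("Event is not in the binning: %r" % (ev,))
                continue  # otherwise ignore events outside the binning
            self.values[i] += w
            self.entries[i] += 1


binning = ToyBinning("x", [0.0, 1.0, 2.0])
binning.fill({"x": 0.5})  # single event, default weight
binning.fill(
    [{"pos": 1.5}, {"pos": 7.0}],  # second event lies outside and is ignored
    weight=[2.0, 3.0],
    rename={"pos": "x"},
)
```

With raise_error=True, the out-of-range event in the second call would raise a ValueError instead of being skipped.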
-
fill_from_csv_file(*args, **kwargs)
Fill the binning with events from a CSV file.
Parameters: - filename : string or list of strings
The csv file with the data. Can be a list of filenames.
- weightfield : string, optional
The column with the event weights.
- weight : float or iterable of floats, optional
A single weight that will be applied to all events in the file. Can be an iterable with one weight for each file if filename is a list.
- rename : dict, optional
A dict with columns that should be renamed before filling:
{'csv_name': 'binning_name'}
- cut_function : function, optional
A function that modifies the loaded data before filling into the binning, e.g.:
cut_function(data) = data[ data['binning_name'] > some_threshold ]
This is done after the optional renaming.
- buffer_csv_files : bool, optional
Save the results of loading CSV files in temporary files that can be recovered if the same CSV file is loaded again. This considerably speeds up filling multiple Binnings from the same CSV files. Default: False
- chunksize : int, optional
Load csv file in chunks of <chunksize> rows. This reduces the memory footprint of the loading operation, but can slow it down. Default: 10000
Notes
The file must be formatted like this:
first_varname,second_varname,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
...
For example:
x,y,z
1.0,2.1,3.2
4.1,2.0,2.9
3,2,1
All values are interpreted as floats. If weightfield is given, that field will be used as weights for the events. Other keyword arguments are passed on to the Binning’s fill() method. If filename is a list, all elements are handled recursively.
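To make the file format and the order of operations concrete, here is a minimal sketch using only Python's csv module. It does not call remu; the helper load_events is hypothetical, and it assumes that weightfield names the column as it is called after renaming. The column names and values are made up for the example.

```python
import csv
import io

# In-memory stand-in for a CSV file in the format described above.
csv_text = "x,y,w\n1.0,2.1,0.5\n4.1,2.0,1.5\n3,2,1\n"


def load_events(stream, weightfield=None, rename=None, cut_function=None):
    """Hypothetical helper: parse, rename, cut, then split off the weights."""
    rename = rename or {}
    rows = []
    for row in csv.DictReader(stream):
        # Every value is interpreted as a float; columns are renamed on the fly.
        rows.append({rename.get(name, name): float(value) for name, value in row.items()})
    if cut_function is not None:
        # The cut sees the data after renaming.
        rows = cut_function(rows)
    if weightfield is None:
        weights = [1.0] * len(rows)
    else:
        # Assumption: weightfield refers to the (renamed) column name.
        weights = [row.pop(weightfield) for row in rows]
    return rows, weights


events, weights = load_events(
    io.StringIO(csv_text),
    weightfield="w",
    rename={"x": "energy"},
    cut_function=lambda rows: [r for r in rows if r["energy"] > 2.0],
)
```

Here the first row is removed by the cut, so two events remain, each paired with its own weight from the w column.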
-
classmethod fill_multiple_from_csv_file(binnings, filename, weightfield=None, weight=1.0, rename={}, cut_function=<function <lambda>>, buffer_csv_files=False, chunksize=10000, **kwargs)
Fill multiple Binnings from the same CSV file(s).
This method saves time, because the numpy array only has to be generated once. Other than the list of binnings to be filled, the (keyword) arguments are identical to those of the instance method fill_from_csv_file().
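The time saving comes from parsing the file once and reusing the parsed events for every binning. The following sketch illustrates that idea with plain Python lists instead of a numpy array; ToyBinning and fill_multiple are hypothetical stand-ins, not remu's classes or functions.

```python
import csv
import io


class ToyBinning:
    """Hypothetical stand-in: counts events in half-open bins of one variable."""

    def __init__(self, variable, edges):
        self.variable = variable
        self.edges = list(edges)
        self.counts = [0] * (len(self.edges) - 1)

    def fill(self, events):
        for event in events:
            x = event[self.variable]
            for i in range(len(self.counts)):
                if self.edges[i] <= x < self.edges[i + 1]:
                    self.counts[i] += 1
                    break


def fill_multiple(binnings, stream):
    # Parse the CSV data exactly once ...
    events = [
        {name: float(value) for name, value in row.items()}
        for row in csv.DictReader(stream)
    ]
    # ... and reuse the parsed events for every binning.
    for binning in binnings:
        binning.fill(events)
    return len(events)


x_binning = ToyBinning("x", [0.0, 1.0, 2.0])
y_binning = ToyBinning("y", [0.0, 1.0, 2.0])
n_events = fill_multiple(
    [x_binning, y_binning],
    io.StringIO("x,y\n0.5,1.5\n1.5,0.5\n1.2,0.8\n"),
)
```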
-
get_entries_as_ndarray(shape=None, indices=None)
Return the number of entries in the bins as ndarray.
Parameters: - shape: tuple of ints
Shape of the resulting array. Default: (len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
Returns: - ndarray
An ndarray with the numbers of entries of the bins.
-
get_event_bin(event)
Get the bin of the event.
Returns None if the event does not fit in any bin.
Parameters: - event : dict like
A dictionary (or similar object) with one value of each variable in the binning, e.g.:
{'x': 1.4, 'y': -7.47}
Returns: - Bin or None
The Bin object the event fits into.
-
get_event_bin_number(event)
Get the bin number of the given event.
Returns None if the event does not belong to any bin.
Parameters: - event : dict like
A dictionary (or similar object) with one value of each variable in the binning, e.g.:
{'x': 1.4, 'y': -7.47}
Returns: - int or None
The number of the bin the event fits into.
Notes
This is a naive method that just loops over all bins until it finds a fitting one. It should be replaced with something smarter in more specific binning classes.
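As an illustration of the difference, the sketch below compares a linear scan with a bisection lookup for the special case of a one-dimensional binning with sorted, half-open bins. Both functions, the edges list, and the half-open convention are assumptions for the example; neither function is part of remu.

```python
import bisect

# Hypothetical bin edges of a 1D binning in variable 'x'; bins are [low, high).
edges = [0.0, 1.0, 2.5, 4.0]


def bin_number_linear(event):
    """Loop over all bins until one fits: O(number of bins)."""
    x = event["x"]
    for i in range(len(edges) - 1):
        if edges[i] <= x < edges[i + 1]:
            return i
    return None


def bin_number_bisect(event):
    """Use the sorted edges to find the bin: O(log(number of bins))."""
    x = event["x"]
    if not edges[0] <= x < edges[-1]:
        return None
    return bisect.bisect_right(edges, x) - 1


event = {"x": 1.4}
linear_result = bin_number_linear(event)
bisect_result = bin_number_bisect(event)
```

Both lookups agree for every event; the bisection variant only works because the bins here are contiguous and ordered, which a general set of disjoint bins does not guarantee.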
-
get_sumw2_as_ndarray(shape=None, indices=None)
Return the sum of squared weights in the bins as ndarray.
Parameters: - shape: tuple of ints
Shape of the resulting array. Default: (len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
Returns: - ndarray
An ndarray with the sum of squared weights of the bins.
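A common use of the sum of squared weights is to estimate the statistical uncertainty of a weighted bin value. The short sketch below shows how the three per-bin quantities relate for a single bin. It uses made-up weights and plain Python, and it is a general illustration rather than a description of what remu does with these numbers.

```python
import math

# Made-up weights of the events that fell into one bin.
weights = [0.5, 2.0, 1.5]

entries = len(weights)              # number of entries in the bin
value = sum(weights)                # bin value: sum of weights
sumw2 = sum(w * w for w in weights) # sum of squared weights

# Standard estimate of the statistical uncertainty of a sum of weights.
uncertainty = math.sqrt(sumw2)
```

For unweighted events (all weights equal to 1), sumw2 equals the number of entries and the estimate reduces to the familiar square root of the count.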
-
get_values_as_ndarray(shape=None, indices=None)
Return the bin values as ndarray.
Parameters: - shape: tuple of ints
Shape of the resulting array. Default: (len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
Returns: - ndarray
An ndarray with the values of the bins.
-
reset(value=0.0, entries=0, sumw2=0.0)
Reset all bin values, entries, and sums of squared weights, by default to 0.
Parameters: - value : float, optional
Set the bin values to this value.
- entries : int, optional
Set the number of entries in each bin to this value.
- sumw2 : float, optional
Set the sum of squared weights in each bin to this value.
-
set_entries_from_ndarray(arr)
Set the number of bin entries to the values of the ndarray.
-
set_sumw2_from_ndarray(arr)
Set the sums of squared weights to the values of the ndarray.
-
set_values_from_ndarray(arr)
Set the bin values to the values of the ndarray.