Binning

class remu.binning.Binning(**kwargs)

A Binning is a set of disjoint Bins.

Parameters:

phasespace : PhaseSpace: The `PhaseSpace` the Binning resides in.
bins : list of Bin: The list of disjoint bins on that PhaseSpace.

event_in_binning(event): Check whether an event fits into any of the bins.

fill(event, weight=1, raise_error=False, rename={})

Fill the events into their respective bins.

Parameters:

event : [iterable of] dict like or Numpy structured array or Pandas DataFrame: The event(s) to be filled into the binning.
weight : float or iterable of floats, optional: The weight of the event(s). Can be either a scalar which is then used for all events or an iterable of weights for the single events. Default: 1.
raise_error : bool, optional: Raise a ValueError if an event is not in the binning. Otherwise ignore the event. Default: False
rename : dict, optional: Dict for translating event variable names to binning variable names. Default: {}, i.e. no translation
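The weight, rename and error handling described above can be sketched without the library. The following is a minimal stand-in, not ReMU's implementation: the class `ToyBinning`, its one-dimensional edge layout and the variable name `pos` are hypothetical and only mirror the documented behaviour of fill().

```python
from collections.abc import Mapping


class ToyBinning:
    """Hypothetical stand-in: 1D bins [low, high) over a single variable."""

    def __init__(self, variable, edges):
        self.variable = variable
        self.edges = list(edges)
        self.values = [0.0] * (len(self.edges) - 1)
        self.entries = [0] * (len(self.edges) - 1)

    def get_event_bin_number(self, event):
        x = event[self.variable]
        for i in range(len(self.edges) - 1):
            if self.edges[i] <= x < self.edges[i + 1]:
                return i
        return None

    def fill(self, event, weight=1, raise_error=False, rename=None):
        rename = rename or {}
        # A single dict-like event is treated as a list of one event.
        events = [event] if isinstance(event, Mapping) else list(event)
        # A scalar weight is reused for every event.
        try:
            weights = list(weight)
        except TypeError:
            weights = [weight] * len(events)
        for ev, w in zip(events, weights):
            # Translate event variable names to binning variable names.
            ev = {rename.get(k, k): v for k, v in ev.items()}
            i = self.get_event_bin_number(ev)
            if i is None:
                if raise_error:
                    raise ValueError("Event is not in the binning: %r" % (ev,))
                continue  # silently ignore events outside the binning
            self.values[i] += w
            self.entries[i] += 1


binning = ToyBinning("x", [0.0, 1.0, 2.0])
binning.fill({"x": 0.5})
binning.fill([{"pos": 1.5}, {"pos": 7.0}], weight=[2.0, 3.0], rename={"pos": "x"})
```

In this sketch the event at 7.0 lies outside all bins and is dropped, because raise_error is left at its default.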

fill_from_csv_file(*args, **kwargs)

Fill the binning with events from a CSV file.

Parameters:

filename : string or list of strings

The csv file with the data. Can be a list of filenames.

weightfield : string, optional

The column with the event weights.

weight : float or iterable of floats, optional

A single weight that will be applied to all events in the file. Can be an iterable with one weight for each file if filename is a list.

rename : dict, optional

A dict with columns that should be renamed before filling:

{'csv_name': 'binning_name'}

cut_function : function, optional

A function that modifies the loaded data before filling into the binning, e.g.:

cut_function = lambda data: data[data['binning_name'] > some_threshold]

This is done after the optional renaming.

buffer_csv_files : bool, optional

Save the results of loading CSV files in temporary files that can be recovered if the same CSV file is loaded again. This speeds up filling multiple Binnings with the same CSV-files considerably! Default: False

chunksize : int, optional

Load csv file in chunks of <chunksize> rows. This reduces the memory footprint of the loading operation, but can slow it down. Default: 10000

Notes

The file must be formatted like this:

first_varname,second_varname,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
...

For example:

x,y,z
1.0,2.1,3.2
4.1,2.0,2.9
3,2,1

All values are interpreted as floats. If weightfield is given, that field will be used as the weights of the events. Other keyword arguments are passed on to the Binning’s fill() method. If filename is a list, all elements are handled recursively.
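To make the expected layout concrete, the sketch below parses the example above with Python's standard csv module, converts every value to float, and applies a rename followed by a cut, in the documented order. It does not call ReMU: the helper `load_events` and the new column name `x_reco` are illustrative, and the events are held in a plain list of dicts rather than a NumPy array, so the cut is written as a list comprehension instead of boolean indexing.

```python
import csv
import io

CSV_TEXT = """x,y,z
1.0,2.1,3.2
4.1,2.0,2.9
3,2,1
"""


def load_events(handle, rename=None, cut_function=lambda data: data):
    """Read a header-plus-rows CSV into a list of dicts of floats."""
    rename = rename or {}
    reader = csv.DictReader(handle)
    events = [
        {rename.get(name, name): float(value) for name, value in row.items()}
        for row in reader
    ]
    # The cut is applied after renaming, so it sees the new names.
    return cut_function(events)


events = load_events(
    io.StringIO(CSV_TEXT),
    rename={"x": "x_reco"},
    cut_function=lambda data: [ev for ev in data if ev["x_reco"] > 2.0],
)
```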

classmethod fill_multiple_from_csv_file(binnings, filename, weightfield=None, weight=1.0, rename={}, cut_function=<function <lambda>>, buffer_csv_files=False, chunksize=10000, **kwargs)

Fill multiple Binnings from the same csv file(s).

This method saves time, because the numpy array only has to be generated once. Other than the list of binnings to be filled, the (keyword) arguments are identical to the ones used by the instance method fill_from_csv_file().

get_entries_as_ndarray(shape=None, indices=None)

Return the number of entries in the bins as ndarray.

Parameters:

shape : tuple of ints: Shape of the resulting array. Default: `(len(bins),)`
indices : list of ints: Only return the given bins. Default: Return all bins.

Returns:

ndarray: An ndarray with the numbers of entries of the bins.
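As an illustration of the two optional arguments, here is how the corresponding reshaping and selection look on a plain NumPy array. The array `entries` is made-up data standing in for the per-bin entries and no ReMU call is involved. The sketch assumes NumPy's default row-major order, and it shows the two options separately because this page does not state how they combine.

```python
import numpy as np

# Made-up entries of a binning with six bins, in bin-number order.
entries = np.array([3, 0, 5, 1, 2, 4])

# Counterpart of shape=(2, 3): the flat array arranged as a 2x3 grid.
as_grid = entries.reshape((2, 3))

# Counterpart of indices=[0, 2, 5]: only the listed bins are kept.
selected = entries[[0, 2, 5]]
```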

get_event_bin(event)

Get the bin of the event.

Returns None if the event does not fit in any bin.

Parameters:

event : dict like: A dictionary (or similar object) with one value for each variable in the binning, e.g.: {'x': 1.4, 'y': -7.47}

Returns:

Bin or None: The `Bin` object the event fits into.

get_event_bin_number(event)

Get the bin number of the given event.

Returns None if the event does not belong to any bin.

Parameters:

event : dict like: A dictionary (or similar object) with one value for each variable in the binning, e.g.: {'x': 1.4, 'y': -7.47}

Returns:

int or None: The bin number.

Notes

This is a dumb method that just loops over all bins until it finds a fitting one. It should be replaced with something smarter for more specific binning classes.
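The note above leaves open what "smarter" might look like. One possible approach, for the special case of contiguous one-dimensional bins defined by sorted edges, is a binary search. The sketch below is not ReMU code: both functions and the `edges` layout are assumptions made for illustration.

```python
from bisect import bisect_right


def bin_number_linear(edges, x):
    """Loop over all bins [edges[i], edges[i+1]) until one contains x."""
    for i in range(len(edges) - 1):
        if edges[i] <= x < edges[i + 1]:
            return i
    return None


def bin_number_bisect(edges, x):
    """Same result via binary search; requires sorted, contiguous edges."""
    if not edges[0] <= x < edges[-1]:
        return None
    return bisect_right(edges, x) - 1


edges = [0.0, 1.0, 2.5, 4.0]
# Both lookups agree, including on bin edges and outside the binning.
for x in (-1.0, 0.0, 1.0, 2.4, 3.9, 4.0):
    assert bin_number_linear(edges, x) == bin_number_bisect(edges, x)
```

The linear version costs time proportional to the number of bins per event, while the binary search is logarithmic, at the price of only working for this regular kind of binning.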

get_sumw2_as_ndarray(shape=None, indices=None)

Return the sum of squared weights in the bins as ndarray.

Parameters:

shape : tuple of ints: Shape of the resulting array. Default: `(len(bins),)`
indices : list of ints: Only return the given bins. Default: Return all bins.

Returns:

ndarray: An ndarray with the sum of squared weights of the bins.

get_values_as_ndarray(shape=None, indices=None)

Return the bin values as ndarray.

Parameters:

shape : tuple of ints: Shape of the resulting array. Default: `(len(bins),)`
indices : list of ints: Only return the given bins. Default: Return all bins.

Returns:

ndarray: An ndarray with the values of the bins.

reset(value=0.0, entries=0, sumw2=0.0)

Reset all bins. By default, the values, numbers of entries and sums of squared weights are all set to 0.

Parameters:

value : float, optional: Set the bin values to this value.
entries : int, optional: Set the number of entries in each bin to this value.
sumw2 : float, optional: Set the sum of squared weights in each bin to this value.

set_entries_from_ndarray(arr): Set the number of bin entries to the values of the ndarray.

set_sumw2_from_ndarray(arr): Set the sums of squared weights to the values of the ndarray.

set_values_from_ndarray(arr): Set the bin values to the values of the ndarray.
