CartesianProductBinning
- class remu.binning.CartesianProductBinning(binnings, **kwargs)
Bases: remu.binning.Binning
A Binning that is the Cartesian product of two or more Binnings.
- Parameters
- binnings: list of Binning
The Binning objects to be multiplied.
Notes
This creates a Binning with as many bins as the product of the numbers of bins in the input binnings.
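As a quick illustration of the bin count, the sketch below multiplies some per-binning sizes. The sizes `(3, 4, 2)` are made up for the example, and no ReMU calls are involved.

```python
import math

# Hypothetical numbers of bins in three constituent binnings.
bins_shape = (3, 4, 2)

# The product binning has one bin per combination of constituent bins.
nbins = math.prod(bins_shape)
print(nbins)
```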
Attributes
binnings
(tuple of Binning) The Binning objects that make up the Cartesian product.
bins
(proxy for Bins) Proxy that will generate CartesianProductBin instances when accessed.
nbins
(int) The number of bins in the binning.
bins_shape
(tuple of int) The sizes of the constituent binnings.
data_size
(int) The number of elements in the data arrays. Might differ from nbins due to subbinnings.
subbinnings
(dict of {bin_index: Binning}) Subbinnings to replace certain bins.
value_array
(slice of ndarray) A slice of a numpy array, where the values of the bins are stored.
entries_array
(slice of ndarray) A slice of a numpy array, where the number of entries are stored.
sumw2_array
(slice of ndarray) A slice of a numpy array, where the squared weights are stored.
phasespace
(PhaseSpace) The PhaseSpace the binning resides in.
Methods
clone
(**kwargs) Create a functioning copy of the Binning.
event_in_binning
(event) Check whether an event fits into any of the bins.
fill
(event[, weight, raise_error, rename]) Fill the events into their respective bins.
fill_data_index
(i[, weight]) Add the weight(s) to the given data position.
fill_from_csv_file
(*args, **kwargs) Fill the binning with events from a CSV file.
fill_multiple_from_csv_file
(binnings, filename) Fill multiple Binnings from the same csv file(s).
from_yaml
(loader, node) Convert a representation node to a Python object.
get_adjacent_bin_indices
() Return a list of adjacent bin indices.
get_adjacent_data_indices
() Return a list of adjacent data indices.
get_bin_data_index
(bin_i) Calculate the data array index from the bin number.
get_bin_index_tuple
(i_bin) Translate the linear bin index of the event to a tuple of single binning bin indices.
get_data_bin_index
(data_i) Calculate the bin number from the data array index.
get_entries_as_ndarray
([shape, indices]) Return the number of entries in the bins as ndarray.
get_event_bin
(event) Get the bin of the event.
get_event_bin_index
(event) Get the bin index for a given event.
get_event_data_index
(event) Get the data array index of the given event.
get_event_subbins
(event) Get the tuple of subbins of the event.
get_event_tuple
(event) Get the variable index tuple for a given event.
get_subbins
(data_index) Return a tuple of the bin and subbins corresponding to the data_index.
get_sumw2_as_ndarray
([shape, indices]) Return the sum of squared weights in the bins as ndarray.
get_tuple_bin_index
(tup) Translate a tuple of binning specific bin indices to the linear bin index of the event.
get_values_as_ndarray
([shape, indices]) Return the bin values as ndarray.
insert_subbinning
(bin_index, binning) Insert a new subbinning into the binning.
insert_subbinning_on_ndarray
(array, ...) Insert values of a new subbinning into the array.
is_dummy
() Return True if there is no data array linked to this binning.
iter_subbins
() Iterate over all bins and subbins.
link_arrays
() Link the data storage arrays into the bins and sub_binnings.
marginalize
(binning_i[, reduction_function]) Marginalize out the given binnings and return a new CartesianProductBinning.
marginalize_subbinnings
([bin_indices]) Return a clone of the Binning with subbinnings removed.
marginalize_subbinnings_on_ndarray
(array[, ...]) Marginalize out the bins corresponding to the subbinnings.
project
(binning_i, **kwargs) Project the binning onto the given binnings and return a new CartesianProductBinning.
reset
([value, entries, sumw2]) Reset all bin values to 0.
set_entries_from_ndarray
(arr) Set the number of bin entries to the values of the ndarray.
set_sumw2_from_ndarray
(arr) Set the sums of squared weights to the values of the ndarray.
set_values_from_ndarray
(arr) Set the bin values to the values of the ndarray.
to_yaml
(dumper, obj) Convert a Python object to a representation node.
yaml_dumper
yaml_loader
- clone(**kwargs)
Create a functioning copy of the Binning.
Can specify additional kwargs for the initialisation of the new Binning.
- event_in_binning(event)
Check whether an event fits into any of the bins.
- fill(event, weight=1, raise_error=False, rename=None)
Fill the events into their respective bins.
- Parameters
- event: [iterable of] dict like or Numpy structured array or Pandas DataFrame
The event(s) to be filled into the binning.
- weight: float or iterable of floats, optional
The weight of the event(s). Can be either a scalar, which is then used for all events, or an iterable with one weight per event. Default: 1.
- raise_error: bool, optional
Raise a ValueError if an event is not in the binning. Otherwise ignore the event. Default: False
- rename: dict, optional
Dict for translating event variable names to binning variable names. Default: None, i.e. no translation
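The way `weight` and `rename` are described can be pictured with a small stand-alone sketch. The helper `normalise_events` is hypothetical and is not part of ReMU. It only shows a scalar weight being reused for every event, an iterable supplying one weight per event, and variable names being translated before filling.

```python
from itertools import repeat

def normalise_events(events, weight=1.0, rename=None):
    """Pair each event with a weight after renaming its variables (illustrative only)."""
    rename = rename or {}
    # A scalar weight is reused for every event; an iterable supplies one per event.
    try:
        weights = iter(weight)
    except TypeError:
        weights = repeat(weight)
    for event, w in zip(events, weights):
        # Translate event variable names to binning variable names.
        yield {rename.get(name, name): value for name, value in event.items()}, w

events = [{"energy": 1.4, "y": -7.47}, {"energy": 2.0, "y": 0.1}]
pairs = list(normalise_events(events, weight=[1.0, 0.5], rename={"energy": "x"}))
```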
- fill_data_index(i, weight=1.0)
Add the weight(s) to the given data position.
Also increases the number of entries and sum of squared weights accordingly.
- Parameters
- i: int
The index of the data arrays to be filled.
- weight: float or iterable of floats, optional
Weight(s) to be added to the value of the bin.
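The bookkeeping described here (value, number of entries, sum of squared weights) can be sketched with plain lists standing in for the data arrays. The function `fill_position` and the array length are assumptions for illustration, not ReMU code.

```python
# Illustrative stand-ins for the three data arrays of a binning with 4 data positions.
value = [0.0] * 4
entries = [0] * 4
sumw2 = [0.0] * 4

def fill_position(i, weight=1.0):
    """Add a weight to position i and update entries and squared weights (sketch)."""
    value[i] += weight
    entries[i] += 1
    sumw2[i] += weight ** 2

fill_position(2, 0.5)
fill_position(2, 2.0)
```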
- fill_from_csv_file(*args, **kwargs)
Fill the binning with events from a CSV file.
- Parameters
- filename: string or list of strings
The csv file with the data. Can be a list of filenames.
- weightfield: string, optional
The column with the event weights.
- weight: float or iterable of floats, optional
A single weight that will be applied to all events in the file. Can be an iterable with one weight for each file if filename is a list.
- rename: dict, optional
A dict with columns that should be renamed before filling:
{'csv_name': 'binning_name'}
- cut_function: function, optional
A function that modifies the loaded data before filling into the binning, e.g.:
cut_function(data) = data[ data['binning_name'] > some_threshold ]
This is done after the optional renaming.
- buffer_csv_files: bool, optional
Save the results of loading CSV files in temporary files that can be recovered if the same CSV file is loaded again. This speeds up filling multiple Binnings with the same CSV-files considerably! Default: False
- chunksize: int, optional
Load csv file in chunks of <chunksize> rows. This reduces the memory footprint of the loading operation, but can slow it down. Default: 10000
Notes
The file must be formatted like this:
first_varname,second_varname,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
<first_value>,<second_value>,...
...
For example:
x,y,z
1.0,2.1,3.2
4.1,2.0,2.9
3,2,1
All values are interpreted as floats. If weightfield is given, that field will be used as the weights for the events. Other keyword arguments are passed on to the Binning’s fill() method. If filename is a list, all elements are handled recursively.
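To make the file layout and the order of operations concrete, the sketch below parses the example file with the standard library only. It is not how ReMU loads files. The `rename` mapping, the threshold and the list-based cut are assumptions chosen for the example; the only thing taken from the text is that the cut is applied after renaming.

```python
import csv
import io

# The example file from the notes above, held in memory instead of on disk.
csv_text = "x,y,z\n1.0,2.1,3.2\n4.1,2.0,2.9\n3,2,1\n"

# Read the header row as variable names and interpret every value as a float.
reader = csv.DictReader(io.StringIO(csv_text))
events = [{name: float(value) for name, value in row.items()} for row in reader]

# Hypothetical rename mapping and cut, applied in the documented order:
# renaming first, then the cut.
rename = {"x": "energy"}
events = [{rename.get(k, k): v for k, v in e.items()} for e in events]
selected = [e for e in events if e["energy"] > 2.0]
```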
- classmethod fill_multiple_from_csv_file(binnings, filename, weightfield=None, weight=1.0, rename=None, cut_function=<function Binning.<lambda>>, buffer_csv_files=False, chunksize=10000, **kwargs)
Fill multiple Binnings from the same csv file(s).
This method saves time, because the numpy array only has to be generated once. Other than the list of binnings to be filled, the (keyword) arguments are identical to the ones used by the instance method fill_from_csv_file().
- classmethod from_yaml(loader, node)
Convert a representation node to a Python object.
- get_adjacent_bin_indices()
Return a list of adjacent bin indices.
- Returns
- adjacent_indices: list of ndarray
The adjacent indices of each bin
- get_adjacent_data_indices()
Return a list of adjacent data indices.
- Returns
- adjacent_indices: list of ndarray
The adjacent indices of each data index
Notes
Data indices inside a subbinning will only ever be adjacent to other indices inside the same subbinning. There is no information available about which bins in a subbinning are adjacent to which bins in the parent binning.
- get_bin_data_index(bin_i)
Calculate the data array index from the bin number.
- get_bin_index_tuple(i_bin)
Translate the linear bin index of the event to a tuple of single binning bin indices.
Turns this:
i_bin
into this:
(i_x, i_y, i_z)
The order of the indices in the tuple conforms to the order of binnings. The bins are ordered row-major (C-style), i.e. increasing the bin number of the last variable by one increases the overall bin number also by one. The increments of the other variables depend on the number of bins in each variable.
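The row-major ordering can be reproduced without ReMU. The helper `index_to_tuple` below is a hypothetical stand-in that peels off the last binning first, and the shape `(2, 3, 4)` is made up. NumPy's `numpy.unravel_index` performs the same conversion for C-ordered arrays.

```python
def index_to_tuple(i_bin, bins_shape):
    """Convert a linear row-major index into one index per binning (sketch)."""
    indices = []
    # Peel off the fastest-varying (last) binning first.
    for size in reversed(bins_shape):
        i_bin, i = divmod(i_bin, size)
        indices.append(i)
    return tuple(reversed(indices))

bins_shape = (2, 3, 4)  # hypothetical sizes of three binnings
print(index_to_tuple(0, bins_shape))
print(index_to_tuple(1, bins_shape))
print(index_to_tuple(4, bins_shape))
```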
- get_data_bin_index(data_i)
Calculate the bin number from the data array index.
All data indices inside a subbinning will return the bin index of that subbinning.
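One way to picture the relation between data indices and bin indices is sketched below. It assumes that data positions are laid out bin by bin, with each subbinning expanded in place of the bin it replaces; the source does not spell this layout out, so treat it as an assumption. The helper `data_to_bin_map` and the sizes are hypothetical.

```python
def data_to_bin_map(nbins, subbinning_sizes):
    """Return a list mapping each data index to its bin index (assumed layout)."""
    mapping = []
    for bin_i in range(nbins):
        # A bin with a subbinning occupies as many data positions as the
        # subbinning has; every other bin occupies exactly one.
        mapping.extend([bin_i] * subbinning_sizes.get(bin_i, 1))
    return mapping

# Hypothetical binning: 4 bins, bin 1 replaced by a subbinning with 3 data positions.
mapping = data_to_bin_map(4, {1: 3})
```

Under this assumption the length of `mapping` plays the role of data_size, which is larger than nbins as soon as a subbinning holds more than one data position.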
- get_entries_as_ndarray(shape=None, indices=None)
Return the number of entries in the bins as ndarray.
- Parameters
- shape: tuple of ints
Shape of the resulting array. Default:
(len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
- Returns
- ndarray
An ndarray with the numbers of entries of the bins.
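What `shape` and `indices` plausibly correspond to can be shown in plain NumPy. This is only a sketch on a made-up flat array for a 2 x 3 product binning without subbinnings; it does not call ReMU, and whether both arguments can be combined is not stated in the source.

```python
import numpy as np

# Hypothetical flat entries array of a 2 x 3 product binning without subbinnings.
entries = np.arange(6)

# shape: view the flat bins as a 2-D array, one axis per constituent binning.
as_grid = entries.reshape((2, 3))

# indices: pick out only selected bins.
selected = entries[[0, 4]]
```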
- get_event_bin(event)
Get the bin of the event.
Returns None if the event does not fit in any bin.
- Parameters
- event: dict like
A dictionary (or similar object) with one value of each variable in the binning, e.g.:
{'x': 1.4, 'y': -7.47}
- Returns
- Bin or None
The Bin object the event fits into.
- get_event_data_index(event)
Get the data array index of the given event.
Returns None if the event does not belong to any bin.
- Parameters
- event: dict like
A dictionary (or similar object) with one value of each variable in the binning, e.g.:
{'x': 1.4, 'y': -7.47}
- Returns
- int or None
The data array index of the event.
- get_event_subbins(event)
Get the tuple of subbins of the event.
Returns None if the event does not fit in any bin.
- Parameters
- event: dict like
A dictionary (or similar object) with one value of each variable in the binning, e.g.:
{'x': 1.4, 'y': -7.47}
- Returns
- (bin[, subbin[, subbin …]]) or None
- get_subbins(data_index)
Return a tuple of the bin and subbins corresponding to the data_index.
- Returns
- (bin[, subbin[, subbin …]])
- get_sumw2_as_ndarray(shape=None, indices=None)
Return the sum of squared weights in the bins as ndarray.
- Parameters
- shape: tuple of ints
Shape of the resulting array. Default:
(len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
- Returns
- ndarray
An ndarray with the sum of squared weights of the bins.
- get_tuple_bin_index(tup)
Translate a tuple of binning specific bin indices to the linear bin index of the event.
Turns this:
(i_x, i_y, i_z)
into this:
i_bin
The order of the indices in the tuple must conform to the order of binnings. The bins are ordered row-major (C-style), i.e. increasing the bin number of the last binning by one increases the overall bin number also by one. The increments of the other variables depend on the number of bins in each variable.
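The increments mentioned here can be checked with a short stand-alone sketch. The helper `tuple_to_index` is hypothetical and the shape `(2, 3, 4)` is made up. For C-ordered arrays, `numpy.ravel_multi_index` computes the same linear index.

```python
def tuple_to_index(tup, bins_shape):
    """Convert one index per binning into a linear row-major index (sketch)."""
    i_bin = 0
    for i, size in zip(tup, bins_shape):
        # Each step shifts the running index by the size of the next binning.
        i_bin = i_bin * size + i
    return i_bin

bins_shape = (2, 3, 4)
# Step sizes per binning: 12 for the first, 4 for the second, 1 for the last.
steps = [tuple_to_index(t, bins_shape) for t in [(1, 0, 0), (0, 1, 0), (0, 0, 1)]]
```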
- get_values_as_ndarray(shape=None, indices=None)
Return the bin values as ndarray.
- Parameters
- shape: tuple of ints
Shape of the resulting array. Default:
(len(bins),)
- indices: list of ints
Only return the given bins. Default: Return all bins.
- Returns
- ndarray
An ndarray with the values of the bins.
- insert_subbinning(bin_index, binning)
Insert a new subbinning into the binning.
- Parameters
- bin_index: int
The bin to be replaced with the subbinning.
- binning: Binning
The new subbinning
- Returns
- new_binning: Binning
A copy of this binning with the new subbinning.
Warning
This will replace the content of the bin with the content of the new subbinning!
- insert_subbinning_on_ndarray(array, bin_index, insert_array)
Insert values of a new subbinning into the array.
- Parameters
- array: ndarray
The data to work on.
- bin_index: int
The bin to be replaced with the subbinning.
- insert_array: ndarray
The array to be inserted.
- Returns
- new_array: ndarray
The modified array.
- is_dummy()
Return True if there is no data array linked to this binning.
- iter_subbins()
Iterate over all bins and subbins.
Will yield a tuple of the bins in this Binning and all subbinnings in the order they correspond to the data indices.
- Yields
- (bin[, subbin[, subbin …]])
- link_arrays()
Link the data storage arrays into the bins and sub_binnings.
- marginalize(binning_i, reduction_function=<function sum>)
Marginalize out the given binnings and return a new CartesianProductBinning.
- Parameters
- binning_i: iterable of int
Indices of the binnings to be marginalized.
- reduction_function: function
Use this function to marginalize out the entries over the specified variables. Must support the axis keyword argument. Default: numpy.sum
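The role of the axis keyword can be illustrated on a plain NumPy array shaped like the constituent binnings. The array contents and the shape `(2, 3, 4)` are made up, and the sketch only mimics the reduction step, not the construction of the new binning.

```python
import numpy as np

# Hypothetical bin values of a product of three binnings with 2, 3 and 4 bins.
bins_shape = (2, 3, 4)
values = np.arange(24, dtype=float).reshape(bins_shape)

# Marginalize out the binning with index 1 using the default reduction.
summed = np.sum(values, axis=(1,))

# Any function accepting the axis keyword can take its place, e.g. the maximum.
maximum = np.max(values, axis=(1,))
```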
- marginalize_subbinnings(bin_indices=None)
Return a clone of the Binning with subbinnings removed.
- Parameters
- bin_indices: list of int, optional
The bin indices of the subbinnings to be marginalized. If no indices are specified, all subbinnings are marginalized.
- Returns
- new_binning: Binning
- marginalize_subbinnings_on_ndarray(array, bin_indices=None)
Marginalize out the bins corresponding to the subbinnings.
- Parameters
- array: ndarray
The data to work on.
- bin_indices: list of int, optional
The bin indices of the subbinnings to be marginalized. If no indices are specified, all subbinnings are marginalized.
- Returns
- new_array: ndarray
- project(binning_i, **kwargs)
Project the binning onto the given binnings and return a new CartesianProductBinning.
The order of the original binnings is preserved. If a single int is provided, the returned Binning is of the same type as the respective binning.
- Parameters
- binning_i: iterable of int, or int
Indices of the binnings to project onto.
- kwargs: optional
Additional keyword arguments are passed on to marginalize().
- Returns
- CartesianProductBinning or type(self.binnings[binning_i])
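Since extra keyword arguments are forwarded to marginalize(), one plausible reading is that projecting onto a set of binnings amounts to marginalizing out all the others. The helper `complement` below is hypothetical and only computes which binning indices would be marginalized under that reading.

```python
def complement(binning_i, n_binnings):
    """Indices of the binnings that are NOT kept by a projection (sketch)."""
    # Accept a single int as well as an iterable of ints.
    keep = {binning_i} if isinstance(binning_i, int) else set(binning_i)
    return [i for i in range(n_binnings) if i not in keep]

print(complement([0, 2], 4))
print(complement(1, 3))
```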
- reset(value=0.0, entries=0, sumw2=0.0)
Reset all bin values to 0.
- Parameters
- value: float, optional
Set the bin values to this value.
- entries: int, optional
Set the number of entries in each bin to this value.
- sumw2: float, optional
Set the sum of squared weights in each bin to this value.
- set_entries_from_ndarray(arr)
Set the number of bin entries to the values of the ndarray.
- set_sumw2_from_ndarray(arr)
Set the sums of squared weights to the values of the ndarray.
- set_values_from_ndarray(arr)
Set the bin values to the values of the ndarray.
- classmethod to_yaml(dumper, obj)
Convert a Python object to a representation node.
- yaml_loader
alias of yaml.loader.FullLoader