validphys.closuretest package

Subpackages

Submodules

validphys.closuretest.closure_checks module

closuretest/checks.py

Module containing checks specific to the closure tests.

validphys.closuretest.closure_checks.check_at_least_10_fits(fits)[source]
validphys.closuretest.closure_checks.check_fit_isclosure(fit)[source]

Check the input fit is a closure test

validphys.closuretest.closure_checks.check_fits_areclosures(fits)[source]

Check all fits are closures

validphys.closuretest.closure_checks.check_fits_different_filterseed(fits)[source]

Input fits should have different filter seeds if they are being used for multiple closure test studies, because in high-level hand-waving terms the different level 1 shifts represent different ‘runs of the universe’.

validphys.closuretest.closure_checks.check_fits_have_same_basis(fits_basis)[source]

Check the basis is the same for all fits

validphys.closuretest.closure_checks.check_fits_same_filterseed(fits)[source]

Input fits should have the same filter seed if they are being compared

validphys.closuretest.closure_checks.check_fits_underlying_law_match(fits)[source]

Check that the fits being compared have the same underlying law

validphys.closuretest.closure_checks.check_multifit_replicas(fits_pdf, _internal_max_reps, _internal_min_reps)[source]

Checks that all the fit pdfs have the same number of replicas N_rep, then checks that N_rep is greater than the smallest number of replicas used in actions which subsample the replicas of each fit.

This check also has the secondary effect of filling in the namespace key _internal_max_reps, which can be used to override the number of replicas at the level of the runcard but by default is filled in with the number of replicas in each fit.

validphys.closuretest.closure_checks.check_t0pdfset_matches_law(t0pdfset, fit)[source]
validphys.closuretest.closure_checks.check_t0pdfset_matches_multiclosure_law(multiclosure_underlyinglaw, t0set)[source]

Checks that, if a multiclosure_underlyinglaw is present, it matches the t0set. Checks t0set instead of t0pdfset since different mechanisms can fill t0set.

validphys.closuretest.closure_checks.check_use_fitcommondata(use_fitcommondata)[source]

Base check that use_fitcommondata is being used. This check should be used with all actions which require comparison to fitcommondata.

validphys.closuretest.closure_plots module

closuretest/plots.py

Plots of statistical estimators for single closure test. See multiclosure module for more estimators and plots.

validphys.closuretest.closure_plots.errorbar_figure_from_table(df)[source]

Given a table with even columns as central values and odd columns as errors, plot an errorbar plot.

validphys.closuretest.closure_plots.plot_delta_chi2(delta_chi2_bootstrap, fits)[source]

Plots distributions of delta chi2 for each fit in fits. Distribution is generated by bootstrapping. For more information on delta chi2 see delta_chi2_bootstrap

validphys.closuretest.closure_results module

closuretest/closure_results.py

Module containing actions to calculate single closure test estimators. This is useful for quickly checking the bias of a fit without having to run the full multiclosure analysis.

validphys.closuretest.closure_results.delta_chi2_bootstrap(fits_level_1_noise, fits_exps_bootstrap_chi2_central, fits, use_fitcommondata)[source]

Bootstraps delta chi2 for specified fits. Delta chi2 measures whether the level one data is fitted better by the underlying law or by the specified fit; it is a measure of overfitting.

delta chi2 = (chi2(T[<f>], D_1) - chi2(T[f_in], D_1))/chi2(T[f_in], D_1)

where T[<f>] is central theory prediction from fit, T[f_in] is theory prediction from t0 pdf (input) and D_1 is level 1 closure data

Exact details on delta chi2 can be found in 1410.8849 eq (28).
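As a minimal sketch of the formula above, the snippet below evaluates delta chi2 from two chi2 values that are assumed to be already computed. The function name and its arguments are hypothetical and are not part of validphys; the actual action bootstraps this quantity rather than evaluating it once.

```python
def delta_chi2(chi2_fit: float, chi2_law: float) -> float:
    """Relative difference between the fit chi2 and the underlying-law chi2.

    chi2_fit stands for chi2(T[<f>], D_1) and chi2_law for chi2(T[f_in], D_1);
    both are hypothetical inputs for this sketch.
    """
    return (chi2_fit - chi2_law) / chi2_law


# A fit that describes the level 1 data better than the law gives a negative value.
overfitted = delta_chi2(chi2_fit=95.0, chi2_law=100.0)
underfitted = delta_chi2(chi2_fit=110.0, chi2_law=100.0)
```

A negative value means the fit describes the level 1 data better than the law that generated it, which is the signature of overfitting.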

validphys.closuretest.closure_results.delta_chi2_table(fits_exps_chi2, fits_exps_level_1_noise, fits_name_with_covmat_label, fits_experiments, fits, use_fitcommondata)[source]

Calculates delta chi2 per experiment and puts it in a table. Here delta chi2 is just normalised by ndata and is equal to

delta_chi2 = (chi2(T[<f>], D_1) - chi2(T[f_in], D_1))/ndata

validphys.closuretest.closure_results.fit_underlying_pdfs_summary(fit, fitunderlyinglaw)[source]

Returns a table with a single column for the fit, with rows indicating the PDF used to generate the data and the t0 pdf.

validphys.closuretest.closure_results.summarise_closure_underlying_pdfs(fits_underlying_pdfs_summary)[source]

Collects the underlying pdfs for all fits and concatenates them into a single table

validphys.closuretest.inconsistent_ct module

This module contains the InconsistentCommonData class which is meant to have all the methods needed in order to introduce an inconsistency within a Closure Test.

class validphys.closuretest.inconsistent_ct.InconsistentCommonData(setname: str, ndata: int, commondataproc: str, nkin: int, nsys: int, commondata_table: DataFrame, systype_table: DataFrame, legacy: bool = False, systematics_table: DataFrame = None, legacy_names: Optional[list] = None, kin_variables: Optional[list] = None)[source]

Bases: CommonData

Class that inherits all of the methods of coredata.CommonData class.

This class is meant to have all the methods needed in order to introduce an inconsistency within a Closure Test.

commondata_table: DataFrame
commondataproc: str
export_uncertainties(buffer)[source]

Same as the export_uncertainties method of the CommonData class. The only difference is that systematic_errors is now a property of the class and not a method.

ndata: int
nkin: int
nsys: int
process_commondata(treatment_names, names_uncertainties, sys_rescaling_factor, inconsistent_datasets)[source]

Returns a commondata instance with modified systematics. Note that if commondata.setname is not within the inconsistent_datasets or if both ADD and MULT are False, then the commondata object will not be modified.

Parameters
  • treatment_names (list) – list of the names of the treatments that should be rescaled. Possible values are: MULT, ADD.

  • names_uncertainties (list) – list of the names of the uncertainties that should be rescaled. Possible values are: CORR, UNCORR, THEORYCORR, THEORYUNCORR, SPECIAL. SPECIAL is used for intra-dataset systematics.

  • sys_rescaling_factor (float, int) –

  • inconsistent_datasets (list) – list of the datasets for which an inconsistency should be introduced

Return type

validphys.inconsistent_ct.InconsistentCommonData

rescale_systematics(treatment_names, names_uncertainties, sys_rescaling_factor)[source]

Rescale the columns of the systematic_errors() that are included in the names_uncertainties list and return the rescaled table.

Parameters
  • treatment_names (list) – list of the names of the treatments that should be rescaled. Possible values are: MULT, ADD.

  • names_uncertainties (list) – list of the names of the uncertainties that should be rescaled. Possible values are: CORR, UNCORR, THEORYCORR, THEORYUNCORR, SPECIAL. SPECIAL is used for intra-dataset systematics.

  • sys_rescaling_factor (float) – factor by which the systematics should be rescaled

Returns

self.systematics_table

Return type

pd.DataFrame

select_systype_table_indices(treatment_names, names_uncertainties)[source]

Is used to get the indices of the systype_table that correspond to the intersection of the treatment_names and names_uncertainties lists.

Parameters
  • treatment_names (list) – list of the names of the treatments that should be selected. Possible values are: MULT, ADD.

  • names_uncertainties (list) – list of the names of the uncertainties that should be selected. Possible values are: CORR, UNCORR, THEORYCORR, THEORYUNCORR, SPECIAL. SPECIAL is used for intra-dataset systematics.

Returns

systype_tab.index

Return type

pd.Index

setname: str
property systematic_errors

Overrides the systematic_errors method of the CommonData class.

This is done in order to allow the systematic_errors to be a property and hence to be able to assign values to it (setter).

systematics_table: DataFrame = None
systype_table: DataFrame

validphys.closuretest.multiclosure module

closuretest/multiclosure.py

Module containing all of the statistical estimators which are averaged across multiple fits or a single replica proxy fit. The actions in this module are used to produce results which are plotted in multiclosure_output.py

class validphys.closuretest.multiclosure.MulticlosureLoader(closure_theories: list, law_theory: ThPredictionsResult, covmat_reps_mean: array)[source]

Bases: object

Stores the basic information for a multiclosure study.

closure_theories

List of validphys.results.ThPredictionsResult objects for each fit.

Type

list

law_theory

ThPredictionsResult object for the underlying law.

Type

validphys.results.ThPredictionsResult

covmat_reps_mean

Covariance matrix of the theory predictions averaged over fits.

Type

np.array

closure_theories: list
covmat_reps_mean: array
law_theory: ThPredictionsResult
class validphys.closuretest.multiclosure.RegularizedMulticlosureLoader(closure_theories: list, law_theory: ThPredictionsResult, covmat_reps_mean: array, pc_basis: array, n_comp: int, reg_covmat_reps_mean: array, sqrt_reg_covmat_reps_mean: array, std_covmat_reps: array)[source]

Bases: MulticlosureLoader

pc_basis

Basis of principal components.

Type

np.array

n_comp

Number of principal components kept after regularisation.

Type

int

reg_covmat_reps_mean

Diagonal, regularised covariance matrix computed from replicas of theory predictions.

Type

np.array

sqrt_reg_covmat_reps_mean

Sqrt of the regularised covariance matrix.

Type

np.array

std_covmat_reps

Square root of diagonal entries of the original covariance matrix.

Type

np.array

n_comp: int
pc_basis: array
reg_covmat_reps_mean: array
sqrt_reg_covmat_reps_mean: array
std_covmat_reps: array
validphys.closuretest.multiclosure.bias_data(regularized_multiclosure_data_loader)[source]

Similar to bias_dataset but for all data.

validphys.closuretest.multiclosure.bias_dataset(regularized_multiclosure_dataset_loader)[source]

Computes the normalized bias for a RegularizedMulticlosureLoader object for a single dataset.

Parameters

regularized_multiclosure_dataset_loader (RegularizedMulticlosureLoader) –

Returns

bias_fits, n_comp

Return type

tuple

validphys.closuretest.multiclosure.compute_normalized_bias(regularized_multiclosure_loader: RegularizedMulticlosureLoader, corrmat: bool = False) array[source]

Compute the normalized bias for a RegularizedMulticlosureLoader object. If corrmat is True, the bias is computed assuming that RegularizedMulticlosureLoader contains the correlation matrix, this is needed when computing the bias for the entire data.

Parameters
Returns

Array of shape len(fits) containing the normalized bias for each fit.

Return type

np.array

validphys.closuretest.multiclosure.eigendecomposition(covmat: array) tuple[source]

Computes the eigendecomposition of a covariance matrix and returns the eigenvalues, eigenvectors and the normalized eigenvalues ordered from largest to smallest.

Parameters

covmat (np.array) – covariance matrix

Returns

Tuple of three elements containing the eigenvalues, eigenvectors and the normalized eigenvalues. Note that the eigenvalues are sorted from largest to smallest.

Return type

tuple
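The sketch below illustrates one way such a sorted eigendecomposition can be written with NumPy. It is not the validphys implementation: the function name is hypothetical, and it assumes that "normalized" means each eigenvalue divided by the sum of all eigenvalues.

```python
import numpy as np


def eigendecomposition_sketch(covmat: np.ndarray) -> tuple:
    """Return eigenvalues, eigenvectors and normalized eigenvalues, largest first."""
    # eigh is meant for symmetric matrices and returns eigenvalues in ascending order
    eigvals, eigvecs = np.linalg.eigh(covmat)
    order = np.argsort(eigvals)[::-1]
    eigvals = eigvals[order]
    eigvecs = eigvecs[:, order]  # eigenvectors are stored as columns
    # Assumption: normalization is by the sum of all eigenvalues
    normalized = eigvals / eigvals.sum()
    return eigvals, eigvecs, normalized


covmat = np.array([[4.0, 0.0], [0.0, 1.0]])
eigvals, eigvecs, normalized = eigendecomposition_sketch(covmat)
```

With this normalization the entries of the third array can be read as the fraction of the total variance carried by each principal direction.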

validphys.closuretest.multiclosure.fits_normed_dataset_central_delta(multiclosure_dataset_loader, _internal_max_reps=None, _internal_min_reps=20)[source]

For each fit, calculate the difference between the central expectation value and the true value. Normalize this value by the variance of the differences between replicas and central expectation value (different for each fit but expected to vary only a little). The central expectation value of each observable is expected to be Gaussian distributed around the true value set by the fakepdf.

Parameters
  • multiclosure_dataset_loader (tuple) – closure fits theory predictions, underlying law theory predictions, covariance matrix, sqrt covariance matrix

  • _internal_max_reps (int) – maximum number of replicas to use for each fit

  • _internal_min_reps (int) – minimum number of replicas to use for each fit

Returns

deltas – 2-D array with shape (n_fits, n_obs)

Return type

np.array

validphys.closuretest.multiclosure.mean_covmat_multiclosure(closure_theories: list) array[source]

Computes the ‘PDF’ covariance matrices obtained from each multiclosure fit and averages over them.

Parameters

closure_theories (list) – list of ThPredictionsResult

Returns

covmat_reps_mean

Return type

np.array
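A rough sketch of the averaging step, assuming each fit is represented by a plain array of theory predictions with shape (n_data, n_replicas). In validphys the inputs are ThPredictionsResult objects, so the input format and the function name here are illustrative only.

```python
import numpy as np


def mean_covmat_sketch(fit_predictions: list) -> np.ndarray:
    """Average the replica covariance matrices of several fits.

    Each entry of fit_predictions is a hypothetical array of shape
    (n_data, n_replicas) holding the theory predictions of one fit.
    """
    # np.cov treats rows as variables, so each matrix has shape (n_data, n_data)
    covmats = [np.cov(preds) for preds in fit_predictions]
    return np.mean(covmats, axis=0)


rng = np.random.default_rng(0)
fits = [rng.normal(size=(3, 50)) for _ in range(4)]
covmat_reps_mean = mean_covmat_sketch(fits)
```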

validphys.closuretest.multiclosure.multiclosure_data_loader(data: DataGroupSpec, fits_pdf: list, multiclosure_underlyinglaw: PDF, t0set: PDF) MulticlosureLoader[source]

Like multiclosure_dataset_loader except for all data

validphys.closuretest.multiclosure.multiclosure_dataset_loader(dataset: DataSetSpec, fits_pdf: list, multiclosure_underlyinglaw: PDF, t0set: PDF) MulticlosureLoader[source]

Internal function for loading multiple theory predictions and underlying law for a given dataset. This function is used to avoid memory issues when caching the load function of a group of datasets.

Parameters
  • dataset ((DataSetSpec, DataGroupSpec)) – dataset for which the theory predictions and t0 covariance matrix will be loaded. Note that due to the structure of validphys this function can be overloaded to accept a DataGroupSpec.

  • fits_pdf (list) – list of PDF objects produced from performing multiple closure tests fits. Each fit should have a different filterseed but the same underlying law used to generate the pseudodata.

  • multiclosure_underlyinglaw (PDF) – PDF used to generate the pseudodata which the closure tests fitted. This is inferred from the fit runcards.

  • t0set (validphys.core.PDF) – t0 pdfset, is only used to check that the underlying law matches the t0set.

Returns

A dataclass storing the theory predictions for the fits and the underlying law.

Return type

MulticlosureLoader

Notes

This function replicates behaviour found elsewhere in validphys. The reason is that, due to the default caching behaviour, one can run into memory issues when loading the theory predictions for the number of fits typically used in these studies.

validphys.closuretest.multiclosure.normalized_delta_bias_data(regularized_multiclosure_data_loader: RegularizedMulticlosureLoader) tuple[source]

Compute for all data only the normalized delta after PCA regularization.

Parameters

regularized_multiclosure_data_loader (tuple) – Tuple containing the results of multiclosure fits after pca regularization

Returns

deltas, n_comp

Return type

tuple

validphys.closuretest.multiclosure.regularized_multiclosure_data_loader(multiclosure_data_loader: MulticlosureLoader, explained_variance_ratio=0.95, _internal_max_reps=None, _internal_min_reps=20)[source]

Similar to multiclosure.regularized_multiclosure_dataset_loader except for all data. In this case we regularize the correlation matrix rather than the covariance matrix, because different experiments can have different units.

Parameters
  • multiclosure_data_loader (MulticlosureLoader) –

  • explained_variance_ratio (float, default is 0.95) –

  • _internal_max_reps (int, default is None) – Maximum number of replicas used in the fits. This is needed to check that the number of replicas is the same for all fits.

  • _internal_min_reps (int, default is 20) – Minimum number of replicas used in the fits. This is needed to check that the number of replicas is the same for all fits.

Return type

RegularizedMulticlosureLoader

validphys.closuretest.multiclosure.regularized_multiclosure_dataset_loader(multiclosure_dataset_loader: MulticlosureLoader, explained_variance_ratio=0.95, _internal_max_reps=None, _internal_min_reps=20) RegularizedMulticlosureLoader[source]

Similar to multiclosure.multiclosure_dataset_loader but computes the regularized PDF covariance matrix by only keeping the largest eigenvalues that sum to the explained_variance_ratio.

Parameters
  • multiclosure_dataset_loader (MulticlosureLoader) –

  • explained_variance_ratio (float, default is 0.95) –

  • _internal_max_reps (int, default is None) – Maximum number of replicas used in the fits. This is needed to check that the number of replicas is the same for all fits.

  • _internal_min_reps (int, default is 20) – Minimum number of replicas used in the fits. This is needed to check that the number of replicas is the same for all fits.

Return type

RegularizedMulticlosureLoader
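To illustrate the idea of keeping only the largest eigenvalues up to a given explained variance ratio, the sketch below counts how many leading eigenvalues are needed. The function name and the exact threshold rule (first cumulative fraction that reaches the ratio) are assumptions and may differ from what validphys does.

```python
import numpy as np


def n_components_for_ratio(covmat: np.ndarray, explained_variance_ratio: float = 0.95) -> int:
    """Smallest number of leading eigenvalues whose normalized sum reaches the ratio."""
    eigvals = np.linalg.eigvalsh(covmat)[::-1]  # descending order
    cumulative = np.cumsum(eigvals / eigvals.sum())
    # Index of the first cumulative value >= ratio, plus one to turn it into a count
    n_comp = int(np.searchsorted(cumulative, explained_variance_ratio)) + 1
    return min(n_comp, len(eigvals))


# Cumulative explained variance here is 0.8, 0.9, 0.96, 1.0
covmat = np.diag([8.0, 1.0, 0.6, 0.4])
n_comp = n_components_for_ratio(covmat, explained_variance_ratio=0.95)
```

In this toy case three components are kept, since two components explain only 90% of the variance.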

validphys.closuretest.multiclosure.xq2_dataset_map(xq2map_with_cuts, multiclosure_dataset_loader, _internal_max_reps=None, _internal_min_reps=20)[source]

For a single dataset and a set of fits, define a dictionary which contains, for each datapoint of the dataset, the following information:

  • x coordinate

  • Q**2 coordinate

  • value of the bias-variance ratio at that point for the given fits

  • value of xi at that point for the given fits

Parameters
  • xq2map_with_cuts (validphys.kinematics.XQ2Map) – contains kinematic information of dataset’s datapoints

  • multiclosure_dataset_loader (tuple) – closure fits theory predictions, underlying law theory predictions, covariance matrix, sqrt covariance matrix

  • _internal_max_reps (int) – maximum number of replicas to use for each fit

  • _internal_min_reps (int) – minimum number of replicas to use for each fit

Returns

xq2map – dictionary containing: x coordinate, Q**2 coordinate, bias-variance ratio, xi

Return type

dictionary

validphys.closuretest.multiclosure_bootstrap module

Module for bootstrapping multiclosure fits.

class validphys.closuretest.multiclosure_bootstrap.BootstrappedTheoryResult(data)[source]

Bases: object

Proxy class which mimics results.ThPredictionsResult so that pre-existing bias/variance actions can be used with bootstrapped replicas

validphys.closuretest.multiclosure_bootstrap.bootstrapped_bias_data(bootstrapped_regularized_multiclosure_data_loader)[source]

Computes Bias and Variance for each bootstrap sample. Returns a DataFrame with the results.

validphys.closuretest.multiclosure_bootstrap.bootstrapped_bias_dataset(bootstrapped_regularized_multiclosure_dataset_loader, dataset)[source]

Computes Bias for each bootstrap sample. Returns a DataFrame with the results.

validphys.closuretest.multiclosure_bootstrap.bootstrapped_indicator_function_data(bootstrapped_normalized_delta_bias_data, nsigma=1)[source]

Compute the indicator function for each bootstrap sample.

Parameters
  • bootstrapped_normalized_delta_bias_data (list) – list containing the normalized deltas and the number of principal components.

  • nsigma (int, default is 1) –

Returns

list

list of length N_boot whose entries are arrays of dimension Npca x Nfits containing the indicator function for each bootstrap sample.

float

average number of degrees of freedom

Return type

tuple of length 2

validphys.closuretest.multiclosure_bootstrap.bootstrapped_multiclosure_data_loader(multiclosure_data_loader: MulticlosureLoader, n_fit_max: int, n_fit: int, n_rep_max: int, n_rep: int, n_boot_multiclosure: int, use_repeats: bool = True)[source]

Like bootstrapped_multiclosure_dataset_loader except for all data.

validphys.closuretest.multiclosure_bootstrap.bootstrapped_multiclosure_dataset_loader(multiclosure_dataset_loader: MulticlosureLoader, n_fit_max: int, n_fit: int, n_rep_max: int, n_rep: int, n_boot_multiclosure: int, use_repeats: bool = True)[source]

Returns a tuple of MulticlosureLoader objects each of which is a bootstrap resample of the original dataset.

Parameters
  • multiclosure_dataset_loader (MulticlosureLoader) –

  • n_fit_max (int) – maximum number of fits, should be smaller than or equal to the number of multiclosure fits

  • n_fit (int) – number of fits to draw for each resample

  • n_rep_max (int) – maximum number of replicas, should be smaller than or equal to the number of replicas in each fit

  • n_rep (int) – number of replicas to draw for each resample

  • n_boot_multiclosure (int) – number of bootstrap resamples to perform

  • rng_seed_mct_boot (int) – seed for random number generator

  • use_repeats (bool, default is True) – whether to allow repeated fits and replicas in each resample

Returns

resampled_multiclosure – tuple of MulticlosureLoader objects each of which is a bootstrap resample of the original dataset

Return type

tuple of shape (n_boot_multiclosure,)
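The resampling described by these parameters can be sketched as drawing fit indices and replica indices for each bootstrap sample. This is only an illustration under assumptions: the function name and the seed argument are hypothetical, and the real action may draw replicas separately for each fit rather than once per resample.

```python
import numpy as np


def draw_bootstrap_indices(n_fit_max, n_fit, n_rep_max, n_rep, n_boot, use_repeats=True, seed=0):
    """Draw fit and replica indices for each bootstrap resample."""
    rng = np.random.default_rng(seed)
    resamples = []
    for _ in range(n_boot):
        # replace=True allows the same fit or replica to appear more than once
        fit_idx = rng.choice(n_fit_max, size=n_fit, replace=use_repeats)
        rep_idx = rng.choice(n_rep_max, size=n_rep, replace=use_repeats)
        resamples.append((fit_idx, rep_idx))
    return tuple(resamples)


resamples = draw_bootstrap_indices(n_fit_max=25, n_fit=20, n_rep_max=100, n_rep=50, n_boot=10)
```

The indices would then be used to slice the stored theory predictions, giving one resampled loader per entry of the returned tuple.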

validphys.closuretest.multiclosure_bootstrap.bootstrapped_normalized_delta_bias_data(bootstrapped_regularized_multiclosure_data_loader)[source]

Compute the normalized deltas for each bootstrap sample. Note: delta is the bias in the diagonal basis.

Parameters

bootstrapped_regularized_multiclosure_data_loader (list) – list of RegularizedMulticlosureLoader objects.

Return type

list

validphys.closuretest.multiclosure_bootstrap.bootstrapped_regularized_multiclosure_data_loader(multiclosure_data_loader: MulticlosureLoader, n_fit_max: int, n_fit: int, n_rep_max: int, n_rep: int, n_boot_multiclosure: int, use_repeats: bool = True, explained_variance_ratio: float = 0.95, _internal_max_reps=None, _internal_min_reps=20) tuple[source]

Same as bootstrapped_regularized_multiclosure_dataset_loader but for all the data.

validphys.closuretest.multiclosure_bootstrap.bootstrapped_regularized_multiclosure_dataset_loader(multiclosure_dataset_loader: MulticlosureLoader, n_fit_max: int, n_fit: int, n_rep_max: int, n_rep: int, n_boot_multiclosure: int, use_repeats: bool = True, explained_variance_ratio: float = 0.95, _internal_max_reps=None, _internal_min_reps=20) tuple[source]

Similar to multiclosure.bootstrapped_multiclosure_dataset_loader but returns PCA regularised covariance matrix, where the covariance matrix has been computed from the replicas of the theory predictions.

Returns a tuple of RegularizedMulticlosureLoader objects.

validphys.closuretest.multiclosure_bootstrap.standard_indicator_function(standard_variable, nsigma=1)[source]

Calculate the indicator function for a standardised variable.

Parameters
  • standard_variable (np.array) – array of variables that have been standardised: (x - mu)/sigma

  • nsigma (float) – number of standard deviations to consider

Returns

array of ones and zeros. If 1 then the variable is within nsigma standard deviations from the mean, otherwise it is 0.

Return type

np.array
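A minimal sketch of such an indicator function is given below. The function name is hypothetical, and the use of a strict inequality at the boundary is an assumption.

```python
import numpy as np


def indicator_sketch(standard_variable: np.ndarray, nsigma: float = 1) -> np.ndarray:
    """Return 1 where |standard_variable| < nsigma and 0 elsewhere."""
    return np.asarray(np.abs(standard_variable) < nsigma, dtype=int)


values = np.array([-2.5, -0.3, 0.0, 0.9, 1.7])
inside = indicator_sketch(values, nsigma=1)
```

For standardised Gaussian variables the mean of the indicator approaches erf(nsigma / sqrt(2)), which is roughly 0.68 for nsigma = 1.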

validphys.closuretest.multiclosure_inconsistent_output module

multiclosure_inconsistent_output

Module containing the actions which produce some output in validphys reports, i.e. figures or tables, for (inconsistent) multiclosure estimators in the space of data.

validphys.closuretest.multiclosure_inconsistent_output.lambdavalues_bootstrapped_table_bias_datasets = <reportengine.resourcebuilder.collect object>

Collects bootstrapped_table_bias_data over multiple lambda values dataspecs.

validphys.closuretest.multiclosure_inconsistent_output.plot_l2_condition_number(each_dataset, internal_multiclosure_data_collected_loader, evr_min=0.9, evr_max=0.995, evr_n=20)[source]

Plot the L2 condition number of the covariance matrix as a function of the explained variance ratio. The plot gives an idea of the stability of the covariance matrix as a function of the explained variance ratio and hence the number of principal components used to reduce the dimensionality.

The ideal explained variance ratio is chosen based on a threshold L2 condition number. In general this threshold number (and the derived explained variance ratio) should be chosen so that

relative error in output (inverse covmat) <= relative error in input (covmat) * condition number

Note that in a closure test the relative error in the covariance matrix is very small and only numerical.

Parameters
  • each_dataset (list) – List of datasets

  • multiclosure_data_loader (list) – list of multiclosure_dataset_loader objects

Yields

fig – Figure object
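As an illustration of how the condition number grows with the number of retained principal components, the sketch below computes it for every truncation of a toy covariance matrix. The function name is hypothetical; the sketch relies on the fact that, for a symmetric positive-definite matrix, the L2 condition number is the ratio of the largest to the smallest eigenvalue.

```python
import numpy as np


def truncated_condition_numbers(covmat: np.ndarray) -> np.ndarray:
    """L2 condition number when only the k leading components are kept, for k = 1..n."""
    eigvals = np.linalg.eigvalsh(covmat)[::-1]  # descending order
    # Entry k-1 is the largest eigenvalue divided by the k-th largest one
    return eigvals[0] / eigvals


covmat = np.diag([100.0, 1.0, 0.01])
cond_per_k = truncated_condition_numbers(covmat)
```

Keeping fewer components lowers the condition number, at the cost of discarding part of the explained variance.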

validphys.closuretest.multiclosure_inconsistent_output.plot_lambdavalues_bias_values(lambdavalues_bootstrapped_table_bias_datasets, lambdavalues, each_dataset)[source]

Plot sqrt of bias and its bootstrap uncertainty as a function of lambda for each dataset.

Parameters
  • lambdavalues_bootstrapped_table_bias_datasets (list) – list of data frames computed as per table_bias_datasets.

  • lambdavalues (list) – list specified in multiclosure_analysis.yaml

  • each_dataset (list) – list of datasets

Yields

figure

validphys.closuretest.multiclosure_inconsistent_output.plot_lambdavalues_bias_values_full_data(lambdavalues_bootstrapped_table_bias_data, lambdavalues)[source]

Plot sqrt of bias and its bootstrap uncertainty as a function of lambda for the full dataset.

Parameters
  • lambdavalues_bootstrapped_table_bias_data (list) – list of data frames computed as per table_bias_data.

  • lambdavalues (list) – list specified in multiclosure_analysis.yaml

Return type

figure

validphys.closuretest.multiclosure_nsigma module

This module contains the functions used in Sec. 4 of the paper arXiv:2503.17447.

set_1, set_2, and set_3 correspond to (S_1), (S_2), and (S_3) in Eq. 4.3, 4.6, and 4.7.

class validphys.closuretest.multiclosure_nsigma.MulticlosureNsigma(nsigma_table: DataFrame, is_weighted: bool)[source]

Bases: object

Dataclass containing nsigma values for all datasets and fits, also used to keep track of whether the multiclosure fit is weighted or not.

nsigma_table

A table containing n_sigma values.

Type

pd.DataFrame

is_weighted

Whether the fit was weighted.

Type

bool

is_weighted: bool
nsigma_table: DataFrame
class validphys.closuretest.multiclosure_nsigma.NsigmaAlpha(alpha_dict: dict, is_weighted: bool)[source]

Bases: object

Dataclass storing the set 1 values (can be used both for the set 1 and its complement).

alpha_dict

A dictionary containing the set 1 alpha values.

Type

dict

is_weighted

Whether the fit was weighted.

Type

bool

alpha_dict: dict
is_weighted: bool
validphys.closuretest.multiclosure_nsigma.Z_ALPHA_RANGE = array([       inf, 2.57235211, 2.32257453, 2.16610675, 2.04959427,        1.95566144, 1.87635856, 1.8073542 , 1.74601652, 1.69062163,        1.63997627, 1.59321882, 1.5497059 , 1.50894386, 1.47054524,        1.43420016, 1.39965665, 1.36670697, 1.33517774, 1.30492264,        1.27581704, 1.24775386, 1.22064035, 1.19439566, 1.16894884,        1.14423727, 1.12020535, 1.09680356, 1.0739875 , 1.05171725,        1.02995676, 1.00867336, 0.98783733, 0.96742157, 0.94740127,        0.92775369, 0.90845787, 0.88949451, 0.87084575, 0.85249503,        0.83442701, 0.81662736, 0.79908276, 0.78178075, 0.76470967,        0.74785859, 0.73121725, 0.71477599, 0.69852571, 0.68245784,        0.66656426, 0.65083731, 0.63526971, 0.61985457, 0.60458535,        0.5894558 , 0.57445999, 0.55959227, 0.54484724, 0.53021973,        0.51570479, 0.50129771, 0.48699394, 0.47278912, 0.45867907,        0.44465976, 0.4307273 , 0.41687796, 0.40310812, 0.3894143 ,        0.37579311, 0.3622413 , 0.3487557 , 0.33533322, 0.32197089,        0.30866581, 0.29541514, 0.28221615, 0.26906614, 0.25596249,        0.24290266, 0.22988412, 0.21690443, 0.20396118, 0.19105201,        0.1781746 , 0.16532667, 0.15250597, 0.1397103 , 0.12693746,        0.11418529, 0.10145167, 0.08873448, 0.07603162, 0.06334102,        0.05066062, 0.03798835, 0.02532218, 0.01266008, 0.        ])

Quantile range for computing the true positive rate and true negative rate.
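The values listed above are consistent with upper quantiles of the standard normal distribution, Z_alpha = Phi^{-1}(1 - alpha), evaluated on 100 evenly spaced values of alpha between 0 and 0.5. That reading is an inference from the numbers, not something stated in the source. Under that assumption, a standard-library sketch that reproduces them is:

```python
from statistics import NormalDist

# Assumption: alpha runs over 100 evenly spaced values between 0 and 0.5
n_points = 100
alphas = [0.5 * i / (n_points - 1) for i in range(n_points)]

standard_normal = NormalDist()
# inv_cdf is undefined at p = 1, so alpha = 0 is mapped to infinity by hand
z_alpha = [float("inf") if a == 0 else standard_normal.inv_cdf(1 - a) for a in alphas]
```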

validphys.closuretest.multiclosure_nsigma.comp_nsigma_alpha(multiclosurefits_nsigma: DataFrame, weighted_dataset: str) NsigmaAlpha[source]

Computes the complement set 1 alpha values.

validphys.closuretest.multiclosure_nsigma.comp_set_1(dataspecs_comp_nsigma_alpha: list) dict[source]

Returns the complement set 1 alpha values.

validphys.closuretest.multiclosure_nsigma.dataspecs_comp_nsigma_alpha = <reportengine.resourcebuilder.collect object>

Collect complement set 1 alpha over dataspecs.

validphys.closuretest.multiclosure_nsigma.dataspecs_multiclosurefits_nsigma = <reportengine.resourcebuilder.collect object>

Collect the multiclosurefits_nsigma over dataspecs.

validphys.closuretest.multiclosure_nsigma.dataspecs_nsigma_alpha = <reportengine.resourcebuilder.collect object>

Collect set 1 alpha over dataspecs.

validphys.closuretest.multiclosure_nsigma.def_of_nsigma_alpha(multiclosurefits_nsigma: DataFrame, weighted_dataset: str, complement: bool = False) NsigmaAlpha[source]

Defines how the set 1 alpha values are computed. It allows computing both set 1 and its complement.

Parameters
  • multiclosurefits_nsigma (pd.DataFrame) – The nsigma table.

  • weighted_dataset (str) – The name of the weighted dataset.

  • complement (bool, default=False) – Whether to compute the complement set 1 alpha values.

Return type

NsigmaAlpha

validphys.closuretest.multiclosure_nsigma.def_set_3(dataspecs_multiclosurefits_nsigma: list, weighted_dataset: str, complement: bool = False) dict[source]

Defines how the set 3 values are computed. It allows computing both set 3 and its complement.

Parameters
  • dataspecs_multiclosurefits_nsigma (list) – List of MulticlosureNsigma dataclasses.

  • weighted_dataset (str) – The name of the weighted dataset.

  • complement (bool, default=False) – Whether to compute the complement set 3 alpha values.

Return type

dict

validphys.closuretest.multiclosure_nsigma.multiclosurefits_nsigma(fits: NSList, fits_data: list, fits_datasets_chi2_nsigma_deviation: list, is_weighted: bool) MulticlosureNsigma[source]

Returns a table (dataframe) containing n_sigma values. Index: dataset names, Columns: Level 1 seeds (filterseed).

Parameters
  • fits (NSList) – List of fits.

  • fits_data (list) – List of data for each fit.

  • fits_datasets_chi2_nsigma_deviation (list) – List of n_sigma values for each dataset for each fit.

  • is_weighted (bool) – Used to keep track of whether the fit was weighted.

Return type

MulticlosureNsigma

validphys.closuretest.multiclosure_nsigma.nsigma_alpha(multiclosurefits_nsigma: DataFrame, weighted_dataset: str) NsigmaAlpha[source]

Computes the set 1 alpha values.

validphys.closuretest.multiclosure_nsigma.probability_inconsistent(set_1, set_2, set_3, comp_set_1, n_fits, weighted_dataset)[source]

The set of inconsistent fits can be defined in different ways, two possible cases are:

  1. C_2 = (S_1 intersect S_2) union (S_3)

  2. C_3 = S_1 union (~S_1 intersect S_3)

The probability of a dataset being inconsistent is defined as:

P(inconsistent) = |I_alpha| / N

where N is the total number of fits.
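The two definitions can be sketched with plain Python sets of fit indices for a single value of alpha. The function name and the rule argument are hypothetical; in validphys the sets are dictionaries, presumably holding one such set per alpha, so this is a simplified illustration.

```python
def probability_inconsistent_sketch(set_1: set, set_2: set, set_3: set, n_fits: int, rule: str = "C_2") -> float:
    """Fraction of fits flagged as inconsistent under rule C_2 or C_3."""
    all_fits = set(range(n_fits))
    if rule == "C_2":
        flagged = (set_1 & set_2) | set_3
    elif rule == "C_3":
        comp_set_1 = all_fits - set_1  # complement of S_1 among all fits
        flagged = set_1 | (comp_set_1 & set_3)
    else:
        raise ValueError(f"unknown rule: {rule}")
    return len(flagged) / n_fits


s1, s2, s3 = {0, 1, 2}, {1, 2, 5}, {7}
p_c2 = probability_inconsistent_sketch(s1, s2, s3, n_fits=10, rule="C_2")
p_c3 = probability_inconsistent_sketch(s1, s2, s3, n_fits=10, rule="C_3")
```

In this toy example C_2 flags fits 1, 2 and 7, while C_3 flags fits 0, 1, 2 and 7.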

validphys.closuretest.multiclosure_nsigma.set_1(dataspecs_nsigma_alpha: list) dict[source]

Returns the set 1 alpha values, these are defined as

S_1 = {j | n_{sigma}^{j} > Z_{alpha}}

where j is the index of the fit and n_{sigma}^{j} is the n-sigma value computed for fit j.

Parameters

dataspecs_nsigma_alpha (list) – List of NsigmaAlpha dataclasses.

Return type

dict

validphys.closuretest.multiclosure_nsigma.set_2(dataspecs_nsigma_alpha: list) dict[source]

Same as the set 1 alpha values, but for the weighted fits.

S_2 = {i | n_{weighted, sigma}^{i} > Z_{alpha}}

where i is the index of the fit and n_{weighted, sigma}^{i} is the n-sigma value computed on the weighted dataset for fit i.

Parameters

dataspecs_nsigma_alpha (list) – List of NsigmaAlpha dataclasses.

Return type

dict

validphys.closuretest.multiclosure_nsigma.set_3(dataspecs_multiclosurefits_nsigma: list, weighted_dataset: str) dict[source]

Computes the set 3 alpha values. The set 3 is defined as:

S_3 = {i | n_{weighted, sigma}^{i} - n_{ref, sigma}^{i}> + Z_{alpha}}

where the n-sigma is computed on all datasets that are not the weighted dataset. Moreover, if for a fit i any dataset has an n-sigma value greater than Z_{alpha}, then the fit i is included in the set.

validphys.closuretest.multiclosure_nsigma_helpers module

This module contains some helper functions that are used for the computation of nsigma in the context of a multi-closure test.

class validphys.closuretest.multiclosure_nsigma_helpers.CentralChi2Data(value: float, ndata: int, dataset: validphys.core.DataSetSpec)[source]

Bases: object

dataset: DataSetSpec
ndata: int
property reduced
value: float
validphys.closuretest.multiclosure_nsigma_helpers.central_member_chi2(central_predictions: DataFrame, sqrt_covmat: ndarray, dataset: DataSetSpec, loaded_commondata_with_cuts: CommonData) CentralChi2Data[source]

Computes the chi2 value for a dataset.

Parameters
  • central_predictions – The central predictions for the dataset.

  • sqrt_covmat (np.ndarray) – The square root of the covariance matrix.

  • dataset (DataSetSpec) – The dataset.

  • loaded_commondata_with_cuts (nnpdf_data.coredata.CommonData) –

Return type

CentralChi2Data
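
A chi2 can be evaluated from the square root of the covariance matrix without forming an explicit inverse. The sketch below assumes sqrt_covmat is a lower triangular factor L with C = L L^T (for example from a Cholesky decomposition); the function name and the toy numbers are illustrative and not part of validphys.

```python
import numpy as np


def chi2_from_sqrt_covmat(theory, data, sqrt_covmat):
    """chi2 = d^T C^{-1} d with C = L L^T and d = theory - data."""
    diff = np.asarray(theory, dtype=float) - np.asarray(data, dtype=float)
    # Solve L y = d, so that chi2 = y . y, avoiding an explicit inverse of C.
    y = np.linalg.solve(sqrt_covmat, diff)
    return float(y @ y)


covmat = np.array([[4.0, 1.0], [1.0, 9.0]])
sqrt_covmat = np.linalg.cholesky(covmat)  # lower triangular factor
chi2 = chi2_from_sqrt_covmat([1.0, 2.0], [0.0, 0.0], sqrt_covmat)
```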

validphys.closuretest.multiclosure_nsigma_helpers.chi2_nsigma_deviation(central_member_chi2: CentralChi2Data) float[source]

Computes n_sigma as: (chi2 - ndata) / sqrt(2 * ndata)

Parameters

central_member_chi2 (CentralChi2Data) –

Returns

The deviation in units of sigma.

Return type

float
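
The formula follows from the fact that a chi2 distribution with ndata degrees of freedom has mean ndata and variance 2 * ndata. A minimal sketch, using a hypothetical stand-in for CentralChi2Data (the dataset field is omitted, and reading `reduced` as chi2 per data point is an assumption):

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Chi2Sketch:
    """Hypothetical stand-in for CentralChi2Data, without the dataset field."""

    value: float
    ndata: int

    @property
    def reduced(self) -> float:
        # Assumption: "reduced" means chi2 per data point.
        return self.value / self.ndata


def nsigma_deviation(chi2: Chi2Sketch) -> float:
    """(chi2 - ndata) / sqrt(2 * ndata)."""
    return (chi2.value - chi2.ndata) / math.sqrt(2 * chi2.ndata)


example = Chi2Sketch(value=120.0, ndata=100)
deviation = nsigma_deviation(example)
```

Here chi2 = 120 on 100 points sits 20 / sqrt(200) standard deviations above the expected value.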

validphys.closuretest.multiclosure_nsigma_helpers.datasets_chi2_nsigma_deviation = <reportengine.resourcebuilder.collect object>

Collect the n_sigma values over list of dataset_input.

validphys.closuretest.multiclosure_nsigma_helpers.fits_data = <reportengine.resourcebuilder.collect object>

Collects the data for each fit.

validphys.closuretest.multiclosure_nsigma_helpers.fits_datasets_chi2_nsigma_deviation = <reportengine.resourcebuilder.collect object>

Collects over fits and for all datasets the n_sigma values.

validphys.closuretest.multiclosure_nsigma_helpers.is_weighted(fits_data: list) bool[source]

Returns whether the considered multiclosure test has been weighted. If the weighted datasets are not the same for all fits, or there is more than one weighted dataset, an error is raised.

Parameters

fits_data (list) – List of data for each fit.

Return type

bool

validphys.closuretest.multiclosure_nsigma_helpers.n_fits(dataspecs)[source]

Computes the total number of fits in the multiclosure test. If the number of fits is not the same across dataspecs it raises an error.

validphys.closuretest.multiclosure_nsigma_output module

Module for plotting the results of the multiclosure_nsigma.py script.

Can be used to reproduce the plots in Sec. 4 of arXiv:2503.17447.

validphys.closuretest.multiclosure_nsigma_output.plot_1_minus_all_sets(set_1, set_3, set_2, n_fits)[source]

Plots complement of S_1, S_2 and S_3.

validphys.closuretest.multiclosure_nsigma_output.plot_all_sets(set_1, set_3, set_2, n_fits)[source]

Plots S_1, S_2 and S_3.

validphys.closuretest.multiclosure_nsigma_output.plot_probability_consistent(probability_inconsistent, comp_set_1, weighted_dataset, n_fits)[source]

Plots the probability of a dataset being flagged as consistent.

validphys.closuretest.multiclosure_nsigma_output.plot_probability_inconsistent(probability_inconsistent, set_1, weighted_dataset, n_fits)[source]

The set of inconsistent fits:

  1. C_1 = S_1

  2. C_2 = (S_1 intersect S_3) union (S_2)

  3. C_3 = S_1 union (~S_1 intersect S_3)

The probability of a dataset being inconsistent is defined as:

P(inconsistent) = |I_alpha| / N

where N is the total number of fits.

validphys.closuretest.multiclosure_output module

multiclosure_output

Module containing the actions which produce output in validphys reports, i.e. figures or tables, for multiclosure estimators in the space of data.

validphys.closuretest.multiclosure_output.bootstrapped_table_bias_data(bootstrapped_bias_data)[source]

Compute the bias, sqrt bias and their bootstrap errors for a DataGroup and return a DataFrame with the results.

validphys.closuretest.multiclosure_output.bootstrapped_table_bias_datasets(bootstrapped_bias_datasets)[source]

Compute the bias, variance, ratio and sqrt(ratio) for each dataset and return a DataFrame with the results. Uncertainty on ratio and sqrt ratio is computed by Gaussian error propagation of the bootstrap uncertainty on bias and variance.
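
The Gaussian error propagation mentioned above can be sketched as follows. This version assumes the bootstrap uncertainties on bias and variance are uncorrelated, which is a simplification made here rather than something the docstring states; the function name is hypothetical.

```python
import math


def ratio_with_error(bias, bias_err, variance, variance_err):
    """Propagate uncertainties to R = bias / variance and to sqrt(R)."""
    ratio = bias / variance
    # Relative errors add in quadrature for a quotient of uncorrelated quantities.
    ratio_err = ratio * math.sqrt(
        (bias_err / bias) ** 2 + (variance_err / variance) ** 2
    )
    sqrt_ratio = math.sqrt(ratio)
    # d sqrt(R) / dR = 1 / (2 sqrt(R))
    sqrt_ratio_err = ratio_err / (2.0 * sqrt_ratio)
    return ratio, ratio_err, sqrt_ratio, sqrt_ratio_err


ratio, ratio_err, sqrt_ratio, sqrt_ratio_err = ratio_with_error(4.0, 0.4, 1.0, 0.1)
```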

validphys.closuretest.multiclosure_output.plot_xq2_data_prcs_maps(xq2_data_map, each_dataset)[source]

Heat map of the bias-variance ratio and the xi quantile estimator for each datapoint in each dataset.

Parameters
  • xq2_data_map (dictionary) – dictionary containing:

    • x coordinate

    • Q**2 coordinate

    • Ratio bias-variance

    • xi

  • each_dataset (list) –

Yields

figure

validphys.closuretest.multiclosure_output.table_bias_data(bias_data)[source]

Same as table_bias_datasets but for all the data, meaning that the correlations between the datasets are taken into account.

Parameters

bias_data (list) – Same as bias_dataset but for all the data

Returns

DataFrame containing the bias, variance, ratio and sqrt(ratio) for each dataset

Return type

pd.DataFrame

validphys.closuretest.multiclosure_output.table_bias_datasets(bias_datasets, each_dataset)[source]

Compute the bias and sqrt bias and associated errors for each dataset and return a DataFrame with the results.

Parameters
  • bias_datasets (list) – List of tuples containing the values of bias for each dataset.

  • each_dataset (list) – List of validphys.core.DataSetSpec

Returns

DataFrame containing the bias, variance, ratio and sqrt(ratio) for each dataset

Return type

pd.DataFrame

validphys.closuretest.multiclosure_output.table_xi_indicator_function_data(bootstrapped_indicator_function_data)[source]

Computes the bootstrap average and std of the indicator function for the data.

Parameters

bootstrapped_indicator_function_data (tuple) –

Returns

DataFrame containing the average and std of the indicator function for the data.

Return type

pd.DataFrame

validphys.closuretest.multiclosure_output.xi_delta_histogram(normalized_delta_bias_data, title, lambda_value, label_hist=None)[source]

Plot histogram of normalized delta regularized with PCA.

Parameters
  • normalized_delta_bias_data (tuple) –

  • label_hist (str) – summary description of multiclosure

Returns

Figure object

Return type

fig

validphys.closuretest.multiclosure_pdf module

multiclosure_pdf.py

Module containing all of the actions related to statistical estimators across multiple closure fits or proxy fits defined in PDF space. The actions in this module are used to produce results which are plotted in multiclosure_pdf_output.py

validphys.closuretest.multiclosure_pdf.bootstrap_pdf_differences(fits_xi_grid_values, underlying_xi_grid_values, multiclosure_underlyinglaw, rng)[source]

Generate a single bootstrap sample of pdf_central_difference and pdf_replica_difference given the multiclosure fits grid values (fits_xi_grid_values), the underlying law grid values, the underlying law, and a numpy random state which is used to generate random indices for the bootstrap sample. The bootstrap includes repeats and has the same number of fits and replicas as the original fits_xi_grid_values which is being resampled.

Returns

pdf_difference – a tuple of 2 lists: the central differences and the replica differences. Each list is n_fits long and each element is a resampled differences array for a randomly selected fit and randomly selected replicas.

Return type

tuple
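
The resampling with repeats described above can be sketched as below. The sketch assumes every fit has the same number of replicas, so that the grid values stack into one array of shape (n_fits, n_replicas, flavour, x); the function name is hypothetical and the seed simply mirrors the boot_seed default that appears elsewhere in this module.

```python
import numpy as np


def resample_grid_values(fits_grid_values, rng):
    """Resample fits, then replicas within each selected fit, with repeats."""
    n_fits, n_replicas = fits_grid_values.shape[:2]
    # Draw n_fits fit indices with replacement.
    fit_idx = rng.choice(n_fits, size=n_fits, replace=True)
    # For each selected fit, draw n_replicas replica indices with replacement.
    rep_idx = rng.choice(n_replicas, size=(n_fits, n_replicas), replace=True)
    return np.stack(
        [fits_grid_values[f][r] for f, r in zip(fit_idx, rep_idx)]
    )


rng = np.random.RandomState(1234)
fits_grid_values = rng.normal(size=(5, 20, 3, 4))  # toy (fits, replicas, flavour, x)
resampled = resample_grid_values(fits_grid_values, rng)
```

The central and replica differences would then be recomputed on each resampled fit.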

validphys.closuretest.multiclosure_pdf.fits_bootstrap_pdf_expected_xi(fits_bootstrap_pdf_sqrt_ratio)[source]

Using fits_bootstrap_pdf_sqrt_ratio calculate a bootstrap of the expected xi using the same procedure as in validphys.closuretest.multiclosure_output.expected_xi_from_bias_variance().

validphys.closuretest.multiclosure_pdf.fits_bootstrap_pdf_ratio(fits_xi_grid_values, underlying_xi_grid_values, multiclosure_underlyinglaw, multiclosure_nx=4, n_boot=100, boot_seed=1234)[source]

Perform a bootstrap sampling across fits and replicas of the sqrt ratio, by flavour and total, and then tabulate the mean and error.

validphys.closuretest.multiclosure_pdf.fits_bootstrap_pdf_sqrt_ratio(fits_bootstrap_pdf_ratio)[source]

Take the square root of fits_bootstrap_pdf_ratio

validphys.closuretest.multiclosure_pdf.fits_correlation_matrix_totalpdf(fits_covariance_matrix_totalpdf)[source]

Given the fits_covariance_matrix_totalpdf, returns the corresponding correlation matrix
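
Converting a covariance matrix to a correlation matrix amounts to dividing each element by the product of the corresponding standard deviations. A minimal sketch with a hypothetical function name and a toy matrix:

```python
import numpy as np


def correlation_from_covariance(covmat):
    """corr_ij = cov_ij / (sigma_i * sigma_j), with sigma_i = sqrt(cov_ii)."""
    covmat = np.asarray(covmat, dtype=float)
    sigma = np.sqrt(np.diag(covmat))
    return covmat / np.outer(sigma, sigma)


corr = correlation_from_covariance([[4.0, 1.0], [1.0, 9.0]])
```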

validphys.closuretest.multiclosure_pdf.fits_covariance_matrix_by_flavour(fits_replica_difference)[source]

Given a set of PDF grids from multiple closure tests, obtain an estimate of the covariance matrix for each flavour separately, return as a list of covmats

validphys.closuretest.multiclosure_pdf.fits_covariance_matrix_totalpdf(fits_replica_difference, multiclosure_nx=4)[source]

Given a set of PDF grids from multiple closure tests, obtain an estimate of the covariance matrix allowing for correlations across flavours

validphys.closuretest.multiclosure_pdf.fits_pdf_flavour_ratio(fits_sqrt_covmat_by_flavour, fits_central_difference, fits_replica_difference)[source]

Calculate the bias (chi2 between central PDF and underlying PDF) for each flavour and the variance (mean chi2 between replica and central PDF), then return a numpy array with shape (flavours, 2) with second axis being bias, variance

validphys.closuretest.multiclosure_pdf.fits_pdf_total_ratio(fits_central_difference, fits_replica_difference, fits_covariance_matrix_totalpdf, multiclosure_nx=4)[source]

Calculate the total bias and variance for all flavours and x allowing for correlations across flavour.

Returns:

ratio_data: tuple

required data for calculating mean(bias) over mean(variance) across fits in form of tuple (bias, variance)

validphys.closuretest.multiclosure_pdf.fits_sqrt_covmat_by_flavour(fits_covariance_matrix_by_flavour)[source]

For each flavour covariance matrix calculate the sqrt covmat (cholesky lower triangular)

validphys.closuretest.multiclosure_pdf.internal_nonsinglet_xgrid(multiclosure_nx=4)[source]

Given the number of x points, set up the xgrid for flavours which are not singlet or gluon, defined as being linearly spaced points between 0.1 and 0.5

validphys.closuretest.multiclosure_pdf.internal_singlet_gluon_xgrid(multiclosure_nx=4)[source]

Given the number of x points, set up the singlet and gluon xgrids, which are defined as half the points being logarithmically spaced between 10^-3 and 0.1 and the other half of the points being linearly spaced between 0.1 and 0.5
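
The two x-grids described above could be built along these lines. This is a sketch under stated assumptions: the function names are hypothetical, and excluding x = 0.1 from the logarithmic half (so that it is not duplicated) is a choice made here, since the endpoint handling is not specified above.

```python
import numpy as np


def nonsinglet_xgrid_sketch(nx=4):
    """Linearly spaced points between 0.1 and 0.5."""
    return np.linspace(0.1, 0.5, nx)


def singlet_gluon_xgrid_sketch(nx=4):
    """Half the points log spaced from 1e-3 up to 0.1, half linear in [0.1, 0.5]."""
    n_log = nx // 2
    # endpoint=False is an assumption, so that x = 0.1 appears only once.
    log_part = np.logspace(-3, -1, n_log, endpoint=False)
    lin_part = np.linspace(0.1, 0.5, nx - n_log)
    return np.concatenate([log_part, lin_part])


singlet_gluon_grid = singlet_gluon_xgrid_sketch(4)
nonsinglet_grid = nonsinglet_xgrid_sketch(4)
```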

validphys.closuretest.multiclosure_pdf.pdf_central_difference(xi_grid_values, underlying_xi_grid_values, multiclosure_underlyinglaw)[source]

Calculate the difference between the underlying law and the central PDF, specifically:

underlying_grid - mean(grid_vals)

where mean is across replicas.

Returns:

diffs: np.array

array of diffs with shape (flavour, x)

validphys.closuretest.multiclosure_pdf.pdf_replica_difference(xi_grid_values)[source]

Calculate the difference between the central PDF and the replica PDFs, specifically:

mean(grid_vals) - grid_vals

where the mean is across replicas.

Returns:

diffs: np.array

array of diffs with shape (replicas, flavour, x)
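
The differences computed by pdf_central_difference and pdf_replica_difference can be sketched with NumPy broadcasting. The sketch assumes grid_vals has shape (replicas, flavour, x) and that the underlying law grid has shape (flavour, x); the function names and random toy data are illustrative only.

```python
import numpy as np


def central_difference_sketch(grid_vals, underlying_grid):
    """underlying_grid - mean(grid_vals), with the mean taken over replicas."""
    return underlying_grid - grid_vals.mean(axis=0)


def replica_difference_sketch(grid_vals):
    """mean(grid_vals) - grid_vals, broadcast over the replica axis."""
    return grid_vals.mean(axis=0) - grid_vals


rng = np.random.default_rng(0)
grid_vals = rng.normal(size=(50, 3, 4))  # (replicas, flavour, x)
underlying_grid = np.zeros((3, 4))       # (flavour, x)

central_diff = central_difference_sketch(grid_vals, underlying_grid)
replica_diff = replica_difference_sketch(grid_vals)
```

By construction the replica differences average to zero over the replica axis.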

validphys.closuretest.multiclosure_pdf.replica_and_central_diff_totalpdf(fits_replica_difference, fits_central_difference, fits_covariance_matrix_totalpdf, multiclosure_nx=4, use_x_basis=False)[source]

Calculate sigma and delta, like xi_flavour_x() but return before calculating xi.

validphys.closuretest.multiclosure_pdf.underlying_xi_grid_values(multiclosure_underlyinglaw: ~validphys.core.PDF, Q: (<class 'float'>, <class 'int'>), internal_singlet_gluon_xgrid, internal_nonsinglet_xgrid)[source]

Like xi_pdfgrids but setting the PDF as the underlying law, extracted from a set of fits

validphys.closuretest.multiclosure_pdf.xi_flavour_x(fits_replica_difference, fits_central_difference, fits_covariance_matrix_by_flavour, use_x_basis=False)[source]

For a set of fits calculate the indicator function

I_{[-sigma, sigma]}(delta)

where sigma is the RMS difference between central and replicas PDF and delta is the difference between central PDF and underlying law.

The differences are all rotated to the basis which diagonalises the covariance matrix that was estimated from the superset of all fit replicas.

Finally take the mean across fits to get xi in flavour and x.
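
The steps above can be sketched for a single flavour as follows. This is an illustration under stated assumptions, not the validphys implementation: the array shapes, the function name, and the toy Gaussian data are all chosen here for the example.

```python
import numpy as np


def xi_sketch(central_diffs, replica_diffs):
    """
    central_diffs: (n_fits, n_x), underlying law minus central PDF.
    replica_diffs: (n_fits, n_rep, n_x), central PDF minus replicas.
    Returns xi with shape (n_x,), one value per eigen-direction.
    """
    n_x = central_diffs.shape[-1]
    # Covariance estimated from the superset of all replicas of all fits.
    covmat = np.cov(replica_diffs.reshape(-1, n_x), rowvar=False)
    # Columns of eigvecs are the directions which diagonalise covmat.
    _, eigvecs = np.linalg.eigh(covmat)
    delta = central_diffs @ eigvecs  # (n_fits, n_x)
    # RMS of the rotated replica differences, per fit and direction.
    sigma = np.sqrt(((replica_diffs @ eigvecs) ** 2).mean(axis=1))
    indicator = np.abs(delta) <= sigma  # I_{[-sigma, sigma]}(delta)
    return indicator.mean(axis=0)      # mean across fits


rng = np.random.default_rng(0)
n_fits, n_rep, n_x = 400, 100, 4
replicas = rng.normal(size=(n_fits, n_rep, n_x))
replica_diffs = replicas.mean(axis=1, keepdims=True) - replicas
central_diffs = rng.normal(size=(n_fits, n_x))
xi = xi_sketch(central_diffs, replica_diffs)
```

In this toy setup the central differences are drawn with the same width as the replicas, so xi should come out close to the Gaussian 1-sigma fraction of roughly 0.68.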

validphys.closuretest.multiclosure_pdf.xi_grid_values(xi_pdfgrids)[source]

Grid values from the xi_pdfgrids concatenated as single numpy array

validphys.closuretest.multiclosure_pdf.xi_pdfgrids(pdf: ~validphys.core.PDF, Q: (<class 'float'>, <class 'int'>), internal_singlet_gluon_xgrid, internal_nonsinglet_xgrid)[source]

Generate PDF grids which are required for calculating xi in PDF space in the NN31IC basis, excluding the charm. We want to specify different xgrids for different flavours to avoid sampling PDFs in deep extrapolation regions. The limits are chosen to achieve this and specifically they are chosen to be:

gluon and singlet: 10^-3 < x < 0.5

other non-singlets: 0.1 < x < 0.5

Returns

tuple of xplotting_grids, one for gluon and singlet and one for other non-singlets

validphys.closuretest.multiclosure_pdf.xi_totalpdf(replica_and_central_diff_totalpdf)[source]

Like xi_flavour_x() except calculate the total xi across flavours and x accounting for correlations

validphys.closuretest.multiclosure_pdf_output module

multiclosure_pdf_output.py

Module containing all of the plots and tables for multiclosure estimators in PDF space.

validphys.closuretest.multiclosure_pdf_output.fits_bootstrap_pdf_compare_xi_to_expected(fits_bootstrap_pdf_expected_xi_table, fits_bootstrap_pdf_xi_table)[source]

Table comparing the mean and standard deviation across bootstrap samples of the measured value of xi to the value calculated from bias/variance in PDF space. This is done for each flavour and for the total across all flavours accounting for correlations.

validphys.closuretest.multiclosure_pdf_output.fits_bootstrap_pdf_expected_xi_table(fits_bootstrap_pdf_expected_xi)[source]

Tabulate the mean and standard deviation across bootstrap samples of fits_bootstrap_pdf_expected_xi() with a row for each flavour and the total expected xi.

validphys.closuretest.multiclosure_pdf_output.fits_bootstrap_pdf_sqrt_ratio_table(fits_bootstrap_pdf_sqrt_ratio)[source]

Tabulate the mean and standard deviation across bootstrap samples of the sqrt ratio of bias/variance in PDF space, with a row for each flavour and the total. For more information on the bootstrap sampling see fits_bootstrap_pdf_ratio().

validphys.closuretest.multiclosure_pdf_output.fits_bootstrap_pdf_xi_table(fits_xi_grid_values, underlying_xi_grid_values, multiclosure_underlyinglaw, multiclosure_nx=4, n_boot=100, boot_seed=1234, use_x_basis=False)[source]

Perform a bootstrap sampling across fits and replicas of xi, by flavour and total and then tabulate the mean and error.

validphys.closuretest.multiclosure_pdf_output.fits_pdf_bias_variance_ratio(fits_pdf_flavour_ratio, fits_pdf_total_ratio)[source]

Returns a table with the values of mean bias / mean variance with mean referring to mean across fits, by flavour. Includes total across all flavours allowing for correlations.

validphys.closuretest.multiclosure_pdf_output.fits_pdf_compare_xi_to_expected(fits_pdf_expected_xi_from_ratio, xi_flavour_table)[source]

Two-column table comparing the measured value of xi for each flavour to the value calculated from the bias/variance.

validphys.closuretest.multiclosure_pdf_output.fits_pdf_expected_xi_from_ratio(fits_pdf_sqrt_ratio)[source]

Like validphys.closuretest.multiclosure_output.expected_xi_from_bias_variance() but in PDF space. An estimate is made of the integral across the central difference distribution, with domain defined by the replica distribution. For more details see validphys.closuretest.multiclosure_output.expected_xi_from_bias_variance().
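
For intuition, the integral has a closed form if one assumes the central differences follow a zero-mean Gaussian of width sqrt(bias) and the integration domain is plus or minus sqrt(variance). That Gaussian assumption and the function name below are introduced here for illustration; the estimate actually made by the action may differ.

```python
import math


def expected_xi_sketch(sqrt_ratio: float) -> float:
    """
    Fraction of a zero-mean Gaussian of width sqrt(bias) that lies within
    +-sqrt(variance), written in terms of sqrt_ratio = sqrt(bias / variance).
    """
    return math.erf(1.0 / (math.sqrt(2.0) * sqrt_ratio))


xi_faithful = expected_xi_sketch(1.0)        # bias equal to variance
xi_underestimated = expected_xi_sketch(2.0)  # bias four times the variance
```

Under this assumption a sqrt ratio of 1 gives the usual 1-sigma fraction, and a larger sqrt ratio (underestimated uncertainties) gives a smaller expected xi.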

validphys.closuretest.multiclosure_pdf_output.fits_pdf_sqrt_ratio(fits_pdf_bias_variance_ratio)[source]

Like fits_pdf_bias_variance_ratio() except taking the sqrt. This is to see how faithful our uncertainty is in units of the standard deviation.

validphys.closuretest.multiclosure_pdf_output.plot_multiclosure_correlation_eigenvalues(fits_correlation_matrix_totalpdf)[source]

Plot scatter points for each of the eigenvalues from the estimated correlation matrix from the multiclosure PDFs in flavour and x.

In the legend add the ratio of the largest eigenvalue over the smallest eigenvalue, aka the l2 condition number of the correlation matrix.
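
For a symmetric positive definite matrix, the ratio of the largest to the smallest eigenvalue coincides with the l2 condition number. A minimal sketch with a toy 2x2 correlation matrix (the function name is hypothetical):

```python
import numpy as np


def eigenvalues_and_condition_number(corrmat):
    """Eigenvalues (ascending) and largest / smallest eigenvalue ratio."""
    eigenvalues = np.linalg.eigvalsh(corrmat)  # for symmetric matrices
    return eigenvalues, eigenvalues[-1] / eigenvalues[0]


corrmat = np.array([[1.0, 0.5], [0.5, 1.0]])
eigenvalues, condition_number = eigenvalues_and_condition_number(corrmat)
```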

validphys.closuretest.multiclosure_pdf_output.plot_multiclosure_correlation_matrix(fits_correlation_matrix_totalpdf, multiclosure_nx=4)[source]

Like plot_multiclosure_covariance_matrix but plots the total correlation matrix.

validphys.closuretest.multiclosure_pdf_output.plot_multiclosure_covariance_matrix(fits_covariance_matrix_totalpdf, multiclosure_nx=4)[source]

Plot the covariance matrix for all flavours. The covariance matrix has shape (n_flavours * n_x) * (n_flavours * n_x), where each block is the covariance of the replica PDFs on the x-grid defined in xi_pdfgrids().

validphys.closuretest.multiclosure_pdf_output.plot_pdf_central_diff_histogram(replica_and_central_diff_totalpdf)[source]

Histogram of the difference between central PDF and underlying law normalised by the corresponding replica standard deviation for all points in x and flavour alongside a scaled Gaussian. Total xi is proportion of the histogram which falls within the central 1-sigma confidence interval.

validphys.closuretest.multiclosure_pdf_output.plot_pdf_matrix(matrix, n_x, **kwargs)[source]

Utility function which, given a covmat/corrmat for all flavours and x, plots it with appropriate labels. Input matrix is expected to be size (n_flavours*n_x) * (n_flavours*n_x).

Parameters
  • matrix (np.array) – square matrix which must be (n_flavours*n_x) * (n_flavours*n_x) with elements ordered like: (flavour0_x0, flavour0_x1, …, flavourN_x0, …, flavourN_xN) i.e. the points along x for flavour 0, then points along x for flavour 1 etc.

  • **kwargs – keyword arguments for the matplotlib.axes.Axes.imshow function

Notes

See matplotlib.axes.Axes.imshow for more details on the plotting function.
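
The element ordering described for the matrix parameter corresponds to a simple flat index. The helper names below are hypothetical and serve only to illustrate the layout:

```python
def flat_index(flavour: int, ix: int, n_x: int) -> int:
    """Row or column index of (flavour, x-point) in the flattened ordering."""
    return flavour * n_x + ix


def unflatten_index(index: int, n_x: int) -> tuple:
    """Inverse mapping: flat index -> (flavour, x-point)."""
    return divmod(index, n_x)


# With n_x = 4, the second x-point of the third flavour sits at index 9.
example_index = flat_index(flavour=2, ix=1, n_x=4)
```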

validphys.closuretest.multiclosure_pdf_output.plot_xi_flavour_x(xi_flavour_x, Q, internal_singlet_gluon_xgrid, internal_nonsinglet_xgrid, multiclosure_nx=4, use_x_basis=False)[source]

For each flavour plot xi for each x-point. By default xi is calculated and plotted in the basis which diagonalises the covmat, which is estimated from the union of all the replicas. However, if use_x_basis is True then xi will be calculated and plotted in the x-basis.

validphys.closuretest.multiclosure_pdf_output.xi_flavour_table(xi_flavour_x, xi_totalpdf)[source]

For each flavour take the mean of xi_flavour_x across x to get a single number, which is the proportion of points on the central PDF which are within 1 sigma. This is calculated from the replicas of the underlying PDF.

Returns

xi_flavour – table of xi by flavour

Return type

pd.DataFrame

validphys.closuretest.multiclosure_preprocessing module

multiclosure_preprocessing.py

Module containing all of the actions related to preprocessing exponents. In particular, comparing the next preprocessing exponents across the multiple closure fits with the previous effective exponents, to see if there is a big dependence on the level 1 shift.

validphys.closuretest.multiclosure_preprocessing.next_multiclosure_alpha_preprocessing_table(fits, fits_basis, fits_pdf, fits_fitbasis_alpha_lines)[source]

Returns a table with the next alpha preprocessing exponent for each fit with a multiindex column of flavour and next preprocessing range limits.

For more information see _next_multiclosure_preprocessing_table()

validphys.closuretest.multiclosure_preprocessing.next_multiclosure_beta_preprocessing_table(fits, fits_basis, fits_pdf, fits_fitbasis_beta_lines)[source]

Returns a table with the next beta preprocessing exponent for each fit with a multiindex column of flavour and next preprocessing range limits.

For more information see _next_multiclosure_preprocessing_table()

validphys.closuretest.multiclosure_preprocessing.plot_next_multiclosure_alpha_preprocessing(fits_fitbasis_alpha_lines, fits_pdf, next_multiclosure_alpha_preprocessing_table)[source]

Using the table produced by next_multiclosure_alpha_preprocessing_table(), plot the next alpha preprocessing exponent ranges. The ranges are represented by horizontal error bars, with vertical lines indicating the previous range limits of the first fit.

validphys.closuretest.multiclosure_preprocessing.plot_next_multiclosure_alpha_preprocessing_range_width(fits_fitbasis_alpha_lines, fits_pdf, next_multiclosure_alpha_preprocessing_table)[source]

Using the table produced by next_multiclosure_alpha_preprocessing_table(), plot the next alpha preprocessing exponent range widths, i.e. max alpha - min alpha, as a histogram over fits for each flavour. Add a vertical line at the previous range width of the first fit for reference.

validphys.closuretest.multiclosure_preprocessing.plot_next_multiclosure_beta_preprocessing(fits_fitbasis_beta_lines, fits_pdf, next_multiclosure_beta_preprocessing_table)[source]

Using the table produced by next_multiclosure_beta_preprocessing_table(), plot the next beta preprocessing exponent ranges. The ranges are represented by horizontal error bars, with vertical lines indicating the previous range limits of the first fit.

validphys.closuretest.multiclosure_preprocessing.plot_next_multiclosure_beta_preprocessing_range_width(fits_fitbasis_beta_lines, fits_pdf, next_multiclosure_beta_preprocessing_table)[source]

Using the table produced by next_multiclosure_beta_preprocessing_table(), plot the next beta preprocessing exponent range widths, i.e. max beta - min beta, as a histogram over fits for each flavour. Add a vertical line at the previous range width of the first fit for reference.

validphys.closuretest.multiclosure_pseudodata module

multiclosure_pseudodata

Actions which load fit pseudodata and compute estimators related to overfitting. Estimators here can only be calculated on data used in the fit.

validphys.closuretest.multiclosure_pseudodata.expected_data_delta_chi2(data_fits_cv, multiclosure_data_loader)[source]

For data, calculate the mean of delta chi2 across all fits. Returns a tuple of the number of data points and the unnormalised delta chi2.

validphys.closuretest.multiclosure_pseudodata.expected_delta_chi2_table(groups_expected_delta_chi2, group_dataset_inputs_by_metadata, total_expected_data_delta_chi2)[source]

Tabulate the expectation value of delta chi2 across fits for groups with an additional row with the total across all data at the bottom.

validphys.closuretest.multiclosure_pseudodata.fits_dataset_cvs(fits_dataset)[source]

Internal function for loading the level one data for all fits for a single dataset. This function avoids the stringent metadata checks of the newer python commondata parser.

validphys.closuretest.multiclosure_pseudodata.total_expected_data_delta_chi2(exps_expected_delta_chi2)[source]

Takes expected_data_delta_chi2() evaluated for each experiment and then sums across experiments. Returns the total number of datapoints and unnormalised delta chi2.
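
The summation across experiments can be sketched as below, assuming each experiment contributes a (ndata, unnormalised delta chi2) tuple as described for expected_data_delta_chi2(). The function name and the numbers are hypothetical.

```python
def total_delta_chi2_sketch(per_experiment):
    """Sum (ndata, delta_chi2) tuples across experiments."""
    ndata_total = sum(ndata for ndata, _ in per_experiment)
    delta_total = sum(delta for _, delta in per_experiment)
    return ndata_total, delta_total


# Hypothetical per-experiment results.
per_experiment = [(100, -3.0), (50, 1.5)]
ndata_total, delta_total = total_delta_chi2_sketch(per_experiment)
# Normalising by the total number of points gives delta chi2 per data point.
delta_per_point = delta_total / ndata_total
```

Keeping the values unnormalised until the end means the total is weighted by the number of points in each experiment.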

Module contents

closuretest

Module containing all actions specific to closure tests.