validphys package
Subpackages
- validphys.closuretest package
- Submodules
- validphys.closuretest.closure_checks module
check_at_least_10_fits()
check_fit_isclosure()
check_fits_areclosures()
check_fits_different_filterseed()
check_fits_have_same_basis()
check_fits_same_filterseed()
check_fits_underlying_law_match()
check_multifit_replicas()
check_t0pdfset_matches_law()
check_t0pdfset_matches_multiclosure_law()
check_use_fitcommondata()
- validphys.closuretest.closure_plots module
- validphys.closuretest.closure_results module
BiasData
VarianceData
bias_dataset()
bias_experiment()
biases_table()
bootstrap_bias_experiment()
bootstrap_variance_experiment()
delta_chi2_bootstrap()
delta_chi2_table()
fit_underlying_pdfs_summary()
fits_bootstrap_bias_table()
fits_bootstrap_variance_table()
summarise_closure_underlying_pdfs()
variance_dataset()
variance_experiment()
- validphys.closuretest.multiclosure module
BootstrappedTheoryResult
bias_variance_resampling_data()
bias_variance_resampling_dataset()
bias_variance_resampling_total()
data_replica_and_central_diff()
data_xi()
dataset_fits_bias_replicas_variance_samples()
dataset_inputs_fits_bias_replicas_variance_samples()
dataset_replica_and_central_diff()
dataset_xi()
expected_data_bias_variance()
expected_dataset_bias_variance()
expected_total_bias_variance()
experiments_bootstrap_expected_xi()
experiments_bootstrap_ratio()
experiments_bootstrap_sqrt_ratio()
fits_bootstrap_data_bias_variance()
fits_bootstrap_data_xi()
fits_data_bias_variance()
fits_dataset_bias_variance()
fits_total_bias_variance()
groups_bootstrap_expected_xi()
groups_bootstrap_ratio()
groups_bootstrap_sqrt_ratio()
internal_multiclosure_data_loader()
internal_multiclosure_dataset_loader()
n_fit_samples()
n_replica_samples()
total_bootstrap_ratio()
total_bootstrap_xi()
total_expected_xi_resample()
total_xi_resample()
xi_resampling_data()
xi_resampling_dataset()
- validphys.closuretest.multiclosure_output module
compare_measured_expected_xi()
dataset_ratio_error_finite_effects()
dataset_std_xi_error_finite_effects()
dataset_std_xi_means_finite_effects()
dataset_xi_error_finite_effects()
dataset_xi_means_finite_effects()
datasets_bias_variance_ratio()
expected_xi_from_bias_variance()
experiments_bias_variance_ratio()
experiments_bias_variance_table()
experiments_bootstrap_expected_xi_table()
experiments_bootstrap_sqrt_ratio_table()
experiments_bootstrap_xi_comparison()
experiments_bootstrap_xi_table()
fits_measured_xi()
groups_bootstrap_expected_xi_table()
groups_bootstrap_sqrt_ratio_table()
groups_bootstrap_xi_comparison()
groups_bootstrap_xi_table()
plot_bias_variance_distributions()
plot_data_central_diff_histogram()
plot_data_fits_bias_variance()
plot_data_xi()
plot_data_xi_histogram()
plot_dataset_fits_bias_variance()
plot_dataset_xi()
plot_dataset_xi_histogram()
plot_experiments_sqrt_ratio_bootstrap_distribution()
plot_experiments_xi_bootstrap_distribution()
plot_total_fits_bias_variance()
sqrt_datasets_bias_variance_ratio()
sqrt_experiments_bias_variance_ratio()
total_bias_variance_ratio()
total_expected_xi_error_finite_effects()
total_expected_xi_means_finite_effects()
total_ratio_error_finite_effects()
total_ratio_means_finite_effects()
total_std_xi_error_finite_effects()
total_std_xi_means_finite_effects()
total_xi_error_finite_effects()
total_xi_means_finite_effects()
- validphys.closuretest.multiclosure_pdf module
bootstrap_pdf_differences()
fits_bootstrap_pdf_expected_xi()
fits_bootstrap_pdf_ratio()
fits_bootstrap_pdf_sqrt_ratio()
fits_correlation_matrix_totalpdf()
fits_covariance_matrix_by_flavour()
fits_covariance_matrix_totalpdf()
fits_pdf_flavour_ratio()
fits_pdf_total_ratio()
fits_sqrt_covmat_by_flavour()
internal_nonsinglet_xgrid()
internal_singlet_gluon_xgrid()
pdf_central_difference()
pdf_replica_difference()
replica_and_central_diff_totalpdf()
underlying_xi_grid_values()
xi_flavour_x()
xi_grid_values()
xi_pdfgrids()
xi_totalpdf()
- validphys.closuretest.multiclosure_pdf_output module
fits_bootstrap_pdf_compare_xi_to_expected()
fits_bootstrap_pdf_expected_xi_table()
fits_bootstrap_pdf_sqrt_ratio_table()
fits_bootstrap_pdf_xi_table()
fits_pdf_bias_variance_ratio()
fits_pdf_compare_xi_to_expected()
fits_pdf_expected_xi_from_ratio()
fits_pdf_sqrt_ratio()
plot_multiclosure_correlation_eigenvalues()
plot_multiclosure_correlation_matrix()
plot_multiclosure_covariance_matrix()
plot_pdf_central_diff_histogram()
plot_pdf_matrix()
plot_xi_flavour_x()
xi_flavour_table()
- validphys.closuretest.multiclosure_preprocessing module
- validphys.closuretest.multiclosure_pseudodata module
- Module contents
- validphys.compareclosuretemplates package
- validphys.comparefittemplates package
- validphys.cuts package
- validphys.deltachi2templates package
- validphys.hyperplottemplates package
- validphys.mplstyles package
- validphys.paramfits package
- Submodules
- validphys.paramfits.config module
ParamfitsConfig
ParamfitsConfig.parse_blacklist_datasets()
ParamfitsConfig.parse_experiments_covmat_output()
ParamfitsConfig.parse_extra_sum()
ParamfitsConfig.parse_extra_sums()
ParamfitsConfig.parse_fits_chi2_paramfits_output()
ParamfitsConfig.parse_fits_computed_psedorreplicas_chi2_output()
ParamfitsConfig.parse_fits_computed_pseudoreplicas_chi2_output()
ParamfitsConfig.produce_combine_dataspecs_pseudoreplicas_as()
ParamfitsConfig.produce_combine_dataspecs_pseudorreplicas_as()
ParamfitsConfig.produce_fits_as()
ParamfitsConfig.produce_fits_as_from_fitdeclarations()
ParamfitsConfig.produce_fits_central_chi2_by_dataset_item()
ParamfitsConfig.produce_fits_central_chi2_by_experiment_and_dataset()
ParamfitsConfig.produce_fits_central_chi2_for_total()
ParamfitsConfig.produce_fits_matched_pseudoreplicas_chi2_by_dataset_item()
ParamfitsConfig.produce_fits_matched_pseudoreplicas_chi2_by_experiment_and_dataset()
ParamfitsConfig.produce_fits_matched_pseudoreplicas_chi2_output()
ParamfitsConfig.produce_fits_matched_pseudorreplicas_chi2_by_dataset_item()
ParamfitsConfig.produce_fits_matched_pseudorreplicas_chi2_by_experiment_and_dataset()
ParamfitsConfig.produce_fits_matched_pseudorreplicas_chi2_output()
ParamfitsConfig.produce_fits_name()
ParamfitsConfig.produce_fits_name_from_fitdeclarations()
ParamfitsConfig.produce_fits_pdf_config()
ParamfitsConfig.produce_fits_replica_data_correlated_for_total()
ParamfitsConfig.produce_matched_pseudoreplicas_for_total()
ParamfitsConfig.produce_matched_pseudorreplcias_for_total()
ParamfitsConfig.produce_use_fits_chi2_paramfits_output()
ParamfitsConfig.produce_use_fits_computed_psedorreplicas_chi2_output()
ParamfitsConfig.produce_use_fits_computed_pseudoreplicas_chi2_output()
- validphys.paramfits.dataops module
RobustSampleWrapper
StandardSampleWrapper
as_central_parabola()
as_determination_from_central_chi2()
as_determination_from_central_chi2_with_tag()
as_parabolic_coefficient_table()
bootstrapping_stats_error()
bootstrapping_stats_error_on_the_error()
check_dataset_items()
compare_aic()
compare_determinations_table()
compare_determinations_table_impl()
datasepecs_as_value_error_table_impl()
datasepecs_quad_table_impl()
dataspecs_as_central_parabolas_map()
dataspecs_as_value_error_table()
dataspecs_as_value_error_table_impl()
dataspecs_as_value_error_table_transposed()
dataspecs_chi2_by_dataset_dict()
dataspecs_matched_pseudoreplicas_chi2_table()
dataspecs_matched_pseudorreplicas_chi2_table()
dataspecs_ndata_table()
dataspecs_quad_table_impl()
dataspecs_quad_value_error_table()
dataspecs_stats_error_table()
dataspecs_stats_error_table_transposed()
derivative_dispersion_table()
discarded_mask()
fits_matched_pseudoreplicas_chi2_table()
fits_matched_pseudorreplicas_chi2_table()
fits_replica_data_with_discarded_replicas()
get_parabola()
half_sample_stats_error()
parabolic_as_determination()
parabolic_as_determination_with_tag()
pseudoreplicas_stats_error()
pseudorreplicas_stats_error()
quadratic_as_determination()
quadratic_as_determination_with_tag()
- validphys.paramfits.plots module
alphas_shift()
plot_as_central_parabola()
plot_as_cummulative_central_chi2()
plot_as_cummulative_central_chi2_diff()
plot_as_cummulative_central_chi2_diff_negative()
plot_as_cummulative_central_chi2_diff_underflow()
plot_as_datasets_compare()
plot_as_datasets_pseudoreplicas_chi2()
plot_as_datasets_pseudorreplicas_chi2()
plot_as_distribution()
plot_as_exepriments_central_chi2()
plot_as_value_error_central()
plot_dataspecs_as_value_error()
plot_dataspecs_as_value_error_comparing_with_central()
plot_dataspecs_central_parabolas()
plot_dataspecs_parabola_examples()
plot_dataspecs_pseudoreplica_means()
plot_dataspecs_pseudorreplica_means()
plot_fits_as_profile()
plot_fitted_replicas_as_profiles_matched()
plot_mean_pulls()
plot_poly_as_fit()
plot_pull_gaussian_fit_central()
plot_pull_gaussian_fit_pseudo()
plot_pull_plots_global_min()
plot_pulls_central()
plot_total_as_distribution_dataspecs()
- Module contents
- validphys.photon package
- validphys.plotoptions package
- Submodules
- validphys.plotoptions.core module
- validphys.plotoptions.kintransforms module
DIJET3DXQ2MapMixin
DIJETATLASXQ2MapMixin
DIJETXQ2MapMixin
DISXQ2MapMixin
DYMXQ2MapMixin
DYXQ2MapMixin
EWPTXQ2MapMixin
HQPTXQ2MapMixin
HQQPTXQ2MapMixin
JETXQ2MapMixin
Kintransform
SqrtScaleMixin
dijet_CMS_3D
dijet_CMS_5TEV
dijet_sqrt_scale
dijet_sqrt_scale_ATLAS
dis_sqrt_scale
dyp_sqrt_scale
ewj_jpt_sqrt_scale
ewj_jrap_sqrt_scale
ewj_mll_sqrt_scale
ewj_pt_sqrt_scale
ewj_ptrap_sqrt_scale
ewj_rap_sqrt_scale
ewk_mll_sqrt_scale
ewk_pseudorapity_sqrt_scale
ewk_pt_sqrt_scale
ewk_ptrap_sqrt_scale
ewk_rap_sqrt_scale
hig_rap_sqrt_scale
hqp_mqq_sqrt_scale
hqp_ptq_sqrt_scale
hqp_ptqq_sqrt_scale
hqp_yq_sqrt_scale
hqp_yqq_sqrt_scale
identity
inc_sqrt_scale
jet_sqrt_scale
nmc_process
pht_sqrt_scale
sia_sqrt_scale
- validphys.plotoptions.labelers module
- validphys.plotoptions.plottingoptions module
PlottingOptions
PlottingOptions.all_labels
PlottingOptions.already_digested
PlottingOptions.data_reference
PlottingOptions.dataset_label
PlottingOptions.experiment
PlottingOptions.extra_labels
PlottingOptions.figure_by
PlottingOptions.func_labels
PlottingOptions.kinematics_override
PlottingOptions.line_by
PlottingOptions.nnpdf31_process
PlottingOptions.normalize
PlottingOptions.parse_figure_by()
PlottingOptions.parse_line_by()
PlottingOptions.parse_x()
PlottingOptions.plot_x
PlottingOptions.process_description
PlottingOptions.result_transform
PlottingOptions.theory_reference
PlottingOptions.x
PlottingOptions.x_label
PlottingOptions.x_scale
PlottingOptions.y_label
PlottingOptions.y_scale
ResultTransformations
Scale
TransformFunctions
TransformFunctions.dijet_CMS_3D
TransformFunctions.dijet_CMS_5TEV
TransformFunctions.dijet_sqrt_scale
TransformFunctions.dijet_sqrt_scale_ATLAS
TransformFunctions.dis_sqrt_scale
TransformFunctions.dyp_sqrt_scale
TransformFunctions.ewj_jpt_sqrt_scale
TransformFunctions.ewj_jrap_sqrt_scale
TransformFunctions.ewj_mll_sqrt_scale
TransformFunctions.ewj_pt_sqrt_scale
TransformFunctions.ewj_ptrap_sqrt_scale
TransformFunctions.ewj_rap_sqrt_scale
TransformFunctions.ewk_mll_sqrt_scale
TransformFunctions.ewk_pseudorapity_sqrt_scale
TransformFunctions.ewk_pt_sqrt_scale
TransformFunctions.ewk_ptrap_sqrt_scale
TransformFunctions.ewk_rap_sqrt_scale
TransformFunctions.hig_rap_sqrt_scale
TransformFunctions.hqp_mqq_sqrt_scale
TransformFunctions.hqp_ptq_sqrt_scale
TransformFunctions.hqp_ptqq_sqrt_scale
TransformFunctions.hqp_yq_sqrt_scale
TransformFunctions.hqp_yqq_sqrt_scale
TransformFunctions.identity
TransformFunctions.inc_sqrt_scale
TransformFunctions.jet_sqrt_scale
TransformFunctions.nmc_process
TransformFunctions.pht_sqrt_scale
TransformFunctions.sia_sqrt_scale
- validphys.plotoptions.resulttransforms module
- validphys.plotoptions.utils module
- Module contents
- validphys.scalevariations package
- validphys.scripts package
- Submodules
- validphys.scripts.main module
- validphys.scripts.postfit module
- validphys.scripts.vp_checktheory module
- validphys.scripts.vp_comparefits module
CompareFitApp
CompareFitApp.add_positional_arguments()
CompareFitApp.complete_mapping()
CompareFitApp.get_commandline_arguments()
CompareFitApp.get_config()
CompareFitApp.interactive_author()
CompareFitApp.interactive_current_fit()
CompareFitApp.interactive_current_fit_label()
CompareFitApp.interactive_keywords()
CompareFitApp.interactive_reference_fit()
CompareFitApp.interactive_reference_fit_label()
CompareFitApp.interactive_thcovmat_if_present()
CompareFitApp.interactive_title()
CompareFitApp.try_complete_args()
main()
- validphys.scripts.vp_deltachi2 module
- validphys.scripts.vp_fitrename module
- validphys.scripts.vp_get module
- validphys.scripts.vp_hyperoptplot module
- validphys.scripts.vp_list module
- validphys.scripts.vp_nextfitruncard module
- validphys.scripts.vp_pdffromreplicas module
- validphys.scripts.vp_pdfrename module
- validphys.scripts.vp_upload module
- validphys.scripts.wiki_upload module
- Module contents
- validphys.tests package
- Subpackages
- Submodules
- validphys.tests.conftest module
- validphys.tests.test_alpha_s_bundle_pdf module
- validphys.tests.test_arclengths module
- validphys.tests.test_calcutils module
- validphys.tests.test_closuretest module
- validphys.tests.test_commondataparser module
- validphys.tests.test_core module
- validphys.tests.test_covmatreg module
- validphys.tests.test_covmats module
- validphys.tests.test_cuts module
- validphys.tests.test_datafiles module
- validphys.tests.test_effexponents module
- validphys.tests.test_filter_rules module
- validphys.tests.test_fitdata module
- validphys.tests.test_fitveto module
- validphys.tests.test_loader module
- validphys.tests.test_mc2hessian module
- validphys.tests.test_metaexps module
- validphys.tests.test_multiclosure module
- validphys.tests.test_overfit_metric module
- validphys.tests.test_plots module
- validphys.tests.test_postfit module
- validphys.tests.test_pseudodata module
- validphys.tests.test_pyfkdata module
- validphys.tests.test_pythonmakereplica module
- validphys.tests.test_regressions module
- validphys.tests.test_results module
- validphys.tests.test_sumrules module
- validphys.tests.test_tableloader module
- validphys.tests.test_theorydbutils module
- validphys.tests.test_totalchi2 module
- validphys.tests.test_utils module
- validphys.tests.test_vplistscript module
- validphys.tests.test_weights module
- Module contents
- validphys.theorycovariance package
- Submodules
- validphys.theorycovariance.construction module
ProcessInfo
combine_by_type()
compute_covs_pt_prescrip()
covmat_3fpt()
covmat_3pt()
covmat_3rpt()
covmat_5barpt()
covmat_5pt()
covmat_7pt()
covmat_7pt_orig()
covmat_9pt()
covmat_n3lo_ad()
covmat_n3lo_fhmv()
covmat_n3lo_singlet()
covs_pt_prescrip()
experimentplustheory_corrmat_custom()
fromfile_covmat()
procs_index_matched()
theory_corrmat_custom()
theory_covmat_custom()
theory_covmat_custom_fitting()
theory_covmat_dataset()
theory_normcovmat_custom()
total_theory_covmat()
total_theory_covmat_fitting()
user_covmat()
user_covmat_fitting()
- validphys.theorycovariance.output module
matrix_plot_labels()
plot_corrmat_heatmap()
plot_covmat_heatmap()
plot_diag_cov_comparison()
plot_diag_cov_comparison_by_experiment()
plot_diag_cov_comparison_by_process()
plot_expcorrmat_heatmap()
plot_expplusthcorrmat_heatmap_custom()
plot_normexpcovmat_heatmap()
plot_normthcovmat_heatmap_custom()
plot_thcorrmat_heatmap_custom()
- validphys.theorycovariance.tests module
LabeledShifts
alltheory_vector()
concatenated_shx_vector()
dataset_alltheory()
deltamiss_plot()
diagdf_theory_covmat()
doubleindex_set_byprocess()
doubleindex_thcovmat()
efficiency()
eigenvector_plot()
evals_nonzero_basis()
fnorm_shifts_byprocess()
fnorm_shifts_ordered()
ordered_alltheory_vector()
projected_condition_num()
projector_eigenvalue_ratio()
shift_diag_cov_comparison()
shift_vector()
sqrtdiags_thcovmat_byprocess()
theory_covmat_eigenvalues()
theory_shift_test()
theta()
ticklocs_thcovmat()
tripleindex_thcovmat_complete()
validation_theory_chi2()
vectors_3pt()
vectors_5barpt()
vectors_5pt()
vectors_7pt()
vectors_9pt()
- validphys.theorycovariance.theorycovarianceutils module
- Module contents
Submodules
validphys.api module
api.py
This module contains the reportengine programmatic API, initialized with the validphys providers, Config and Environment.
Example:
Simple Usage:
>>> from validphys.api import API
>>> fig = API.plot_pdfs(pdf="NNPDF_nlo_as_0118", Q=100)
>>> fig.show()
validphys.app module
app.py
Main loop of the validphys application. Here we define tailored extensions to the reportengine application (such as extra command line flags). Additionally, the provider modules that serve as the source of the validphys actions are declared here.
The entry point of the validphys application is the main function of this module.
- class validphys.app.App(name='validphys', providers=['validphys.results', 'validphys.commondata', 'validphys.pdfgrids', 'validphys.pdfplots', 'validphys.dataplots', 'validphys.fitdata', 'validphys.arclength', 'validphys.sumrules', 'validphys.reweighting', 'validphys.kinematics', 'validphys.correlations', 'validphys.chi2grids', 'validphys.eff_exponents', 'validphys.asy_exponents', 'validphys.paramfits.dataops', 'validphys.paramfits.plots', 'validphys.theorycovariance.construction', 'validphys.theorycovariance.output', 'validphys.theorycovariance.tests', 'validphys.replica_selector', 'validphys.closuretest', 'validphys.mc_gen', 'validphys.theoryinfo', 'validphys.pseudodata', 'validphys.renametools', 'validphys.covmats', 'validphys.hyperoptplot', 'validphys.deltachi2', 'validphys.n3fit_data', 'validphys.mc2hessian', 'reportengine.report', 'validphys.overfit_metric'])[source]
Bases: App
- property argparser
- critical_message = 'A critical error occurred. This is likely due to one of the following reasons:\n\n - A bug in validphys.\n - Corruption of the provided resources (e.g. incorrect plotting files).\n - Cosmic rays hitting your CPU and altering the registers.\n\nThe traceback above should help determine the cause of the problem. If you\nbelieve this is a bug in validphys (please discard the cosmic rays first),\nplease open an issue on GitHub<https://github.com/NNPDF/nnpdf/issues>,\nincluding the contents of the following file:\n\n%s\n'
- property default_style
- environment_class
alias of Environment
validphys.arclength module
arclength.py
Module for the computation and presentation of arclengths.
- class validphys.arclength.ArcLengthGrid(pdf, basis, flavours, stats)
Bases: tuple
- basis
Alias for field number 1
- flavours
Alias for field number 2
- pdf
Alias for field number 0
- stats
Alias for field number 3
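The "Alias for field number" entries above are what a named tuple exposes, so an instance can be read by name or by position. A minimal sketch with the standard library, in which placeholder strings and numbers stand in for the real PDF, basis and statistics objects (all values are hypothetical):

```python
from collections import namedtuple

# Same field order as documented above: pdf=0, basis=1, flavours=2, stats=3.
ArcLengthGrid = namedtuple("ArcLengthGrid", ("pdf", "basis", "flavours", "stats"))

# Placeholder values only; the real object holds a PDF, a basis and statistics.
grid = ArcLengthGrid(pdf="some_pdf", basis="flavour", flavours=("u", "d"), stats=[1.2, 3.4])

# Access by name and by position refer to the same field.
same_pdf = grid.pdf == grid[0]
same_stats = grid.stats == grid[3]

# Named tuples also unpack positionally, like any tuple.
pdf, basis, flavours, stats = grid
```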
- validphys.arclength.arc_length_table(arc_lengths)[source]
Return a table with the descriptive statistics of the arc lengths over members of the PDF.
- validphys.arclength.arc_lengths(pdf: ~validphys.core.PDF, Q: ~numbers.Real, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>) = 'flavour', flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Compute arc lengths at scale Q.
Set up a grid with three segments and compute the arc length for each segment. Note: the variation of the PDF over the grid is computed from the forward differences between adjacent grid points.
- Parameters
pdf (validphys.core.PDF object) –
Q (float) – scale at which to evaluate PDF
basis (default = "flavour") –
flavours (default = None) –
- Returns
validphys.arclength.ArcLengthGrid object – object that contains the PDF, basis, flavours, and computed arc length statistics.
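To illustrate the forward-difference idea described above, the following sketch approximates the arc length of a curve by summing the chords between adjacent grid points, segment by segment. It is not the validphys implementation: the function name, the segment boundaries and the toy curve are assumptions made for the example, and the library may use a different integration rule.

```python
import math

def arc_length_forward_diff(f, a, b, npoints=1000):
    """Approximate the arc length of y = f(x) on [a, b].

    The interval is split into `npoints` equal steps and the curve is
    replaced by straight chords between adjacent grid points.
    """
    step = (b - a) / npoints
    xs = [a + i * step for i in range(npoints + 1)]
    ys = [f(x) for x in xs]
    # Forward differences ys[i + 1] - ys[i] give the vertical extent of each chord.
    return sum(math.hypot(step, ys[i + 1] - ys[i]) for i in range(npoints))

# Hypothetical segment boundaries, chosen only for illustration.
segments = [(0.0, 0.1), (0.1, 0.5), (0.5, 1.0)]

# Toy curve y = x, whose exact arc length on [0, 1] is sqrt(2).
total = sum(arc_length_forward_diff(lambda x: x, a, b) for a, b in segments)
```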
- validphys.arclength.integrability_number(pdf: ~validphys.core.PDF, Q: ~numbers.Real, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>) = 'evolution', flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Return sum_i |x_i*f(x_i)|, x_i = {1e-9, 1e-8, 1e-7} for selected flavours
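The sum above can be written out directly. In this sketch `xf` is a hypothetical callable returning x*f(x) for a single flavour, and the toy distribution is an assumption chosen so that x*f(x) vanishes at small x; the real function works on a PDF object and the selected flavours.

```python
def integrability_sum(xf, xs=(1e-9, 1e-8, 1e-7)):
    """Return sum_i |x_i * f(x_i)|, given a callable xf(x) = x * f(x)."""
    return sum(abs(xf(x)) for x in xs)

# Toy distribution (hypothetical): x*f(x) = sqrt(x), which vanishes as x -> 0.
value = integrability_sum(lambda x: x ** 0.5)
```

A small value indicates that x*f(x) is already close to zero at the three small-x points.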
validphys.asy_exponents module
Tools for computing and plotting asymptotic exponents.
- class validphys.asy_exponents.AsyExponentBandPlotter(exponent, *args, **kwargs)[source]
Bases: BandPDFPlotter
Class inheriting from BandPDFPlotter, changing title and ylabel to reflect the asymptotic exponent being plotted.
- validphys.asy_exponents.alpha_asy(pdf: ~validphys.core.PDF, *, xmin: ~numbers.Real = 1e-06, xmax: ~numbers.Real = 0.001, npoints: int = 100, Q: ~numbers.Real = 1.65, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Returns a list of xplotting_grids containing the value of the asymptotic exponent alpha, as defined by the first relationship in Eq. (4) of [arXiv:1604.00024], at the specified value of Q (in GeV), in the interval [xmin, xmax].
basis: Is one of the bases defined in pdfbases.py. This includes ‘flavour’ and ‘evolution’.
flavours: A set of elements from the basis. If None, the defaults for that basis will be selected.
npoints: the number of sub-intervals in the range [xmin, xmax] on which the derivative is computed.
- validphys.asy_exponents.asymptotic_exponents_table(pdf: ~validphys.core.PDF, *, x_alpha: ~numbers.Real = 1e-06, x_beta: ~numbers.Real = 0.9, Q: ~numbers.Real = 1.65, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None, npoints=100)[source]
Returns a table with the values of the asymptotic exponents alpha and beta, as defined in Eq. (4) of [arXiv:1604.00024], at the specified value of x and Q.
basis: Is one of the bases defined in pdfbases.py. This includes ‘flavour’ and ‘evolution’.
flavours: A set of elements from the basis. If None, the defaults for that basis will be selected.
npoints: the number of sub-intervals in the range [xmin, xmax] on which the derivative is computed.
- validphys.asy_exponents.beta_asy(pdf, *, xmin: ~numbers.Real = 0.6, xmax: ~numbers.Real = 0.9, npoints: int = 100, Q: ~numbers.Real = 1.65, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Returns a list of xplotting_grids containing the value of the asymptotic exponent beta, as defined by the second relationship in Eq. (4) of [arXiv:1604.00024], at the specified value of Q (in GeV), in the interval [xmin, xmax].
basis: Is one of the bases defined in pdfbases.py. This includes ‘flavour’ and ‘evolution’.
flavours: A set of elements from the basis. If None, the defaults for that basis will be selected.
npoints: the number of sub-intervals in the range [xmin, xmax] on which the derivative is computed.
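As a rough numerical illustration of alpha_asy and beta_asy, the sketch below assumes that the exponents are the logarithmic derivatives alpha = d ln(x f) / d ln(x) and beta = d ln(x f) / d ln(1 - x); this reading should be checked against Eq. (4) of the reference. The helper names, the step size and the toy distribution are all hypothetical and do not come from validphys.

```python
import math

def dlog_dx(xf, x, h):
    """Central-difference estimate of d ln|x f(x)| / dx."""
    return (math.log(abs(xf(x + h))) - math.log(abs(xf(x - h)))) / (2 * h)

def alpha_eff(xf, x, rel_step=1e-4):
    # d/d ln(x) = x * d/dx
    return x * dlog_dx(xf, x, rel_step * x)

def beta_eff(xf, x, rel_step=1e-4):
    # d/d ln(1 - x) = -(1 - x) * d/dx
    return -(1 - x) * dlog_dx(xf, x, rel_step * (1 - x))

def toy_xf(x):
    """Toy distribution (assumption): x*f(x) = x**0.2 * (1 - x)**3."""
    return x ** 0.2 * (1 - x) ** 3

alpha = alpha_eff(toy_xf, 1e-6)  # approaches 0.2 as x -> 0
beta = beta_eff(toy_xf, 0.9)     # approaches 3 as x -> 1
```

For the toy form x**a * (1 - x)**b the derivatives are a - b*x/(1 - x) and b - a*(1 - x)/x, so the estimates tend to a and b at the two ends of the x range.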
validphys.calcutils module
calcutils.py
Low level utilities to calculate χ² and such. These are used to implement the higher level functions in results.py
- validphys.calcutils.all_chi2(results)[source]
Return the chi² for all elements in the result, regardless of the Stats class. Note that the interpretation of the result will depend on the PDF error type.
- validphys.calcutils.all_chi2_theory(results, totcov)[source]
Like all_chi2 but here the chi² are calculated using a covariance matrix that is the sum of the experimental covmat and the theory covmat.
- validphys.calcutils.bootstrap_values(data, nresamples, *, boot_seed: Optional[int] = None, apply_func: Optional[Callable] = None, args=None)[source]
General bootstrap sample.
data is the data to be sampled; the replicas are assumed to be on the final axis, e.g. an array of shape (N_bins, N_replicas).
boot_seed can be specified if the user wishes to take exactly the same bootstrap samples multiple times. By default it is None, in which case a random seed is used.
If just data and nresamples are provided, then bootstrap_values creates nresamples resamples of the data, where each resample is a Monte Carlo selection of the data across replicas. The mean of each resample is returned.
Alternatively, the user can specify a function to be sampled, apply_func, plus any additional arguments required by that function. bootstrap_values then returns apply_func(bootstrap_data, *args), where bootstrap_data.shape = (*data.shape, nresamples). It is critical that apply_func can handle data input in this format.
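The resampling described above can be sketched with NumPy as follows. This is an illustration rather than the validphys source: the function name is hypothetical, and it uses `numpy.random.default_rng`, so the random stream will not necessarily match the one produced by the library for the same seed.

```python
import numpy as np

def bootstrap_sketch(data, nresamples, boot_seed=None, apply_func=None, args=()):
    """Resample the last (replica) axis of `data` with replacement."""
    data = np.atleast_2d(data)
    n_rep = data.shape[-1]
    rng = np.random.default_rng(boot_seed)
    # For every resample, draw n_rep replica indices with replacement.
    idx = rng.integers(n_rep, size=(n_rep, nresamples))
    boot = data[..., idx]  # shape: (*data.shape, nresamples)
    if apply_func is None:
        # Mean over the replica axis: one value per bin and per resample.
        return boot.mean(axis=-2)
    return apply_func(boot, *args)

data = np.arange(12.0).reshape(3, 4)  # toy input: 3 bins, 4 replicas
means = bootstrap_sketch(data, nresamples=5, boot_seed=0)

# A custom function receives the full resampled array.
stds = bootstrap_sketch(data, 5, boot_seed=0, apply_func=np.std, args=(-2,))
```

Fixing `boot_seed` makes two calls return identical resamples, which is the use case described for the seed above.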
- validphys.calcutils.calc_chi2(sqrtcov, diffs)[source]
Elementary function to compute the chi², given a Cholesky decomposed lower triangular part and a vector of differences.
- Parameters
sqrtcov (matrix) – A lower triangular matrix corresponding to the lower part of the Cholesky decomposition of the covariance matrix.
diffs (array) – A vector of differences (e.g. between data and theory). The first dimension must match the shape of sqrtcov. The computation will be broadcast over the other dimensions.
- Returns
chi2 – The result of the χ² for each vector of differences. Will have the same shape as diffs.shape[1:].
- Return type
array
Notes
This function computes the χ² more efficiently and accurately than following the direct definition of inverting the covariance matrix, \(\chi^2 = d\Sigma^{-1}d\), by solving the triangular linear system instead.
Examples
>>> from validphys.calcutils import calc_chi2
>>> import numpy as np
>>> import scipy.linalg as la
>>> np.random.seed(0)
>>> diffs = np.random.rand(10)
>>> s = np.random.rand(10,10)
>>> cov = s@s.T
>>> calc_chi2(la.cholesky(cov, lower=True), diffs)
44.64401691354948
>>> diffs@la.inv(cov)@diffs
44.64401691354948
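The triangular solve mentioned in the Notes can be made explicit. With the covariance factored as L Lᵀ, the χ² is dᵀ(L Lᵀ)⁻¹d = |L⁻¹d|², so it is enough to solve L y = d by forward substitution and sum the squares of y. The sketch below uses only the standard library on a 2x2 toy covariance; the helper names are hypothetical and this is not the validphys implementation, which operates on NumPy arrays.

```python
import math

def cholesky_lower(cov):
    """Return lower-triangular L with L @ L.T == cov (cov symmetric positive definite)."""
    n = len(cov)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(cov[i][i] - s)
            else:
                L[i][j] = (cov[i][j] - s) / L[j][j]
    return L

def chi2_from_sqrtcov(L, diffs):
    """Solve L y = diffs by forward substitution and return y . y."""
    y = []
    for i, d in enumerate(diffs):
        y.append((d - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return sum(v * v for v in y)

cov = [[4.0, 2.0], [2.0, 3.0]]  # toy covariance matrix
diffs = [1.0, 2.0]              # toy data-minus-theory differences
chi2 = chi2_from_sqrtcov(cholesky_lower(cov), diffs)
```

No matrix inverse is ever formed, which is what makes this route cheaper and numerically safer than evaluating the direct definition.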
- validphys.calcutils.calc_phi(sqrtcov, diffs)[source]
Low level function which calculates phi given a Cholesky decomposed lower triangular part and a vector of differences. Primarily used when phi is to be calculated independently from chi2.
The vector of differences diffs is expected to have N_bins on the first axis
- validphys.calcutils.central_chi2(results)[source]
Calculate the chi² from the central value of the theory prediction to the data
- validphys.calcutils.central_chi2_theory(results, totcov)[source]
Like central_chi2 but here the chi² is calculated using a covariance matrix that is the sum of the experimental covmat and the theory covmat.
- validphys.calcutils.get_df_block(matrix: DataFrame, key: str, level)[source]
Given a pandas dataframe whose index and column keys match and whose data represents a symmetric matrix, returns the diagonal block of this matrix corresponding to matrix[key, key] as a numpy array.
Additionally, the user can specify the level of the key for which the cross section is being taken. By default it is set to 1, which corresponds to the dataset level of a theory covariance matrix.
- validphys.calcutils.regularize_covmat(covmat: array, norm_threshold=4)[source]
Given a covariance matrix, performs a regularization which is equivalent to performing regularize_l2 on the sqrt of covmat: the l2 norm of the inverse of the correlation matrix calculated from covmat is set to be less than or equal to norm_threshold. If the input covmat already fulfills this criterion it is returned.
- Parameters
covmat (array) – a covariance matrix which is to be regularized.
norm_threshold (float) – The acceptable l2 norm of the sqrt correlation matrix, by default set to 4.
- Returns
new_covmat – A new covariance matrix which has been regularized according to prescription above.
- Return type
array
- validphys.calcutils.regularize_l2(sqrtcov, norm_threshold=4)[source]
Return a regularized version of sqrtcov.
Given sqrtcov, an (N, nsys) matrix whose Gram matrix is the covariance matrix (covmat = sqrtcov@sqrtcov.T), first decompose it as sqrtcov = D@A, where D is a positive diagonal matrix of standard deviations and A is the “square root” of the correlation matrix, corrmat = A@A.T. Then produce a new version of A which removes the unstable behaviour and assemble a new square root covariance matrix, which is returned.
The stability condition is controlled by norm_threshold. It is
\[\left\Vert A^+ \right\Vert_{L2} \leq \frac{1}{\text{norm_threshold}}\]
where A+ is the pseudoinverse of A. norm_threshold roughly corresponds to the sqrt of the maximum relative uncertainty in any systematic.
- Parameters
sqrtcov (2d array) – An (N, nsys) matrix specifying the uncertainties.
norm_threshold (float) – The tolerance for the regularization.
- Returns
newsqrtcov – A regularized version of sqrtcov.
- Return type
2d array
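The decomposition sqrtcov = D@A and the reassembly step can be sketched with NumPy. This is one possible reading rather than the library code: the function name is hypothetical, and the floor on the singular values of A is passed in directly as the hypothetical parameter `s_floor`; how that floor is derived from norm_threshold is set by the stability condition quoted above.

```python
import numpy as np

def clip_sqrt_correlation(sqrtcov, s_floor):
    """Raise the small singular values of A in sqrtcov = D @ A to `s_floor`."""
    # D: standard deviations, i.e. the row norms of sqrtcov.
    d = np.sqrt(np.sum(sqrtcov**2, axis=1))[:, np.newaxis]
    a = sqrtcov / d  # "square root" of the correlation matrix
    u, s, vt = np.linalg.svd(a, full_matrices=False)
    if s[-1] >= s_floor:
        # Singular values are sorted in descending order: already stable.
        return sqrtcov
    s_new = np.clip(s, s_floor, None)
    # Reassemble D @ A_new with the clipped singular values.
    return d * ((u * s_new) @ vt)

# Toy input: two almost fully correlated data points.
sqrtcov = np.array([[1.0, 0.0], [0.999, 0.04]])
new_sqrtcov = clip_sqrt_correlation(sqrtcov, s_floor=0.25)
```

Bounding the smallest singular value of A from below bounds the norm of its pseudoinverse from above, which is the quantity the stability condition constrains.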
validphys.checks module
Created on Thu Jun 2 19:35:40 2016
@author: Zahari Kassabov
- validphys.checks.check_darwin_single_process(NPROC)[source]
Check that if we are on macOS (platform is Darwin), NPROC is equal to 1. This is related to the infamous issues with multiprocessing on macOS.
The “solution” is to run the code sequentially if NPROC is 1 and enforce that macOS users don’t set NPROC as anything else.
TODO: Once pseudodata is generated in python, try using spawn instead of fork with multiprocessing.
Notes
For the specific NNPDF issue: https://github.com/NNPDF/nnpdf/issues/931
General discussion: https://wefearchange.org/2018/11/forkmacos.rst.html
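The logic of this check is small enough to sketch with the standard library. The function name, the `system` parameter and the use of `ValueError` are assumptions made for illustration; the error type that validphys actually raises is not shown on this page.

```python
import platform

def check_single_process_on_macos(nproc, system=None):
    """Raise if more than one process is requested on macOS.

    `system` defaults to platform.system(); it is a parameter only so that
    the check can be exercised on any machine.
    """
    system = platform.system() if system is None else system
    if system == "Darwin" and nproc != 1:
        raise ValueError("NPROC must be 1 on macOS")

check_single_process_on_macos(1, system="Darwin")  # accepted
check_single_process_on_macos(4, system="Linux")   # accepted
```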
- validphys.checks.check_dataspecs_fits_different(dataspecs_fit)[source]
Need this check because otherwise the pandas object gets confused
- validphys.checks.check_fits_different(fits)[source]
Need this check because otherwise the pandas object gets confused
- validphys.checks.check_mixband_as_replicas(pdfs, mixband_as_replicas)[source]
Same as check_pdfs_noband, but for the mixband_as_replicas key. Allows mixband_as_replicas to be specified as a list of PDF IDs or a list of PDF indexes (starting from one).
- validphys.checks.check_pdf_normalize_to(pdfs, normalize_to)[source]
Transform normalize_to into an index.
- validphys.checks.check_pdfs_noband(pdfs, pdfs_noband)[source]
Allows pdfs_noband to be specified as a list of PDF IDs or a list of PDF indexes (starting from one).
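Accepting either PDF IDs or one-based indexes amounts to mapping each entry to a position in the list of PDFs. A minimal sketch of that mapping, using plain strings as placeholder PDF IDs; the function name and the error handling are hypothetical and not taken from validphys:

```python
def resolve_pdf_selection(pdf_ids, selection):
    """Map a list of PDF IDs and/or one-based indexes to zero-based positions."""
    positions = []
    for item in selection:
        if isinstance(item, int):
            # Indexes start from one, as in the description above.
            if not 1 <= item <= len(pdf_ids):
                raise ValueError(f"index {item} out of range 1..{len(pdf_ids)}")
            positions.append(item - 1)
        else:
            # list.index raises ValueError if the ID is unknown.
            positions.append(pdf_ids.index(item))
    return positions

pdf_ids = ["pdf_a", "pdf_b", "pdf_c"]  # placeholder IDs
positions = resolve_pdf_selection(pdf_ids, ["pdf_c", 1])
```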
- validphys.checks.check_scale(scalename, allow_none=False)[source]
Check that we have a valid matplotlib scale. With allow_none=True, None is also valid.
- validphys.checks.check_speclabels_different(dataspecs_speclabel)[source]
This is needed for grouping dataframes (and because it generally indicates a bug)
validphys.chi2grids module
chi2grids.py
Compute and store χ² data from replicas, possibly keeping the correlations between pseudoreplica fluctuations between different fits. This is applied here to parameter determinations such as those of αs.
- validphys.chi2grids.PseudoReplicaExpChi2Data
alias of PseudoReplicaChi2Data
- validphys.chi2grids.computed_pseudoreplicas_chi2(fitted_make_replicas, group_result_table_no_table, groups_sqrtcovmat)[source]
Return a dataframe with the chi² of each replica with its corresponding pseudodata (i.e. the one it was fitted with). The chi² is computed by group. The index of the output dataframe is ['group', 'ndata' , 'nnfit_index'], where nnfit_index is the name of the corresponding replica.
validphys.commondata module
commondata.py
Module containing actions which return loaded commondata. It leverages utils found in validphys.commondataparser and returns objects from validphys.coredata.
- validphys.commondata.loaded_commondata_with_cuts(commondata, cuts)[source]
Load the commondata and apply cuts.
- Parameters
commondata (validphys.core.CommonDataSpec) – commondata to load and cut.
cuts (validphys.core.cuts, None) – valid cuts, used to cut loaded commondata.
- Returns
loaded_cut_commondata
- Return type
validphys.commondataparser module
This module implements parsers for commondata and its associated metadata and uncertainties files into useful structures that can be fed to the main validphys.coredata.CommonData class.
A CommonData file is completely defined by a dataset name (which defines the folder in which the information is) and observable name (which defines the specific data, fktables and plotting settings to read).
<experiment>_<process>_<energy>{_<extras>}_<observable>
Where the folder name is <experiment>_<process>_<energy>{_<extras>}
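The naming convention above can be illustrated by splitting a name into its folder and observable parts. The sketch assumes that the observable name contains no underscore, so the last underscore separates folder from observable, and that the experiment, process and energy fields contain none either. The function and the example name are hypothetical.

```python
def split_dataset_name(name):
    """Split '<experiment>_<process>_<energy>{_<extras>}_<observable>'."""
    # Assumption: the observable is everything after the last underscore.
    folder, observable = name.rsplit("_", 1)
    # Raises ValueError if the folder has fewer than three fields.
    experiment, process, energy, *extras = folder.split("_")
    return {
        "folder": folder,
        "experiment": experiment,
        "process": process,
        "energy": energy,
        "extras": extras,
        "observable": observable,
    }

# Made-up name that follows the pattern; not a real dataset.
parts = split_dataset_name("EXP_PROC_13TEV_EXTRA_OBS")
```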
The definition of all information for a given dataset (and all its observables) is in the metadata.yaml file and its implemented_observables.
This module defines a number of parsers using the validobj library.
The full metadata.yaml is read as a SetMetaData object which contains a list of ObservableMetaData. These ObservableMetaData are the “datasets” of NNPDF for all intents and purposes. The parent SetMetaData collects some shared variables such as the version of the dataset, arxiv, inspire or hepdata ids, the folder in which the data is, etc.
The main class in this module is thus ObservableMetaData, which holds _all_ information about the particular dataset-observable that we are interested in (and a reference to its parent).
Inside the ObservableMetaData we can find:
- TheoryMeta: contains the necessary information to read the (new style) fktables
- KinematicsMeta: contains metadata about the kinematics
- PlottingOptions: plotting style and information for validphys
- Variant: variant to be used
The CommonMetaData defines how the CommonData file is to be loaded.
By modifying the CommonMetaData using one of the loaded Variants, one can change the resulting
validphys.coredata.CommonData
object.
- class validphys.commondataparser.CommonDataMetadata(name: str, nsys: int, ndata: int, process_type: str)[source]
Bases:
object
Contains metadata information about the data being read
- class validphys.commondataparser.ObservableMetaData(observable_name: str, observable: dict, ndata: int, plotting: validphys.plotoptions.plottingoptions.PlottingOptions, process_type: Annotated[Union[validphys.process_options._Process, str], InputType(Any), Validator(<function ValidProcess at 0x7f3344c04e50>)], kinematic_coverage: list[str], kinematics: validphys.commondataparser.ValidKinematics, data_uncertainties: list[typing.Annotated[pathlib.Path, InputType(<class 'str'>), Validator(<function ValidPath at 0x7f3344a93790>)]], data_central: Optional[Annotated[pathlib.Path, InputType(<class 'str'>), Validator(<function ValidPath at 0x7f3344a93790>)]] = None, theory: Optional[validphys.commondataparser.TheoryMeta] = None, tables: Optional[list] = <factory>, npoints: Optional[list] = <factory>, variants: Optional[Dict[str, validphys.commondataparser.Variant]] = <factory>, applied_variant: Optional[str] = None, ported_from: Optional[str] = None, _parent: Optional[Any] = None)[source]
Bases:
object
- apply_variant(variant_name)[source]
Return a new instance of this class with the variant applied
This class also defines how the variant is applied to the commondata
- check()[source]
Various checks to apply manually to the observable before it is used anywhere. These are not part of the __post_init__ call since they can only happen after the metadata has been read, the observable selected and (likely) variants applied.
- property cm_energy
- data_uncertainties: list[pathlib.Path]
- digest_plotting_variable(variable)[source]
Digest plotting variables in the line_by or figure_by fields and return the appropriate kX or other label such that the plotting functions of validphys can understand it.
These might be variables included as part of the kinematics or extra labels defined in the plotting dictionary.
- property experiment
- property is_integrability
- property is_lagrange_multiplier
- property is_ported_dataset
Return True if this is an automatically ported dataset that has not been updated
- property is_positivity
- kinematics: ValidKinematics
- property kinlabels
Return the kinematic labels in the same order as they are set in
kinematic_coverage
(which in turn follows the key kinematic_coverage). If this is a ported dataset, rely on the process type using the legacy labels.
- load_data_central()[source]
Loads the data for this commondata and returns a dataframe
- Returns
a dataframe containing the data
- Return type
pd.DataFrame
- load_kinematics(fill_to_three=True, drop_minmax=True)[source]
Returns a dataframe with the kinematic information
- load_uncertainties()[source]
Returns a dataframe with all appropriate uncertainties
- Returns
a dataframe containing the uncertainties
- Return type
pd.DataFrame
- property name
- property nnpdf_metadata
- property path_data_central
- property path_kinematics
- property paths_uncertainties
- plotting: PlottingOptions
- property plotting_options
- property process
- property setname
- theory: Optional[TheoryMeta] = None
- class validphys.commondataparser.SetMetaData(setname: str, version: int, version_comment: str, nnpdf_metadata: dict, implemented_observables: list[validphys.commondataparser.ObservableMetaData], arXiv: Optional[ValidReference] = None, iNSPIRE: Optional[ValidReference] = None, hepdata: Optional[ValidReference] = None)[source]
Bases:
object
Metadata of the whole set
- property allowed_datasets
Return the implemented datasets as a list <setname>_<observable>
- property allowed_observables
Returns the implemented observables as a {observable_name.upper(): observable} dictionary
- arXiv: Optional[ValidReference] = None
- property cm_energy
Return the center of mass energy in GeV if it can be understood from the name; otherwise return None
- property folder
- hepdata: Optional[ValidReference] = None
- iNSPIRE: Optional[ValidReference] = None
- implemented_observables: list[validphys.commondataparser.ObservableMetaData]
- class validphys.commondataparser.TheoryMeta(FK_tables: list[tuple], operation: str = 'NULL', conversion_factor: float = 1.0, shifts: Optional[dict] = None, normalization: Optional[dict] = None, comment: Optional[str] = None)[source]
Bases:
object
Contains the necessary information to load the associated fktables
The theory metadata must always contain a key FK_tables which defines the fktables to be loaded. The FK_tables is organized as a double list such that:
- The inner list is concatenated. In practice these are different fktables that might refer to the same observable but that are divided in subgrids for practical reasons.
- The outer list instead contains the operands for whatever operation needs to be computed in order to match the experimental data.
In addition there are other flags that can affect how the fktables are read or used:
- operation: defines the operation to apply to the outer list
- shifts: mapping with the single fktables and their respective shifts, useful to create “gaps” so that the fktables and the respective experimental data are ordered in the same way (for instance, when some points are missing from a grid)
This class is immutable; what is read from the commondata metadata should be considered final.
Example
>>> from validphys.commondataparser import TheoryMeta
>>> from validobj import parse_input
>>> from reportengine.compat import yaml
>>> theory_raw = '''
... FK_tables:
...   - - fk1
...   - - fk2
...     - fk3
... operation: ratio
... '''
>>> theory = yaml.safe_load(theory_raw)
>>> parse_input(theory, TheoryMeta)
TheoryMeta(FK_tables=[['fk1'], ['fk2', 'fk3']], operation='RATIO', shifts=None, conversion_factor=1.0, comment=None, normalization=None)
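The double-list layout can be pictured with a small standalone sketch. It assumes made-up per-fktable predictions and a hypothetical helper combine; it is not the validphys implementation, and it covers only the NULL and RATIO operations that appear in this documentation.

```python
# Hypothetical per-fktable predictions; in practice these come from convolutions.
fake_predictions = {
    "fk1": [2.0, 6.0, 12.0],
    "fk2": [1.0, 2.0],
    "fk3": [3.0],
}
fk_tables = [["fk1"], ["fk2", "fk3"]]  # same layout as the FK_tables key


def combine(fk_tables, predictions, operation):
    # Inner lists: concatenate the subgrids into a single operand.
    operands = [
        [value for name in inner for value in predictions[name]]
        for inner in fk_tables
    ]
    # Outer list: the operands of the requested operation.
    if operation == "RATIO":
        numerator, denominator = operands
        return [n / d for n, d in zip(numerator, denominator)]
    if operation == "NULL":
        (single,) = operands
        return single
    raise ValueError(f"operation {operation!r} is not covered by this sketch")


result = combine(fk_tables, fake_predictions, "RATIO")
```

Here fk2 and fk3 together provide the three points of the denominator, which is why the concatenation has to happen before the ratio is taken.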
- fktables_to_paths(grids_folder)[source]
Given a source for pineappl grids, constructs the lists of fktables to be loaded
- class validphys.commondataparser.ValidKinematics(file: Path, variables: Dict[str, ValidVariable])[source]
Bases:
object
Contains the metadata necessary to load the kinematics of the dataset. The variables should be a dictionary with the key naming the variable and the content complying with the ValidVariable spec.
Only the kinematics defined by the key kinematic_coverage will be loaded, which must be three.
Three shall be the number of the counting and the number of the counting shall be three. Four shalt thou not count, neither shalt thou count two, excepting that thou then proceedeth to three. Once the number three, being the number of the counting, be reached, then the kinematics be loaded in the direction of thine validobject.
- apply_label(var, value)[source]
For a given value for a given variable, return the labels as label = value (unit). If the variable is not included in the list of variables, returns None, as the variable could’ve been transformed by a kinematic transformation.
- get_label(var)[source]
For the given variable, return the label as label (unit). If the label is an “extra”, return the last one.
- variables: Dict[str, ValidVariable]
- class validphys.commondataparser.ValidReference(url: str, version: ~typing.Optional[int] = None, journal: ~typing.Optional[str] = None, tables: list[int] = <factory>)[source]
Bases:
object
Holds literature information for the dataset
- class validphys.commondataparser.ValidVariable(label: str, description: str = '', units: str = '')[source]
Bases:
object
Defines the variables
- class validphys.commondataparser.Variant(data_uncertainties: Optional[list[pathlib.Path]] = None, theory: Optional[TheoryMeta] = None, data_central: Optional[Path] = None)[source]
Bases:
object
The new commondata format allows the use of variants. A variant can overwrite a number of keys, as defined by this dataclass.
- data_uncertainties: Optional[list[pathlib.Path]] = None
- theory: Optional[TheoryMeta] = None
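The idea of a variant overwriting a subset of keys can be sketched with plain dataclasses. ToyVariant, ToyObservable and apply_toy_variant below are hypothetical stand-ins, and the rule that a None field means "leave the original value untouched" is an assumption of this sketch rather than a statement about validphys.

```python
from dataclasses import dataclass, fields, replace
from typing import Optional


@dataclass(frozen=True)
class ToyVariant:
    data_uncertainties: Optional[list] = None
    theory: Optional[str] = None
    data_central: Optional[str] = None


@dataclass(frozen=True)
class ToyObservable:
    data_uncertainties: list
    theory: Optional[str] = None
    data_central: Optional[str] = None
    applied_variant: Optional[str] = None


def apply_toy_variant(observable, name, variant):
    # Only the keys that the variant actually sets overwrite the observable.
    updates = {
        f.name: getattr(variant, f.name)
        for f in fields(variant)
        if getattr(variant, f.name) is not None
    }
    # replace() builds a new instance, so the original stays unchanged.
    return replace(observable, applied_variant=name, **updates)


base = ToyObservable(data_uncertainties=["uncertainties.yaml"], data_central="data.yaml")
variant = ToyVariant(data_uncertainties=["uncertainties_alt.yaml"])
modified = apply_toy_variant(base, "alt", variant)
```

Returning a new instance mirrors the description of apply_variant, which returns a new instance of the class with the variant applied.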
- validphys.commondataparser.get_kinlabel_key(process_label)[source]
Since there is no 1:1 correspondence between latex keys and the old libNNPDF names we match the longest key such that the proc label starts with it.
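A longest-prefix match of this kind might look as follows. The keys used here are invented for illustration, and raising an error when nothing matches is an assumption of the sketch.

```python
def longest_prefix_key(process_label, keys):
    """Return the longest key that process_label starts with."""
    candidates = [key for key in keys if process_label.startswith(key)]
    if not candidates:
        raise ValueError(f"no key matches {process_label!r}")
    return max(candidates, key=len)


# Hypothetical keys; the real table of labels lives in validphys.
toy_keys = ["DIS", "DIS_NC", "DY"]
matched_key = longest_prefix_key("DIS_NCE", toy_keys)
```

Both DIS and DIS_NC are prefixes of the label, and the longer one wins.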
- validphys.commondataparser.get_plot_kinlabels(commondata)[source]
Return the LaTeX kinematic labels for a given Commondata
- validphys.commondataparser.load_commondata(spec)[source]
Load the data corresponding to a CommonDataSpec object. Returns an instance of CommonData
- validphys.commondataparser.load_commondata_new(metadata)[source]
TODO: update this docstring since now the load_commondata_new takes the information from the metadata, and the name -> split is done outside
In the current iteration of the commondata, each commondata (i.e., an observable from a data publication) corresponds to one single observable inside a folder which is named as “<experiment>_<process>_<energy>_<extra>”. The observable is defined by a last suffix of the form “_<obs>” so that the full name of the dataset is always:
“<experiment>_<process>_<energy>{_<extra>}_<obs>”
where <extra> is optional.
This function right now works under the assumption that the folder/observable is separated at the last _ so that:
folder_name = <experiment>_<process>_<energy>{_<extra>}
but note that this convention is still not fully defined.
This function returns a commondata object constructed by parsing the metadata.
Once a variant is selected, it can no longer be changed
Note that this function reproduces parse_commondata below, which parses the _old_ file format
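Under the stated assumption that folder and observable are separated at the last underscore, the split can be sketched as below. The function split_dataset_name and the placeholder dataset name are hypothetical, and since the convention is described as not yet fully defined this should be read as an illustration only.

```python
def split_dataset_name(full_name):
    """Split '<folder>_<obs>' at the last underscore."""
    folder_name, observable = full_name.rsplit("_", 1)
    return folder_name, observable


# Placeholder name following <experiment>_<process>_<energy>{_<extra>}_<obs>.
folder, observable = split_dataset_name("EXP_PROC_13TEV_EXTRA_OBS")
```

A consequence of this convention is that the observable suffix itself cannot contain an underscore.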
- validphys.commondataparser.load_commondata_old(commondatafile, systypefile, setname)[source]
Parse a commondata file and a systype file into a CommonData.
- Parameters
commondatafile (file or path to file) –
systypefile (file or path to file) –
- Returns
commondata – An object containing the data and information from the commondata and systype files.
- Return type
- validphys.commondataparser.parse_new_metadata(metadata_file, observable_name, variant=None)[source]
Given a metadata file in the new format and the specific observable to be read, load and parse the metadata and select the observable. If any variants are selected, apply them.
The triplet (metadata_file, observable_name, variant) unequivocally defines the information to be parsed from the commondata library.
validphys.config module
- class validphys.config.Config(input_params, environment=None)[source]
Bases:
Config, CoreConfig, ParamfitsConfig
The effective configuration parser class.
- class validphys.config.CoreConfig(input_params, environment=None)[source]
Bases:
Config
- property loader
- parse_added_filter_rules(rules: (<class 'list'>, <class 'NoneType'>) = None)[source]
Returns a tuple of AddedFilterRule objects. Rules are immutable after parsing. AddedFilterRule objects inherit from FilterRule objects.
- parse_additional_errors(bool)[source]
PDF set used to generate the photon additional errors: they are constructed using the replicas 101-107 of the PDF set LUXqed17_plus_PDF4LHC15_nnlo_100 (that are obtained varying some parameters of the LuxQED approach) in the way described in sec. 2.5 of https://arxiv.org/pdf/1712.07053.pdf
- parse_cut_similarity_threshold(th: Real)[source]
Maximum relative ratio when using fromsimilarpredictions cuts.
- parse_data_grouping(key)[source]
A key which indicates which default grouping to use. Mainly for internal use. It allows the default grouping of experiment to be applied to runcards which don’t specify metadata_group without there being a namespace conflict in the lockfile.
- parse_dataset_input(dataset: Mapping)[source]
The mapping that corresponds to the dataset specifications in the fit files
- This mapping is such that
- dataset: str
name of the dataset to load
- variant: str
variant of the dataset to load
- cfac: list
list of cfactors to apply
- frac: float
fraction of the data to consider for training purposes
- weight: float
extra weight to give to the dataset
- custom_group: str
custom group to apply to the dataset
Note that the sys key is deprecated and allowed only for old-format dataset.
Old-format commondata will be translated to the new version in this function.
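For orientation, a dataset specification using the keys listed above could be written as the following mapping. All values are placeholders chosen for the example, and the check against ALLOWED_KEYS is a hypothetical helper, not the validation that validphys performs.

```python
# Keys taken from the list above; the deprecated sys key is left out on purpose.
ALLOWED_KEYS = {"dataset", "variant", "cfac", "frac", "weight", "custom_group"}

dataset_input = {
    "dataset": "EXP_PROC_13TEV_OBS",  # placeholder dataset name
    "variant": "some_variant",        # placeholder variant name
    "cfac": ["QCD"],                  # list of cfactors to apply
    "frac": 0.75,                     # fraction of the data used for training
    "weight": 1.0,                    # extra weight given to the dataset
    "custom_group": "my_group",       # placeholder group label
}

# Report any key that is not part of the documented mapping.
unknown_keys = set(dataset_input) - ALLOWED_KEYS
```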
- parse_default_filter_rules_recorded_spec_(spec)[source]
This function is a hacky fix for parsing the recorded spec of filter rules. The reason we need this function is that without it reportengine detects a conflict in the dataset key.
- parse_experiment(experiment: dict)[source]
A set of datasets where correlated systematics are taken into account. It is a mapping where the keys are the experiment name ‘experiment’ and a list of datasets.
- parse_experiment_input(ei: dict)[source]
The mapping that corresponds to the experiment specification in the fit config files. Currently, this needs to be combined with
experiment_from_input
to yield a useful result.
- parse_filter_defaults(filter_defaults: (<class 'dict'>, <class 'NoneType'>))[source]
A mapping containing the default kinematic limits to be used when filtering data (when using internal cuts). Currently these limits are q2min, w2min, and maxTau.
- Parameters
filter_defaults (dict, None) – A mapping containing the default kinematic limits to be used when filtering data (when using internal cuts). Currently these limits are q2min, w2min, and maxTau.
- Returns
A hashable object containing the default kinematic limits to be used when filtering data (when using internal cuts). Currently these limits are q2min, w2min, and maxTau.
- Return type
- parse_filter_rules(filter_rules: (<class 'list'>, <class 'NoneType'>))[source]
A tuple of FilterRule objects. Rules are immutable after parsing. See https://docs.nnpdf.science/vp/filters.html for details on the syntax
- parse_fit(item)[source]
A fit in the results folder, containing at least a valid filter result. Either just an id (str), or a mapping with ‘id’ and ‘label’.
- parse_fitdeclaration(label: str)[source]
Used to guess some information from the fit name, without having to download it. This is meant to be used with other providers like e.g.:
{@with fits_as_from_fitdeclarations::fits_name_from_fitdeclarations@} {@ …do stuff… @} {@endwith@}
- parse_hyperscan(hyperscan)[source]
A hyperscan in the hyperscan_results folder, containing at least one tries.json file
- parse_integdataset(integset: dict, *, theoryid, rules)[source]
An observable corresponding to a PDF in the evolution basis, used as integrability constraint in the fit. It is a mapping containing ‘dataset’ and ‘maxlambda’.
- parse_metadata_group(group: str)[source]
User specified key to group data by. The key must exist in the PLOTTING file, for example experiment.
- parse_norm_threshold(val: (<class 'numbers.Number'>, <class 'NoneType'>))[source]
The threshold to use for covariance matrix normalisation. It sets the maximum l2 norm of the inverse covariance matrix by clipping the smallest eigenvalues.
If norm_threshold is set to None, then no covmat regularization is performed
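The principle behind such a threshold can be shown on a toy matrix. For a symmetric positive-definite matrix the l2 norm of the inverse equals one over the smallest eigenvalue, so raising small eigenvalues to 1/norm_threshold bounds that norm. The function clip_small_eigenvalues is a hypothetical sketch; validphys may apply the clipping to a different matrix (for instance a correlation matrix or a square root), which this example does not attempt to reproduce.

```python
import numpy as np


def clip_small_eigenvalues(matrix, norm_threshold):
    """Return a symmetric matrix whose inverse has l2 norm <= norm_threshold."""
    if norm_threshold is None:
        return matrix  # no regularization requested
    eigenvalues, eigenvectors = np.linalg.eigh(matrix)
    # Raise every eigenvalue below 1 / norm_threshold up to that value.
    clipped = np.clip(eigenvalues, 1.0 / norm_threshold, None)
    # Rebuild the matrix as V diag(clipped) V^T.
    return (eigenvectors * clipped) @ eigenvectors.T


# Nearly degenerate toy matrix with eigenvalues 0.001 and 1.999.
toy = np.array([[1.0, 0.999], [0.999, 1.0]])
regularized = clip_small_eigenvalues(toy, norm_threshold=4.0)
inverse_norm = np.linalg.norm(np.linalg.inv(regularized), ord=2)
```

Before clipping, the inverse of the toy matrix has norm 1000; after clipping it is bounded by the chosen threshold.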
- parse_pdf(item, unpolarized_bc=None)[source]
A PDF set installed in LHAPDF. If an unpolarized boundary condition is defined, it will be registered as part of the PDF.
Either just an id (str), or a mapping with ‘id’ and ‘label’.
- parse_posdataset(posset: dict, *, theoryid, rules)[source]
An observable used as positivity constraint in the fit. It is a mapping containing ‘dataset’ and ‘maxlambda’.
- parse_reweighting_experiments(experiments, *, theoryid, use_cuts, fit=None)[source]
A list of experiments to be used for reweighting.
- parse_speclabel(label: (<class 'str'>, <class 'NoneType'>))[source]
A label for a dataspec. To be used in some plots
- parse_theoryid(item)[source]
A number corresponding to the database theory ID where the corresponding theory folder is installed in the data directory. Either just an id (str or int), or a mapping with ‘id’ and ‘label’.
- parse_unpolarized_bc(item)[source]
Unpolarised PDF used as a Boundary Condition to impose positivity of pPDFs. Either just an id , or a mapping with ‘id’ and ‘label’.
- parse_use_cuts(use_cuts: (<class 'bool'>, <class 'str'>))[source]
Whether to filter the points based on the cuts applied in the fit, or the whole data in the dataset. The possible options are:
internal: Calculate the cuts based on the existing rules. This is the default.
fromfit: Read the cuts stored in the fit.
nocuts: Use the whole dataset.
- parse_use_fitcommondata(do_use: bool)[source]
Use the commondata files in the fit instead of those in the data directory.
- parse_use_t0(do_use_t0: bool)[source]
Whether to use the t0 PDF set to generate covariance matrices.
- produce_basisfromfit(fit)[source]
Set the basis from fit config. In the fit config file the basis is set using the key fitbasis, but it is exposed to validphys as basis.
The name of this production rule is intentionally set to not conflict with the existing fitbasis runcard key.
- produce_commondata(*, dataset_input, use_fitcommondata=False, fit=None)[source]
Produce a CommondataSpec from a dataset input
- produce_covariance_matrix(use_pdferr: bool = False)[source]
Modifies which action is used as covariance_matrix depending on the flag use_pdferr
- produce_covmat_t0_considered(use_t0: bool = False)[source]
Modifies which action is used as covariance_matrix depending on the flag use_t0
- produce_cuts(*, commondata, use_cuts)[source]
Obtain cuts for a given dataset input, based on the appropriate policy.
- produce_data(data_input, *, group_name='data')[source]
A set of datasets where correlated systematics are taken into account
- produce_data_input()[source]
Produce the data_input which is a flat list of dataset_inputs. This production rule handles the backwards compatibility with old datasets which specify experiments in the runcard.
- produce_dataset(*, dataset_input, theoryid, cuts, use_fitcommondata=False, fit=None, check_plotting: bool = False)[source]
Dataset specification from the theory and CommonData. Use the cuts from the fit, if provided. If check_plotting is set to True, attempt to load and check the PLOTTING files (note this may cause a noticeable slowdown in general).
- produce_dataset_inputs_covariance_matrix(use_pdferr: bool = False)[source]
Modifies which action is used as experiment_covariance_matrix depending on the flag use_pdferr
- produce_dataset_inputs_covmat_t0_considered(use_t0: bool = False)[source]
Modifies which action is used as experiment_covariance_matrix depending on the flag use_t0
- produce_dataset_inputs_fitting_covmat(theory_covmat_flag=False, use_thcovmat_in_fitting=False)[source]
Produces the correct covmat to be used in fitting_data_dict according to some options: whether to include the theory covmat, whether to separate the multiplicative errors and whether to compute the experimental covmat using the t0 prescription.
- produce_dataset_inputs_sampling_covmat(sep_mult=False, theory_covmat_flag=False, use_thcovmat_in_sampling=False)[source]
Produces the correct covmat to be used in make_replica according to some options: whether to include the theory covmat and whether to separate the multiplicative errors.
- produce_dataspecs_with_matched_cuts(dataspecs)[source]
Take a list of namespaces (dataspecs), resolve dataset within each of them, and return another list of dataspecs where the datasets all have the same cuts, corresponding to the intersection of the selected points. All the datasets must have the same name (i.e. correspond with the same experimental measurement), but can otherwise differ, for example in the theory used for the experimental predictions.
This rule can be combined with matched_datasets_from_dataspecs.
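The intersection of selected points can be sketched with plain index lists. The helper intersect_cuts and the toy cuts are hypothetical; in validphys the cuts are objects resolved from each dataspec rather than bare lists.

```python
def intersect_cuts(cuts_per_dataspec):
    """Return the sorted indices of points selected in every dataspec."""
    common = set(cuts_per_dataspec[0])
    for cuts in cuts_per_dataspec[1:]:
        common &= set(cuts)
    return sorted(common)


# Toy cuts for the same dataset under two different theories (made-up indices).
cuts_theory_a = [0, 1, 2, 3, 5]
cuts_theory_b = [1, 2, 3, 4, 5]
matched_cuts = intersect_cuts([cuts_theory_a, cuts_theory_b])
```

Applying matched_cuts to every dataspec guarantees that the compared predictions refer to exactly the same data points.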
- produce_defaults(q2min=None, w2min=None, maxTau=None, default_filter_settings=None, filter_defaults=None, default_filter_settings_recorded_spec_=None)[source]
Produce default values for filters taking into account the values of q2min, w2min and maxTau defined at namespace level and those inside a filter_defaults mapping.
Within this function the hashable type FilterDefaults is turned into a dictionary so as to allow for overwriting of the values of q2min, w2min and maxTau. The dictionary is then turned back into a FilterDefaults object.
- produce_experiment_from_input(experiment_input, theoryid, use_cuts, fit=None)[source]
Return a mapping containing a single experiment from an experiment input. NOTE: This might be deprecated in the future.
- produce_filter_data(fakedata: bool = False, theorycovmatconfig=None)[source]
Set the action used to filter the data to filter either real or closure data. If the closure data filter is being used and if the theory covariance matrix is not being closure tested then filter data by experiment for efficiency
- produce_fitcontext(fitinputcontext, fitpdf)[source]
Set PDF, theory ID and data input from the fit config
- produce_fitcontextwithcuts(fit, fitinputcontext)[source]
Like fitinputcontext but setting the cuts policy.
- produce_fitenvironment(fit, fitinputcontext)[source]
Like fitcontext, but additionally forcing various other parameters, such as the cuts policy and Monte Carlo seeding to be the same as the fit.
Notes
This production rule is designed to be used as a namespace to collect over, for use with
validphys.pseudodata.recreate_fit_pseudodata()
and can be added to freely, e.g by setting trvlseed to be from the fit runcard.
- produce_fitq0fromfit(fitinputcontext)[source]
Given a fit, return the fitting scale according to the theory
- produce_fitreplicas(fit)[source]
Production rule mapping the
replica
key to each Monte Carlo fit replica.
- produce_fitthcovmat(use_thcovmat_if_present: bool = False, fit: (<class 'str'>, <class 'NoneType'>) = None)[source]
If a fit is specified and use_thcovmat_if_present is True then returns the corresponding covariance matrix for the given fit if it exists. If the fit doesn’t have a theory covariance matrix then returns False.
- produce_fitunderlyinglaw(fit)[source]
Reads closuretest: fakepdf from fit config file and passes as pdf
- produce_group_dataset_inputs_by_metadata(data_input, processed_metadata_group)[source]
Take the data and the processed_metadata_group key and attempt to group the data, returns a list where each element specifies the data_input for a single group and the group_name
- produce_inclusive_use_scalevar_uncertainties(use_scalevar_uncertainties: bool = False, point_prescription: (<class 'str'>, None) = None)[source]
Whether to use a scale variation uncertainty theory covmat. Checks whether a point prescription is included in the runcard and if so assumes scale uncertainties are to be used.
- produce_loaded_theory_covmat(output_path, data_input, theory_covmat_flag=False, use_user_uncertainties=False, use_scalevar_uncertainties=True)[source]
Loads the theory covmat from the correct file according to how it was generated by vp-setupfit.
- produce_loaded_user_covmat_path(user_covmat_path: str = '', use_user_uncertainties: bool = False)[source]
Path to the user covmat provided by user_covmat_path in the runcard. If no path is provided, returns None. For use in theorycovariance.construction.user_covmat.
- produce_matched_datasets_from_dataspecs(dataspecs)[source]
Take an arbitrary list of mappings called dataspecs and return a new list of mappings called dataspecs constructed as follows.
From each of the original dataspecs, resolve the key process, and all the experiments and datasets therein.
Compute the intersection of the dataset names, and for each element in the intersection construct a mapping with the following keys:
- process : A string with the common process name.
- experiment_name : A string with the common experiment name.
- dataset_name : A string with the common dataset name.
- dataspecs : A list of mappings matching the original “dataspecs”. Each mapping contains:
  * dataset: A dataset with the name dataset_name and the properties (cuts, theory, etc) corresponding to the original dataspec.
  * dataset_input: The input line used to build dataset.
  * All the other keys in the original dataspec.
- produce_matched_positivity_from_dataspecs(dataspecs)[source]
Like produce_matched_datasets_from_dataspecs but for positivity datasets.
- produce_multiclosure_underlyinglaw(fits)[source]
Produce the underlying law for a set of fits. This allows a single t0 like covariance matrix to be loaded for all fits, for use with statistical estimators on multiple closure fits. If the fits don’t all have the same underlying law then an error is raised and the offending fit is identified.
- produce_nnfit_theory_covmat(use_thcovmat_in_sampling: bool, use_thcovmat_in_fitting: bool, inclusive_use_scalevar_uncertainties, use_user_uncertainties: bool = False)[source]
Return the theory covariance matrix used in the fit.
- produce_no_covmat_reg()[source]
explicitly set norm_threshold to None so that no covariance matrix regularization is performed
- produce_pdfreplicas(fitpdf)[source]
Production rule mapping the
replica
key to each postfit replica.
- produce_processed_data_grouping(use_thcovmat_in_fitting=False, use_thcovmat_in_sampling=False, data_grouping=None, data_grouping_recorded_spec_=None)[source]
Process the data_grouping key from the runcard, or lockfile. If data_grouping_recorded_spec_ is present then its value is taken, and the runcard is assumed to be a lockfile.
If data_grouping is None, then, if either use_thcovmat_in_fitting or use_thcovmat_in_sampling (or both) are true (which means that the fit is a thcovmat fit), group all the datasets together, otherwise fall back to the default behaviour of grouping by experiment (called standard_report).
Else, the user can specify their own grouping, for example metadata_process.
- produce_processed_metadata_group(processed_data_grouping, metadata_group=None)[source]
Expose the final data grouping result. If metadata_group is specified by the user, it is used; otherwise processed_data_grouping is used, which is experiment by default.
- produce_rules(theoryid, use_cuts, defaults, default_filter_rules=None, filter_rules=None, default_filter_rules_recorded_spec_=None, added_filter_rules: (<class 'tuple'>, <class 'NoneType'>) = None)[source]
Produce filter rules based on the user defined input and defaults.
- produce_scale_variation_theories(theoryid, point_prescription)[source]
Produces a list of theoryids given a theoryid at central scales and a point prescription. The options for the latter are ‘3 point’, ‘5 point’, ‘5bar point’, ‘7 point’ and ‘9 point’. Note that these are defined in arXiv:1906.10698. This hard codes the theories needed for each prescription to avoid user error.
- produce_t0set(t0pdfset=None, use_t0=False)[source]
Return the t0set if use_t0 is True and None otherwise. Raises an error if t0 is requested but no t0set is given.
validphys.convolution module
This module implements tools for computing convolutions between PDFs and theory grids, which yield observables.
The high level predictions()
function can be used to extract theory
predictions for experimentally measured quantities:
import numpy as np
from validphys.api import API
from validphys.convolution import predictions
inp = {
'fit': '181023-001-sc',
'use_cuts': 'internal',
'theoryid': 162,
'pdf': 'NNPDF40_nnlo_lowprecision',
'dataset_inputs': {'from_': 'fit'}
}
all_datasets = API.data(**inp).datasets
pdf = API.pdf(**inp)
all_preds = [predictions(ds, pdf) for ds in all_datasets]
Some variants such as central_predictions()
and
linear_predictions()
are useful for more specialized tasks.
These functions work with validphys.core.DatasetSpec
objects,
allowing to account for information on COMPOUND predictions and cuts. A lower
level interface which operates with validphys.coredata.FKTableData
objects is also available.
- validphys.convolution.central_dis_predictions(loaded_fk, pdf)[source]
Implementation of
central_fk_predictions()
for DIS observables.
- validphys.convolution.central_fk_predictions(loaded_fk, pdf)[source]
Same as
fk_predictions()
, but computing predictions for the central PDF member only.
- validphys.convolution.central_hadron_predictions(loaded_fk, pdf)[source]
Implementation of
central_fk_predictions()
for hadronic observables.
- validphys.convolution.central_predictions(dataset, pdf)[source]
Same as
predictions()
but computing the predictions for the central member of the PDF set only. For Monte Carlo PDFs, this is a faster alternative to computing the central predictions as the average of the replica predictions (although a small approximation is involved in the case of hadronic predictions).
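The nature of the approximation can be seen in a one-point toy model, where a DIS-like observable is linear in the PDF and a hadronic-like observable is quadratic. The kernels and replica values below are made up, and the functions dis and had are hypothetical stand-ins for the actual convolutions.

```python
from statistics import fmean, pvariance

replicas = [0.8, 1.0, 1.3]   # toy PDF values of three replicas at one point
central = fmean(replicas)    # central member taken as the replica average

FK_DIS = 2.0                 # toy kernel, linear in the PDF
FK_HAD = 3.0                 # toy kernel, quadratic in the PDF


def dis(f):
    return FK_DIS * f


def had(f):
    return FK_HAD * f * f


# Difference between averaging the predictions and predicting with the average.
dis_gap = fmean(dis(f) for f in replicas) - dis(central)
had_gap = fmean(had(f) for f in replicas) - had(central)

# For the quadratic case the gap is the kernel times the replica variance.
expected_had_gap = FK_HAD * pvariance(replicas)
```

In this toy model the linear observable shows no gap, while the quadratic one differs by a term proportional to the variance of the replicas, which is small when the PDF uncertainty is small.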
- validphys.convolution.dis_predictions(loaded_fk, pdf)[source]
Implementation of
fk_predictions()
for DIS observables.
- validphys.convolution.fk_predictions(loaded_fk, pdf)[source]
Low level function to compute predictions from a FKTable.
- Parameters
loaded_fk (validphys.coredata.FKTableData) – The FKTable corresponding to the partonic cross section.
pdf (validphys.core.PDF) – The PDF set to use for the convolutions.
- Returns
df – A dataframe corresponding to the hadronic prediction for each data point for the PDF members. The index of the dataframe corresponds to the selected data points (use validphys.coredata.FKTableData.with_cuts() to filter out points). The columns correspond to the selected PDF members in the LHAPDF set.
- Return type
pandas.DataFrame
Notes
This function operates on a single FKTable, while the prediction for an experimental quantity generally involves several. Use predictions() to compute those.
Examples
>>> from validphys.loader import Loader
>>> from validphys.convolution import hadron_predictions
>>> from validphys.fkparser import load_fktable
>>> l = Loader()
>>> pdf = l.check_pdf('NNPDF31_nnlo_as_0118')
>>> ds = l.check_dataset('ATLASTTBARTOT', theoryid=53, cfac=('QCD',))
>>> table = load_fktable(ds.fkspecs[0])
>>> hadron_predictions(table, pdf)
             1           2           3           4    ...         97          98          99          100
data                                                  ...
0     176.688118  170.172930  172.460771  173.792321  ...  179.504636  172.343792  168.372508  169.927820
1     252.682923  244.507916  247.840249  249.541798  ...  256.410844  247.805180  242.246438  244.415529
2     828.076008  813.452551  824.581569  828.213508  ...  838.707211  826.056388  810.310109  816.824167
- validphys.convolution.hadron_predictions(loaded_fk, pdf)[source]
Implementation of
fk_predictions()
for hadronic observables.
- validphys.convolution.linear_fk_predictions(loaded_fk, pdf)[source]
Same as
predictions()
for DIS, but compute linearized predictions for hadronic data, using linear_hadron_predictions().
- validphys.convolution.linear_hadron_predictions(loaded_fk, pdf)[source]
Implementation of
linear_fk_predictions()
for hadronic observables. Specifically this computes:
central_value ⊗ FK ⊗ (2 * replica_values - central_value)
which is the linear expansion of the hadronic observable in the difference between each replica and the central value,
replica_values - central_value
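A small numerical check may help to see why this expression is the linear expansion. Writing each replica as central + delta, the exact quadratic form is central ⊗ FK ⊗ central plus terms linear in delta plus delta ⊗ FK ⊗ delta, and the formula above keeps everything except the last term. The sketch uses a hypothetical two-component toy with a symmetric kernel (an assumption of the example); bilinear is a stand-in for the double convolution and is not validphys code.

```python
def bilinear(a, kernel, b):
    """Toy stand-in for a ⊗ FK ⊗ b: sum over i, j of a_i * K_ij * b_j."""
    return sum(
        a[i] * kernel[i][j] * b[j]
        for i in range(len(a))
        for j in range(len(b))
    )


kernel = [[1.0, 0.5], [0.5, 2.0]]  # symmetric toy kernel (assumption)
central = [1.0, 2.0]
replica = [1.1, 1.9]
delta = [r - c for r, c in zip(replica, central)]

# Exact quadratic prediction for the replica.
exact = bilinear(replica, kernel, replica)

# Linearized prediction: central ⊗ FK ⊗ (2 * replica - central).
shifted = [2 * r - c for r, c in zip(replica, central)]
linear = bilinear(central, kernel, shifted)

# Zeroth plus first order of the expansion in delta, written out explicitly.
first_order = bilinear(central, kernel, central) + 2 * bilinear(central, kernel, delta)

# What the linearization drops: the term quadratic in delta.
remainder = bilinear(delta, kernel, delta)
```

In this toy the difference between exact and linear is exactly the remainder, which is second order in the distance between the replica and the central value.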
- validphys.convolution.linear_predictions(dataset, pdf)[source]
Same as
predictions()
but computing linearized predictions. These are the same as predictions for DIS, but are truncated to the terms that are linear in the difference between each member and the central value for hadronic predictions.
This is generally a very good approximation, in that it yields differences that are much smaller than the PDF uncertainty.
- validphys.convolution.predictions(dataset, pdf)[source]
Compute theory predictions for a given PDF and dataset. Information regarding the dataset, on cuts, CFactors and combinations of FKTables is taken into account to construct the predictions.
The result should be comparable to experimental predictions implemented in CommonData.
- Parameters
dataset (validphys.core.DatasetSpec) – The dataset containing information on the partonic cross section.
pdf (validphys.core.PDF) – The PDF set to use for the convolutions.
- Returns
df – A dataframe corresponding to the hadronic prediction for each data point for the PDF members. The index of the dataframe corresponds to the selected data points, based on the dataset cuts. The columns correspond to the selected PDF members in the LHAPDF set.
- Return type
pandas.DataFrame
Examples
Obtain descriptive statistics over PDF replicas for each of the three points in the ATLAS ttbar dataset:
>>> from validphys.loader import Loader
>>> l = Loader()
>>> ds = l.check_dataset('ATLASTTBARTOT', theoryid=53)
>>> from validphys.convolution import predictions
>>> pdf = l.check_pdf('NNPDF31_nnlo_as_0118')
>>> preds = predictions(ds, pdf)
>>> preds.T.describe()
data            0           1           2
count  100.000000  100.000000  100.000000
mean   161.271292  231.500367  767.816844
std      2.227304    2.883497    7.327617
min    156.638526  225.283254  750.850250
25%    159.652216  229.486793  762.773527
50%    161.066965  231.281248  767.619249
75%    162.620554  233.306836  772.390286
max    168.390840  240.287549  786.549380
validphys.core module
Core datastructures used in the validphys data model.
- class validphys.core.CommonDataSpec(name, metadata, legacy=False, datafile=None, sysfile=None, plotfiles=None)[source]
Bases:
TupleComp
Holds all the information necessary to load a commondata file and provides methods to easily access them
- Parameters
name (str) – name of the commondata
metadata (ObservableMetaData) – instance of ObservableMetaData holding all information about the dataset
legacy (bool) – whether this is an old or new format metadata file
The datafile, sysfile and plotfiles arguments are deprecated and only to be used with legacy=True.
- property legacy_name
- property metadata
- property name
- property ndata
- property nsys
- property plot_kinlabels
- property process_type
- property theory_metadata
- class validphys.core.CutsPolicy(value)[source]
Bases:
Enum
An enumeration.
- FROMFIT = 'fromfit'
- FROM_CUT_INTERSECTION_NAMESPACE = 'fromintersection'
- FROM_SIMILAR_PREDICTIONS_NAMESPACE = 'fromsimilarpredictions'
- INTERNAL = 'internal'
- NOCUTS = 'nocuts'
- class validphys.core.DataGroupSpec(name, datasets, dsinputs=None)[source]
Bases:
TupleComp, NSList
- property as_markdown
- load_commondata_instance()[source]
Given an Experiment, load the list of validphys.coredata.CommonData objects with cuts already applied.
- property thspec
- class validphys.core.DataSetInput(*, name, sys, cfac, frac, weight, custom_group, variant)[source]
Bases:
TupleComp
Represents whatever the user enters in the YAML to specify a dataset.
- class validphys.core.DataSetSpec(*, name, commondata, fkspecs, thspec, cuts, frac=1, op=None, weight=1, rules=())[source]
Bases:
TupleComp
- class validphys.core.FKTableSpec(fkpath, cfactors, metadata=None)[source]
Bases:
TupleComp
Each FKTable is formed by a number of sub-fktables to be concatenated, each of which has its own path. Therefore the fkpath variable is a list of paths.
Before the pineappl implementation, FKTables were already pre-concatenated. The legacy interface therefore relies on fkpath being just a string or path instead.
The metadata of the FKTable for the given dataset is stored as an attribute of this object. This is transitional; eventually it will be held by the associated CommonData in the new format.
- class validphys.core.HessianStats(data, rescale_factor=1)[source]
Bases:
SymmHessianStats
Compute stats in the ‘asymmetric’ Hessian format: the first index (0) is the central value. The odd indexes are the results for lower eigenvectors and the even are the upper eigenvectors.
A ‘rescale_factor’ is allowed in case the eigenvector confidence interval is not 68%.
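To make the member layout concrete, here is a minimal sketch of a symmetrised uncertainty computed from lower/upper eigenvector pairs. It uses the conventional Hessian master formula; that HessianStats applies exactly this expression, and that the rescale factor enters as a divisor, are assumptions. The function name is hypothetical.

```python
import numpy as np

def asymmetric_hessian_std(data, rescale_factor=1.0):
    """Toy estimator. `data` has shape (1 + 2 * n_eig, ...): member 0 is the
    central value, odd members are 'lower' and even members (>= 2) are 'upper'."""
    data = np.asarray(data, dtype=float)
    lower = data[1::2]
    upper = data[2::2]
    half_spread_sq = ((upper - lower) / 2.0) ** 2
    # Dividing by rescale_factor converts to a 68% interval (assumption)
    return np.sqrt(half_spread_sq.sum(axis=0)) / rescale_factor

# Central value followed by two (lower, upper) eigenvector pairs
members = np.array([10.0, 9.0, 11.0, 9.5, 10.5])
std = asymmetric_hessian_std(members)
```

For a set published at a confidence level other than 68%, the caller would pass the corresponding factor as `rescale_factor`.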
- class validphys.core.HyperscanSpec(name, path)[source]
Bases:
FitSpec
The hyperscan spec is just a special case of FitSpec
- get_all_trials(base_params=None)[source]
Read all trials from all tries files. If there are original runcard-based parameters, a reference to them can be passed to the trials so that a full hyperparameter dictionary can be defined
Each hyperopt trial object will also have a reference to all trials in its own file
- label
- name
- path
- sample_trials(n=None, base_params=None, sigma=4.0)[source]
Parse all trials in the hyperscan object and then return an array of n trials read from the tries.json files and sampled according to their reward. If n is None, no sampling is performed and all trials are returned.
- Returns
Dictionary of the form {parameters: list of trials}
- property tries_files
Return a dictionary with all tries.json files mapped to their replica number
- class validphys.core.IntegrabilitySetSpec(name, commondataspec, fkspec, maxlambda, thspec, rules)[source]
Bases:
LagrangeSetSpec
- class validphys.core.LagrangeSetSpec(name, commondataspec, fkspec, maxlambda, thspec, rules)[source]
Bases:
DataSetSpec
Extends DataSetSpec to work around the particularities of the positivity, integrability and other Lagrange Multiplier datasets.
- class validphys.core.PDF(name, boundary=None)[source]
Bases:
TupleComp
Base validphys PDF providing high level access to metadata.
Statistical estimators which depend on the PDF type (MC, Hessian…) are exposed as a Stats object through the stats_class attribute. The LHAPDF metadata can be accessed directly through the info attribute.
Examples
>>> from validphys.api import API
>>> from validphys.convolution import predictions
>>> args = {"dataset_input":{"dataset": "ATLASTTBARTOT"}, "theoryid":162, "use_cuts":"internal"}
>>> ds = API.dataset(**args)
>>> pdf = API.pdf(pdf="NNPDF40_nnlo_as_01180")
>>> preds = predictions(ds, pdf)
>>> preds.shape
(3, 100)
- property alphas_mz
Alpha_s(M_Z) as defined in the LHAPDF .info file
- property alphas_vals
List of alpha_s(Q) at various Q for interpolation based alphas. Values as defined in the LHAPDF .info file
- property error_conf_level
Error confidence level as defined in the LHAPDF .info file. If no number is given in the LHAPDF .info file, it defaults to 68%.
- property error_type
Error type as defined in the LHAPDF .info file
- property info
Information contained in the LHAPDF .info file
- property infopath
- property is_polarized
Returns
True
if the PDF has a boundary condition associated to it. At the moment LHAPDF provides no mechanism to know whether a PDF is polarized.
- property isinstalled
- property label
- property q_min
Minimum Q as given by the LHAPDF .info file
- register_boundary(unpolarized_bc=None)[source]
Register other PDFs as boundary conditions of this PDF
- property stats_class
Return the stats calculator for this error type
- class validphys.core.PDFcv(name, boundary=None)[source]
Bases:
PDF
An add-on for the PDF class that makes only the central value available
- class validphys.core.PositivitySetSpec(name, commondataspec, fkspec, maxlambda, thspec, rules)[source]
Bases:
LagrangeSetSpec
- class validphys.core.Stats(data)[source]
Bases:
object
Class holding statistical information about the objects used in validphys. This object can be a PDF or any function of a PDF (such as a hadronic observable).
By convention, member 0 corresponds to the central value of the PDF. Accordingly, the method central_value will return the result held for member 0. Note that this is equal to the mean of the error_members only for the PDF itself and linear functions of the PDF (such as DIS-type observables). If you want to obtain the average of the error members you can do:
np.mean(stats_instance.error_members, axis=0)
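The distinction between member 0 and the mean of the error members can be illustrated with plain numpy. This is a toy sketch with made-up numbers, not a use of the Stats class: a linear function keeps member 0 equal to the mean, while a quadratic one (standing in for a hadronic observable) does not.

```python
import numpy as np

rng = np.random.default_rng(1)
replicas = rng.normal(loc=1.0, scale=0.2, size=1000)   # toy error members
central = replicas.mean()                              # member 0 by convention

# Stack member 0 on top of the error members, as in the convention above
pdf_like = np.concatenate([[central], replicas])

linear = 3.0 * pdf_like       # linear function of the PDF (DIS-like)
quadratic = pdf_like ** 2     # nonlinear function of the PDF (hadronic-like)

linear_gap = linear[0] - np.mean(linear[1:], axis=0)
quadratic_gap = quadratic[0] - np.mean(quadratic[1:], axis=0)
```

Here `linear_gap` vanishes up to rounding, whereas `quadratic_gap` equals minus the variance of the replicas.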
- class validphys.core.SymmHessianStats(data, rescale_factor=1)[source]
Bases:
Stats
Compute stats in the ‘symmetric’ Hessian format: the first index (0) is the central value. The rest of the indexes are results for each eigenvector. A ‘rescale_factor’ is allowed in case the eigenvector confidence interval is not 68%.
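As an illustration of the symmetric layout, the sketch below adds the eigenvector deviations from the central value in quadrature. This is the usual symmetric Hessian formula; that SymmHessianStats uses exactly this expression, and that the rescale factor divides the result, are assumptions. The function name is hypothetical.

```python
import numpy as np

def symmetric_hessian_std(data, rescale_factor=1.0):
    """Toy estimator. `data` has shape (1 + n_eig, ...): member 0 is the
    central value and each further member is one eigenvector."""
    data = np.asarray(data, dtype=float)
    central, eigenvectors = data[0], data[1:]
    quadrature = np.sqrt(((eigenvectors - central) ** 2).sum(axis=0))
    return quadrature / rescale_factor

members = np.array([10.0, 13.0, 6.0])       # central value and two eigenvectors
std68 = symmetric_hessian_std(members)
# If the eigenvectors had been defined at 90% CL, 1.645 would bring them to 68%
std_from_90 = symmetric_hessian_std(members, rescale_factor=1.645)
```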
- class validphys.core.TheoryIDSpec(id: int, path: pathlib.Path, dbpath: pathlib.Path)[source]
Bases:
object
validphys.coredata module
Data containers backed by Python managed memory (Numpy arrays and Pandas dataframes).
- class validphys.coredata.CFactorData(description: str, central_value: array, uncertainty: array)[source]
Bases:
object
Data contained in a CFactor
- Parameters
description (str) – Information on how the data was obtained.
central_value (array, shape(ndata)) – The value of the cfactor for each data point.
uncertainty (array, shape(ndata)) – The absolute uncertainty on the cfactor if available.
- central_value: array
- uncertainty: array
- class validphys.coredata.CommonData(setname: str, ndata: int, commondataproc: str, nkin: int, nsys: int, commondata_table: DataFrame, systype_table: DataFrame, legacy: bool, legacy_name: Optional[str] = None, kin_variables: Optional[list] = None)[source]
Bases:
object
Data contained in Commondata files, relevant cuts applied.
- Parameters
setname (str) – Name of the dataset
ndata (int) – Number of data points
commondataproc (str) – Process type, one of 21 options
nkin (int) – Number of kinematics specified
nsys (int) – Number of systematics
commondata_table (pd.DataFrame) – Pandas dataframe containing the commondata
systype_table (pd.DataFrame) – Pandas dataframe containing the systype index for each systematic alongside the uncertainty type (ADD/MULT/RAND) and name (CORR/UNCORR/THEORYCORR/SKIP)
systematics_table (pd.DataFrame) – Pandas dataframe containing the table of systematics
- property additive_errors
Returns the systematics which are additive (systype is ADD) as absolute uncertainties (same units as data), with SKIP uncertainties removed.
- property central_values
- commondata_table: DataFrame
- export(folder_path)[source]
Wrapper around export_data and export_uncertainties to write both uncertainties and data after filtering to a given folder
- export_data(buffer)[source]
Exports the central data defined by this commondata instance to the given buffer
- export_uncertainties(buffer)[source]
Exports the uncertainties defined by this commondata instance to the given buffer
- property kinematics
- property multiplicative_errors
Returns the systematics which are multiplicative (systype is MULT) in a percentage format, with SKIP uncertainties removed.
- property stat_errors
- systematic_errors(central_values=None)[source]
Returns all systematic errors as absolute uncertainties, with a single column for each uncertainty. Converts multiplicative_errors to units of data and then appends them onto additive_errors. By default uses the experimental central values to perform the conversion, but the user can supply a 1-D array of central values, with length self.ndata, to use instead of the experimental central values to calculate the absolute contribution of the multiplicative systematics.
- Parameters
central_values (None, np.array) – 1-D array containing alternative central values to combine with multiplicative uncertainties. This array must have length equal to self.ndata. By default central_values is None, and the central values of the commondata are used.
- Returns
systematic_errors – Dataframe containing systematic errors.
- Return type
pd.DataFrame
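The conversion described for systematic_errors can be sketched with plain pandas. This is an illustration under stated assumptions, not the CommonData code: the column names, the numbers and the helper toy_systematic_errors are all hypothetical.

```python
import numpy as np
import pandas as pd

central_values = np.array([100.0, 200.0, 50.0])     # toy data, ndata = 3

# Additive systematics: already absolute (same units as the data)
additive = pd.DataFrame({"sys_add_1": [1.0, 2.0, 0.5]})
# Multiplicative systematics: percentages of the central value
multiplicative = pd.DataFrame({"sys_mult_1": [2.0, 1.0, 4.0]})

def toy_systematic_errors(additive, multiplicative, central_values):
    # Percentage -> absolute: multiply row-wise by the central value, divide by 100
    converted = multiplicative.mul(central_values, axis=0) / 100.0
    # Append the converted columns onto the additive ones
    return pd.concat([additive, converted], axis=1)

absolute = toy_systematic_errors(additive, multiplicative, central_values)
# Alternative central values (e.g. theory predictions) change only the MULT columns
alternative = toy_systematic_errors(
    additive, multiplicative, np.array([90.0, 210.0, 55.0])
)
```

Supplying alternative central values leaves the additive columns untouched, which is why the same routine can serve for a t0-style construction.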
- systype_table: DataFrame
- class validphys.coredata.FKTableData(hadronic: bool, Q0: float, ndata: int, xgrid: ~numpy.ndarray, sigma: ~pandas.core.frame.DataFrame, convolution_types: ~typing.Optional[tuple[str]] = None, metadata: dict = <factory>, protected: bool = False)[source]
Bases:
object
Data contained in an FKTable
- Parameters
hadronic (bool) – Whether a hadronic (two PDFs) or a DIS (one PDF) convolution is needed.
Q0 (float) – The scale at which the PDFs should be evaluated (in GeV).
ndata (int) – The number of data points in the grid.
xgrid (array, shape (nx)) – The points in x at which the PDFs should be evaluated.
sigma (pd.DataFrame) –
For hadronic data, the columns are the indexes in the NfxNf list of possible flavour combinations of two PDFs. The MultiIndex contains three keys: the data index, an index into xgrid for the first PDF and an index into xgrid for the second PDF, indicating the points in x where the PDF should be evaluated.
For DIS data, the columns are indexes in the Nf list of flavours. The MultiIndex contains two keys, the data index and an index into xgrid indicating the points in x where the PDF should be evaluated.
convolution_types (tuple[str]) – The type of convolution that the FkTable is expecting for each of the functions to be convolved with (usually the two types of PDF from the two incoming hadrons).
metadata (dict) – Other information contained in the FKTable.
protected (bool) – When a fktable is protected cuts will not be applied. The most common use-case is when a total cross section is used as a normalization table for a differential cross section, in legacy code (<= NNPDF4.0) both fktables would be cut using the differential index.
- determine_pdfs(pdf)[source]
Determine the PDF (or PDFs) that should be used to be convoluted with this fktable. Uses the convolution_types key to decide the PDFs. If convolution_types is not defined, it returns the pdf object.
- get_np_fktable()[source]
Returns the fktable as a dense numpy array that can be directly manipulated with numpy
- The return shape is:
(ndata, nx, nbasis) for DIS
(ndata, nx, nx, nbasis) for hadronic
where nx is the length of the xgrid and nbasis the number of flavour contributions that contribute
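With dense arrays of the shapes documented above, a convolution reduces to a tensor contraction. The sketch below uses random toy arrays and a hypothetical flavour-pair mapping; it only illustrates the index structure and does not call get_np_fktable or reproduce how validphys performs the convolution.

```python
import numpy as np

rng = np.random.default_rng(2)
ndata, nx, nf, nrep = 3, 4, 2, 5            # toy sizes (all hypothetical)

# PDF values on the grid: (replica, x, flavour)
pdf_values = rng.random((nrep, nx, nf))

# DIS: table of shape (ndata, nx, nbasis), contracted with one PDF
fk_dis = rng.random((ndata, nx, nf))
dis_pred = np.einsum("ixa,rxa->ir", fk_dis, pdf_values)

# Hadronic: table of shape (ndata, nx, nx, nbasis), nbasis pairs of flavours
pairs = np.array([[0, 0], [0, 1], [1, 1]])  # toy flavour-pair mapping
fk_had = rng.random((ndata, nx, nx, len(pairs)))
f1 = pdf_values[:, :, pairs[:, 0]]          # (replica, x1, pair)
f2 = pdf_values[:, :, pairs[:, 1]]          # (replica, x2, pair)
had_pred = np.einsum("ixyb,rxb,ryb->ir", fk_had, f1, f2)
```

Both results have shape (ndata, nrep), one prediction per data point and replica.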
- property luminosity_mapping
Return the flavour combinations that contribute to the fktable in the form of a single array
- The return shape is:
(nbasis,) for DIS
(nbasis*2,) for hadronic
- sigma: DataFrame
- with_cfactor(cfactor)[source]
Returns a copy of the FKTableData object with cfactors applied to the fktable
- with_cuts(cuts)[source]
Return a copy of the FKTable with the cuts applied. The data index of the sigma operator (the outermost level) contains the data points that have been kept. The ndata property is updated to reflect the new number of data points. If cuts is None, return the object unmodified.
- Parameters
cuts (array_like or validphys.core.Cuts or None.) – The cuts to be applied.
- Returns
res – A copy of the FKTable with the cuts applied.
- Return type
Notes
The original number of points can be accessed with table.metadata['GridInfo'].ndata.
Examples
>>> from validphys.fkparser import load_fktable
>>> from validphys.loader import Loader
>>> l = Loader()
>>> ds = l.check_dataset('ATLASTTBARTOT', theoryid=53, cfac=('QCD',))
>>> table = load_fktable(ds.fkspecs[0])
>>> newtable = table.with_cuts([0,1])
>>> assert set(newtable.sigma.index.get_level_values(0)) == {0,1}
>>> assert newtable.ndata == 2
>>> assert newtable.metadata['GridInfo'].ndata == 3
- xgrid: ndarray
validphys.correlations module
Utilities for computing correlations in batch.
@author: Zahari Kassabov
validphys.covmats module
Module for handling logic and manipulation of covariance and correlation matrices on different levels of abstraction
- validphys.covmats.covmat_from_systematics(loaded_commondata_with_cuts, dataset_input, use_weights_in_covmat=True, norm_threshold=None, _central_values=None)[source]
Take the statistical uncertainty and systematics table from a validphys.coredata.CommonData object and construct the covariance matrix accounting for correlations between systematics.
If the systematic has the name SKIP then it is ignored in the construction of the covariance matrix.
ADDitive or MULTiplicative systypes are handled by taking the additive or multiplicative uncertainties respectively. We convert uncertainties so that they are all in the same units as the data:
Additive (ADD) systematics are left unchanged
Multiplicative (MULT) systematics need to be converted from a percentage by multiplying by the central value and dividing by 100.
Finally, the systematics are split into the five possible archetypes of systematic uncertainties: uncorrelated (UNCORR), correlated (CORR), theory uncorrelated (THEORYUNCORR), theory correlated (THEORYCORR) and special correlated (SPECIALCORR) systematics.
Uncorrelated contributions from statistical error, uncorrelated and theory uncorrelated are added in quadrature to the diagonal of the covmat.
The contribution to the covariance matrix arising due to correlated systematics is schematically A_correlated @ A_correlated.T, where A_correlated is a matrix N_dat by N_sys. The total contribution from correlated systematics is found by adding together the result of multiplying each correlated systematic matrix by its transpose (correlated, theory_correlated and special_correlated).
For more information on the generation of the covariance matrix see the paper outlining the procedure, specifically equation 2 and surrounding text.
- Parameters
loaded_commondata_with_cuts (validphys.coredata.CommonData) – CommonData which stores information about systematic errors, their treatment and description.
dataset_input (validphys.core.DataSetInput) – Dataset settings, contains the weight for the current dataset. The returned covmat will be divided by the dataset weight if use_weights_in_covmat. The default weight is 1, which means the returned covmat will be unmodified.
use_weights_in_covmat (bool) – Whether to weight the covmat, True by default.
norm_threshold (number) – threshold used to regularize covariance matrix
_central_values (None, np.array) – 1-D array containing alternative central values to combine with the multiplicative errors to calculate their absolute contributions. By default this is None, and the experimental central values are used. However, this can be used to calculate, for example, the t0 covariance matrix by using the predictions from the central member of the t0 pdf.
- Returns
cov_mat – Numpy array which is N_dat x N_dat (where N_dat is the number of data points after cuts) containing uncertainty and correlation information.
- Return type
np.array
Example
In order to use this function, simply call it from the API
>>> from validphys.api import API
>>> inp = dict(
...     dataset_input={'dataset': 'CMSZDIFF12', 'cfac':('QCD', 'NRM'), 'sys':10},
...     theoryid=162,
...     use_cuts="internal"
... )
>>> cov = API.covmat_from_systematics(**inp)
>>> cov.shape
(28, 28)
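The construction described for covmat_from_systematics can be sketched in a few lines of numpy. This is a simplified illustration under stated assumptions (all uncertainties already absolute, a single family of correlated systematics, toy numbers) and not the validphys code; toy_covmat is a hypothetical helper.

```python
import numpy as np

stat = np.array([1.0, 2.0, 1.5])              # statistical errors (absolute)
uncorr = np.array([[0.5], [0.2], [0.1]])      # uncorrelated columns, (N_dat, n_unc)
corr = np.array([[1.0, 0.3],
                 [0.8, 0.1],
                 [0.5, 0.4]])                 # correlated columns, (N_dat, N_sys)

def toy_covmat(stat, uncorr, corr, weight=1.0):
    # Statistical and uncorrelated errors are added in quadrature on the diagonal
    diagonal = stat**2 + (uncorr**2).sum(axis=1)
    # Correlated systematics contribute A_correlated @ A_correlated.T
    cov = np.diag(diagonal) + corr @ corr.T
    # Dividing by the dataset weight; weight = 1 leaves the matrix unchanged
    return cov / weight

cov = toy_covmat(stat, uncorr, corr)
```

The result is symmetric and positive definite by construction, since it is a positive diagonal plus a Gram matrix.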
- validphys.covmats.covmat_stability_characteristic(systematics_matrix_from_commondata)[source]
Return a number characterizing the stability of an experimental covariance matrix against uncertainties in the correlation. It is defined as the L2 norm (largest singular value) of the square root of the inverse correlation matrix. This is equivalent to the square root of the inverse of the smallest singular value of the correlation matrix:
Z = (1/λ⁰)^½
Where λ⁰ is the smallest eigenvalue of the correlation matrix.
This is the number used as threshold in calcutils.regularize_covmat(). The interpretation is roughly what precision the worst correlation needs to have in order not to meaningfully affect the χ² computed using the covariance matrix, so for example a stability characteristic of 4 means that correlations need to be known with uncertainties less than 0.25.
Examples
>>> from validphys.api import API
>>> API.covmat_stability_characteristic(dataset_input={"dataset": "NMC"},
...     theoryid=162, use_cuts="internal")
2.742658604186114
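The definition Z = (1/λ⁰)^½ can be evaluated directly on a small matrix. The sketch below starts from a toy covariance matrix rather than from the systematics matrix that the real function receives, so it illustrates the formula only; toy_stability_characteristic is a hypothetical name.

```python
import numpy as np

def toy_stability_characteristic(cov):
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)               # correlation matrix
    smallest = np.linalg.eigvalsh(corr)[0]    # eigenvalues come in ascending order
    return 1.0 / np.sqrt(smallest)

# Two points with correlation coefficient 0.9: eigenvalues are 0.1 and 1.9
cov = np.array([[4.0, 1.8],
                [1.8, 1.0]])
z = toy_stability_characteristic(cov)
```

Strongly correlated points give a small λ⁰ and hence a large Z, which is the regime the regularization threshold targets.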
- validphys.covmats.dataset_inputs_covmat_from_systematics(dataset_inputs_loaded_cd_with_cuts, data_input, use_weights_in_covmat=True, norm_threshold=None, _list_of_central_values=None, _only_additive=False)[source]
Given a list containing validphys.coredata.CommonData objects, construct the full covariance matrix.
This is similar to covmat_from_systematics() except that special corr systematics are concatenated across all datasets before being multiplied by their transpose to give off block-diagonal contributions. The other systematics contribute to the block diagonal in the same way as covmat_from_systematics().
- Parameters
dataset_inputs_loaded_cd_with_cuts (list[validphys.coredata.CommonData]) – list of CommonData objects.
data_input (list[validphys.core.DataSetInput]) – Settings for each dataset, each element contains the weight for the current dataset. The elements of the returned covmat for dataset i and j will be divided by sqrt(weight_i)*sqrt(weight_j), if use_weights_in_covmat. The default weight is 1, which means the returned covmat will be unmodified.
use_weights_in_covmat (bool) – Whether to weight the covmat, True by default.
norm_threshold (number) – threshold used to regularize covariance matrix
_list_of_central_values (None, list[np.array]) – list of 1-D arrays which contain alternative central values which are combined with the multiplicative errors to calculate their absolute contribution. By default this is None and the experimental central values are used.
- Returns
cov_mat – Numpy array which is N_dat x N_dat (where N_dat is the number of data points after cuts) containing uncertainty and correlation information.
- Return type
np.array
Example
This function can be called directly from the API:
>>> from validphys.api import API
>>> dsinps = [
...     {'dataset': 'NMC'},
...     {'dataset': 'ATLASTTBARTOT', 'cfac':['QCD']},
...     {'dataset': 'CMSZDIFF12', 'cfac':('QCD', 'NRM'), 'sys':10}
... ]
>>> inp = dict(dataset_inputs=dsinps, theoryid=162, use_cuts="internal")
>>> cov = API.dataset_inputs_covmat_from_systematics(**inp)
>>> cov.shape
(235, 235)
Which properly accounts for all dataset settings and cuts.
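How special correlated systematics populate the off-diagonal blocks, and how the weights enter, can be sketched with two toy datasets. Everything here is hypothetical (sizes, numbers, a single shared systematic); the real function also has to match systematics by name across datasets, which this sketch skips.

```python
import numpy as np

# Two toy datasets with 2 and 3 points
diag_a, diag_b = np.array([1.0, 1.0]), np.array([2.0, 2.0, 2.0])
corr_a = np.array([[0.5], [0.4]])              # CORR: confined to dataset A
corr_b = np.array([[0.3], [0.2], [0.1]])       # CORR: confined to dataset B
special_a = np.array([[0.2], [0.2]])           # SPECIALCORR: same source in A...
special_b = np.array([[0.1], [0.1], [0.1]])    # ...and in B

n_a, n_b = len(diag_a), len(diag_b)
cov = np.zeros((n_a + n_b, n_a + n_b))
cov[:n_a, :n_a] = np.diag(diag_a) + corr_a @ corr_a.T    # diagonal block for A
cov[n_a:, n_a:] = np.diag(diag_b) + corr_b @ corr_b.T    # diagonal block for B

# Concatenate along the data axis first, then multiply by the transpose:
# this is what fills the blocks linking A and B
special = np.concatenate([special_a, special_b], axis=0)
cov += special @ special.T

# Weighting: element (i, j) is divided by sqrt(weight_i) * sqrt(weight_j)
weights = np.array([1.0] * n_a + [4.0] * n_b)
sqrt_w = np.sqrt(weights)
weighted = cov / np.outer(sqrt_w, sqrt_w)
```

Without the concatenation step the matrix would be purely block diagonal, which is the case covered by covmat_from_systematics().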
- validphys.covmats.dataset_inputs_exp_covmat(dataset_inputs_loaded_cd_with_cuts, *, data_input, use_weights_in_covmat=True, norm_threshold=None)[source]
Function to compute the covmat to be used for the sampling by make_replica and for the chi2 by fitting_data_dict. In this case the t0 prescription is not used for the experimental covmat and the multiplicative errors are included in it.
- validphys.covmats.dataset_inputs_exp_covmat_separate(dataset_inputs_loaded_cd_with_cuts, *, data_input, use_weights_in_covmat=True, norm_threshold=None)[source]
Function to compute the covmat to be used for the sampling by make_replica. In this case the t0 prescription is not used for the experimental covmat and the multiplicative errors are separated.
- validphys.covmats.dataset_inputs_sqrt_covmat(dataset_inputs_covariance_matrix)[source]
Like sqrt_covmat but for a group of datasets
- validphys.covmats.dataset_inputs_stability_table(dataset_inputs_stability, dataset_inputs)[source]
Return a table with covmat_stability_characteristic() for all dataset inputs
- validphys.covmats.dataset_inputs_t0_covmat_from_systematics(dataset_inputs_loaded_cd_with_cuts, *, data_input, use_weights_in_covmat=True, norm_threshold=None, dataset_inputs_t0_predictions)[source]
Like t0_covmat_from_systematics() except for all data.
- Parameters
dataset_inputs_loaded_cd_with_cuts (list[validphys.coredata.CommonData]) – The CommonData for all datasets defined in dataset_inputs.
data_input (list[validphys.core.DataSetInput]) – Settings for each dataset, each element contains the weight for the current dataset. The elements of the returned covmat for dataset i and j will be divided by sqrt(weight_i)*sqrt(weight_j), if use_weights_in_covmat. The default weight is 1, which means the returned covmat will be unmodified.
use_weights_in_covmat (bool) – Whether to weight the covmat, True by default.
dataset_inputs_t0_predictions (list[np.array]) – The t0 predictions for all datasets.
- Returns
t0_covmat – t0 covariance matrix for the list of datasets.
- Return type
np.array
- validphys.covmats.dataset_inputs_t0_exp_covmat(dataset_inputs_loaded_cd_with_cuts, *, data_input, use_weights_in_covmat=True, norm_threshold=None, dataset_inputs_t0_predictions)[source]
Function to compute the covmat to be used for the sampling by make_replica and for the chi2 by fitting_data_dict. In this case the t0 prescription is used for the experimental covmat and the multiplicative errors are included in it.
- validphys.covmats.dataset_inputs_t0_exp_covmat_separate(dataset_inputs_loaded_cd_with_cuts, *, data_input, use_weights_in_covmat=True, norm_threshold=None, dataset_inputs_t0_predictions)[source]
Function to compute the covmat to be used for the sampling by make_replica. In this case the t0 prescription is used for the experimental covmat and the multiplicative errors are separated.
- validphys.covmats.dataset_inputs_t0_total_covmat(dataset_inputs_t0_exp_covmat, loaded_theory_covmat)[source]
Function to compute the covmat to be used for the sampling by make_replica and for the chi2 by fitting_data_dict. In this case the t0 prescription is used for the experimental covmat and the multiplicative errors are included in it. Moreover, the theory covmat is added to experimental covmat.
- validphys.covmats.dataset_inputs_t0_total_covmat_separate(dataset_inputs_t0_exp_covmat_separate, loaded_theory_covmat)[source]
Function to compute the covmat to be used for the sampling by make_replica. In this case the t0 prescription is used for the experimental covmat and the multiplicative errors are separated. Moreover, the theory covmat is added to experimental covmat.
- validphys.covmats.dataset_inputs_total_covmat(dataset_inputs_exp_covmat, loaded_theory_covmat)[source]
Function to compute the covmat to be used for the sampling by make_replica and for the chi2 by fitting_data_dict. In this case the t0 prescription is not used for the experimental covmat and the multiplicative errors are included in it. Moreover, the theory covmat is added to experimental covmat.
- validphys.covmats.dataset_inputs_total_covmat_separate(dataset_inputs_exp_covmat_separate, loaded_theory_covmat)[source]
Function to compute the covmat to be used for the sampling by make_replica. In this case the t0 prescription is not used for the experimental covmat and the multiplicative errors are separated. Moreover, the theory covmat is added to experimental covmat.
- validphys.covmats.dataset_t0_predictions(dataset, t0set)[source]
Returns the t0 predictions for a dataset, which are the predictions calculated using the central member of pdf. Note that if pdf has errortype replicas, and the dataset is a hadronic observable, then the predictions of the central member are subtly different to the central value of the replica predictions.
- Parameters
dataset (validphys.core.DataSetSpec) – dataset for which to calculate t0 predictions
t0set (validphys.core.PDF) – pdf used to calculate the predictions
- Returns
t0_predictions – 1-D numpy array with predictions for each of the cut datapoints.
- Return type
np.array
- validphys.covmats.datasets_covmat_differences_table(each_dataset, datasets_covmat_no_reg, datasets_covmat_reg, norm_threshold)[source]
For each dataset calculate and tabulate two max differences upon regularization given a value for norm_threshold:
max relative difference to the diagonal of the covariance matrix (%)
max absolute difference to the correlation matrix of each covmat
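The two quantities tabulated by datasets_covmat_differences_table can be sketched as follows. This is an illustration only: the choice of normalising the diagonal difference by the unregularized matrix is an assumption, the helper names are hypothetical, and the "regularized" matrix is a hand-made stand-in rather than the output of an actual regularization.

```python
import numpy as np

def corrmat(cov):
    d = np.sqrt(np.diag(cov))
    return cov / np.outer(d, d)

def toy_covmat_differences(cov_no_reg, cov_reg):
    diag_no_reg, diag_reg = np.diag(cov_no_reg), np.diag(cov_reg)
    # Max relative difference on the diagonal, in percent
    max_rel_diag = 100.0 * np.max(np.abs(diag_reg - diag_no_reg) / diag_no_reg)
    # Max absolute difference between the two correlation matrices
    max_abs_corr = np.max(np.abs(corrmat(cov_reg) - corrmat(cov_no_reg)))
    return max_rel_diag, max_abs_corr

cov_no_reg = np.array([[1.0, 0.99], [0.99, 1.0]])
cov_reg = np.array([[1.1, 0.99], [0.99, 1.1]])    # stand-in for a regularized matrix
rel, corr_diff = toy_covmat_differences(cov_no_reg, cov_reg)
```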
- validphys.covmats.dataspecs_datasets_covmat_differences_table(dataspecs_speclabel, dataspecs_covmat_diff_tables)[source]
For each dataspec calculate and tabulate the two covmat differences described in datasets_covmat_differences_table (max relative difference in variance and max absolute correlation difference)
- validphys.covmats.fit_name_with_covmat_label(fit, fitthcovmat)[source]
If theory covariance matrix is being used to calculate statistical estimators for the fit then appends (exp + th) onto the fit name for use in legends and column headers to help the user see what covariance matrix was used to produce the plot or table they are looking at.
- validphys.covmats.generate_exp_covmat(datasets_input, data, use_weights, norm_threshold, _list_of_c_values, only_add)[source]
Function to generate the experimental covmat, optionally using the t0 prescription. It is also possible to compute it only with the additive errors.
- Parameters
dataset_inputs (list[validphys.coredata.CommonData]) – list of CommonData objects.
data (list[validphys.core.DataSetInput]) – Settings for each dataset, each element contains the weight for the current dataset. The elements of the returned covmat for dataset i and j will be divided by sqrt(weight_i)*sqrt(weight_j), if use_weights_in_covmat. The default weight is 1, which means the returned covmat will be unmodified.
use_weights (bool) – Whether to weight the covmat, True by default.
norm_threshold (number) – threshold used to regularize covariance matrix
_list_of_c_values (None, list[np.array]) – list of 1-D arrays which contain alternative central values which are combined with the multiplicative errors to calculate their absolute contribution. By default this is None and the experimental central values are used.
only_add (bool) – specifies whether to use only the additive errors to compute the covmat
- Returns
np.array
experimental covariance matrix
- validphys.covmats.groups_corrmat(groups_covmat)[source]
Generates the grouped experimental correlation matrix with groups_covmat as input
- validphys.covmats.groups_covmat(groups_covmat_no_table)[source]
Duplicate of groups_covmat_no_table but with a table decorator.
- validphys.covmats.groups_covmat_no_table(groups_data, groups_index, groups_covmat_collection)[source]
Export the covariance matrix for the groups. It exports the full (symmetric) matrix, with the 3 first rows and columns being:
group name
dataset name
index of the point within the dataset.
- validphys.covmats.groups_invcovmat(groups_data, groups_index, groups_covmat_collection)[source]
Compute and export the inverse covariance matrix. Note that this inverts the matrices with the LU method which is suboptimal.
- validphys.covmats.groups_normcovmat(groups_covmat, groups_data_values)[source]
Calculates the grouped experimental covariance matrix normalised to data.
- validphys.covmats.groups_sqrtcovmat(groups_data, groups_index, groups_sqrt_covmat)[source]
Like groups_covmat, but dump the lower triangular part of the Cholesky decomposition as used in the fit. The upper part indices are set to zero.
- validphys.covmats.pdferr_plus_covmat(results_without_covmat, pdf, covmat_t0_considered)[source]
For a given dataset, returns the sum of the covariance matrix given by covmat_t0_considered and the PDF error:
- If the PDF error_type is ‘replicas’, a covariance matrix is estimated from the replica theory predictions
- If the PDF error_type is ‘symmhessian’, a covariance matrix is estimated using formulas from (mc2hessian) https://arxiv.org/pdf/1505.06736.pdf
- If the PDF error_type is ‘hessian’ a covariance matrix is estimated using the hessian formula from Eq. 5 of https://arxiv.org/pdf/1401.0013.pdf
- Parameters
dataset (DataSetSpec) – object parsed from the dataset_input runcard key
pdf (PDF) – monte carlo pdf used to estimate PDF error
covmat_t0_considered (np.array) – experimental covariance matrix with the t0 considered
- Returns
covariance_matrix – sum of the experimental and pdf error as a numpy array
- Return type
np.array
Examples
use_pdferr makes this action be used for covariance_matrix
>>> from validphys.api import API
>>> import numpy as np
>>> inp = {
...     'dataset_input': {
...         'dataset': 'ATLAS_TTBAR_8TEV_LJ_DIF_YTTBAR-NORM',
...         'variant': 'legacy',
...     },
...     'theoryid': 700,
...     'pdf': 'NNPDF40_nlo_as_01180',
...     'use_cuts': 'internal',
... }
>>> a = API.covariance_matrix(**inp, use_pdferr=True)
>>> b = API.pdferr_plus_covmat(**inp)
>>> (a == b).all()
True
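For the ‘replicas’ case, the idea of adding a PDF covariance estimated from replica predictions can be sketched with numpy. The predictions and the experimental matrix below are random stand-ins, and the use of the unbiased sample covariance (np.cov defaults) is an assumption about the estimator rather than a statement of what pdferr_plus_covmat does internally.

```python
import numpy as np

rng = np.random.default_rng(3)
ndata, nrep = 3, 200
# Toy replica predictions, shape (ndata, nreplicas), central member excluded
replica_preds = rng.normal(
    loc=[[10.0], [20.0], [30.0]], scale=0.5, size=(ndata, nrep)
)

exp_covmat = np.diag([1.0, 2.0, 3.0])    # stand-in for covmat_t0_considered

# Each row is one data point, each column one replica
pdf_covmat = np.cov(replica_preds)
total_covmat = exp_covmat + pdf_covmat
```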
- validphys.covmats.pdferr_plus_dataset_inputs_covmat(dataset_inputs_results_without_covmat, data, pdf, dataset_inputs_covmat_t0_considered, fitthcovmat)[source]
Like pdferr_plus_covmat except for an experiment
- validphys.covmats.reorder_thcovmat_as_expcovmat(fitthcovmat, data)[source]
Reorder the thcovmat in such a way to match the order of the experimental covmat, which means the order of the runcard
- validphys.covmats.sqrt_covmat(covariance_matrix)[source]
Function that computes the square root of the covariance matrix.
- Parameters
covariance_matrix (np.array) – A positive definite covariance matrix, which is N_dat x N_dat (where N_dat is the number of data points after cuts) containing uncertainty and correlation information.
- Returns
sqrt_mat – The square root of the input covariance matrix, which is N_dat x N_dat (where N_dat is the number of data points after cuts), and which is the lower triangular decomposition. The following should be True: np.allclose(sqrt_covmat @ sqrt_covmat.T, covariance_matrix).
- Return type
np.array
Notes
The square root is found by using the Cholesky decomposition. However, rather than finding the decomposition of the covariance matrix directly, the (upper triangular) decomposition is found of the corresponding correlation matrix and then the output of this is rescaled and then transposed as sqrt_matrix = (decomp * sqrt_diags).T, where decomp is the Cholesky decomposition of the correlation matrix and sqrt_diags is the square root of the diagonal entries of the covariance matrix. This method is useful in situations in which the covariance matrix is near-singular. See here for more discussion on this.
The lower triangular is useful for efficient calculation of the \(\chi^2\).
Example
>>> import numpy as np
>>> from validphys.api import API
>>> API.sqrt_covmat(dataset_input={"dataset":"NMC"}, theoryid=162, use_cuts="internal")
array([[0.0326543 , 0.        , 0.        , ..., 0.        , 0.        , 0.        ],
       [0.00314523, 0.01467259, 0.        , ..., 0.        , 0.        , 0.        ],
       [0.0037817 , 0.00544256, 0.02874822, ..., 0.        , 0.        , 0.        ],
       ...,
       [0.00043404, 0.00031169, 0.00020489, ..., 0.00441073, 0.        , 0.        ],
       [0.00048717, 0.00033792, 0.00022971, ..., 0.00126704, 0.00435696, 0.        ],
       [0.00067353, 0.00050372, 0.0003203 , ..., 0.00107255, 0.00065041, 0.01002952]])
>>> sqrt_cov = API.sqrt_covmat(dataset_input={"dataset":"NMC"}, theoryid=162, use_cuts="internal")
>>> cov = API.covariance_matrix(dataset_input={"dataset":"NMC"}, theoryid=162, use_cuts="internal")
>>> np.allclose(np.linalg.cholesky(cov), sqrt_cov)
True
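The rescaling described in the Notes can be reproduced on a toy matrix without loading any data. The sketch below is an illustration under stated assumptions: the function name is hypothetical, and it uses np.linalg.cholesky (which returns the lower factor) and transposes it to obtain the upper triangular decomp, which may differ from the routine validphys actually calls.

```python
import numpy as np


def sqrt_covmat_sketch(covariance_matrix):
    """Hypothetical re-implementation of the correlation-matrix route."""
    sqrt_diags = np.sqrt(np.diag(covariance_matrix))
    correlation_matrix = covariance_matrix / np.outer(sqrt_diags, sqrt_diags)
    # np.linalg.cholesky gives the lower factor; transpose to get the upper one.
    decomp = np.linalg.cholesky(correlation_matrix).T
    # Broadcasting rescales column j of decomp by sqrt_diags[j]; the final
    # transpose yields the lower triangular square root of the covmat.
    return (decomp * sqrt_diags).T


rng = np.random.default_rng(1)
a = rng.normal(size=(5, 8))
cov = a @ a.T + 0.1 * np.eye(5)  # toy positive definite covariance matrix
sqrt_cov = sqrt_covmat_sketch(cov)
```

For a well-conditioned matrix the result agrees with np.linalg.cholesky(cov); the detour through the correlation matrix only matters numerically when the entries span very different scales.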
- validphys.covmats.systematics_matrix_from_commondata(loaded_commondata_with_cuts, dataset_input, use_weights_in_covmat=True, _central_values=None)[source]
Returns a systematics matrix, \(A\), for the corresponding dataset. The systematics matrix is a square root of the covmat:
\[C = A A^T\]
and is obtained by concatenating a block diagonal of the uncorrelated uncertainties with the correlated systematics.
- validphys.covmats.t0_covmat_from_systematics(loaded_commondata_with_cuts, *, dataset_input, use_weights_in_covmat=True, norm_threshold=None, dataset_t0_predictions)[source]
Like covmat_from_systematics() except uses the t0 predictions to calculate the absolute contributions to the covmat from multiplicative uncertainties. For more info on the t0 predictions see validphys.commondata.dataset_t0_predictions().
- Parameters
loaded_commondata_with_cuts (validphys.coredata.CommonData) – commondata object for which to generate the covmat.
dataset_input (validphys.core.DataSetInput) – Dataset settings, contains the weight for the current dataset. The returned covmat will be divided by the dataset weight if use_weights_in_covmat is True. The default weight is 1, which means the returned covmat will be unmodified.
use_weights_in_covmat (bool) – Whether to weight the covmat, True by default.
dataset_t0_predictions (np.array) – 1-D array with t0 predictions.
- Returns
t0_covmat – t0 covariance matrix
- Return type
np.array
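The role of the t0 predictions can be illustrated with a toy calculation. The sketch below is not the validphys code: the function name and argument layout are hypothetical, it assumes multiplicative uncertainties are quoted as percentages of the central value, and it treats every systematic as fully correlated for brevity.

```python
import numpy as np


def t0_covmat_sketch(stat, add_sys, mult_sys_percent, t0_predictions, weight=1.0):
    """Hypothetical t0 covmat: multiplicative errors scaled by t0 predictions."""
    # Assumption: multiplicative uncertainties are percentages, converted to
    # absolute values with the t0 predictions instead of the measured data.
    mult_abs = mult_sys_percent * t0_predictions[:, np.newaxis] / 100
    sys_abs = np.concatenate([add_sys, mult_abs], axis=1)
    covmat = np.diag(stat**2) + sys_abs @ sys_abs.T
    # Divide by the dataset weight, as described for use_weights_in_covmat.
    return covmat / weight


stat = np.array([1.0, 1.0])
add_sys = np.array([[1.0], [0.0]])      # one additive systematic
mult_sys = np.array([[10.0], [10.0]])   # one 10% multiplicative systematic
t0 = np.array([10.0, 20.0])             # toy t0 predictions
t0_cov = t0_covmat_sketch(stat, add_sys, mult_sys, t0)
t0_cov_weighted = t0_covmat_sketch(stat, add_sys, mult_sys, t0, weight=2.0)
```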
validphys.covmats_utils module
covmat_utils.py
Utility functions for constructing covariance matrices from systematics. Leveraged by validphys.covmats which contains relevant actions/providers.
- validphys.covmats_utils.construct_covmat(stat_errors: array, sys_errors: DataFrame)[source]
Basic function to construct a covariance matrix (covmat), given the statistical error and a dataframe of systematics.
Errors with name UNCORR or THEORYUNCORR are added in quadrature with the statistical error to the diagonal of the covmat.
Other systematics are treated as correlated; their covmat contribution is found by multiplying them by their transpose.
- Parameters
stat_errors (np.array) – a 1-D array of statistical uncertainties
sys_errors (pd.DataFrame) – a dataframe with shape (N_data * N_sys) and systematic name as the column headers. The uncertainties should be in the same units as the data.
Notes
This function doesn’t contain any logic to ignore certain contributions to the covmat. To exclude a particular systematic or set of systematics (e.g. all uncertainties with MULT errors), filter those out of sys_errors before passing it to this function.
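The construction described above can be sketched in a few lines of NumPy and pandas. This is an illustrative re-implementation under assumptions, not the library code: the function name is hypothetical and the toy column name CORR simply stands for any correlated systematic.

```python
import numpy as np
import pandas as pd


def construct_covmat_sketch(stat_errors, sys_errors):
    """Hypothetical sketch of the covmat construction described above."""
    is_uncorr = sys_errors.columns.isin(["UNCORR", "THEORYUNCORR"])
    uncorr = sys_errors.loc[:, is_uncorr].to_numpy()
    corr = sys_errors.loc[:, ~is_uncorr].to_numpy()
    # Uncorrelated errors add in quadrature with the statistical error.
    diagonal = stat_errors**2 + (uncorr**2).sum(axis=1)
    # Correlated systematics contribute through S @ S.T.
    return np.diag(diagonal) + corr @ corr.T


stat = np.array([1.0, 2.0])
sys_df = pd.DataFrame(
    [[3.0, 1.0, 0.0], [4.0, 2.0, 2.0]],
    columns=["UNCORR", "CORR", "THEORYUNCORR"],
)
covmat = construct_covmat_sketch(stat, sys_df)
```

With these inputs the diagonal collects 1 + 9 + 0 and 4 + 16 + 4 from the uncorrelated pieces, and the single correlated column adds the outer product of (1, 2) with itself.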
- validphys.covmats_utils.systematics_matrix(stat_errors: array, sys_errors: DataFrame)[source]
Basic function to create a systematics matrix, \(A\), such that:
\[C = A A^T\]
where \(C\) is the covariance matrix. This is achieved by creating a block diagonal matrix by adding the uncorrelated systematics in quadrature then taking the square root, and concatenating the correlated systematics alongside it.
- Parameters
stat_errors (np.array) – a 1-D array of statistical uncertainties
sys_errors (pd.DataFrame) – a dataframe with shape (N_data * N_sys) and systematic name as the column headers. The uncertainties should be in the same units as the data.
Notes
This function doesn’t contain any logic to ignore certain contributions to the covmat. To exclude a particular systematic or set of systematics (e.g. all uncertainties with MULT errors), filter those out of sys_errors before passing it to this function.
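A toy version of the systematics matrix makes the relation \(C = A A^T\) easy to check. The sketch below is an assumption-laden illustration, not the validphys implementation: the function name is hypothetical, and uncorrelated columns are identified by the names UNCORR and THEORYUNCORR as in construct_covmat.

```python
import numpy as np
import pandas as pd


def systematics_matrix_sketch(stat_errors, sys_errors):
    """Hypothetical sketch: diagonal block next to the correlated systematics."""
    is_uncorr = sys_errors.columns.isin(["UNCORR", "THEORYUNCORR"])
    uncorr = sys_errors.loc[:, is_uncorr].to_numpy()
    corr = sys_errors.loc[:, ~is_uncorr].to_numpy()
    # Square root of the quadrature sum of stat and uncorrelated errors.
    diagonal = np.sqrt(stat_errors**2 + (uncorr**2).sum(axis=1))
    # A has shape (N_data, N_data + N_correlated).
    return np.concatenate([np.diag(diagonal), corr], axis=1)


stat = np.array([1.0, 2.0])
sys_df = pd.DataFrame(
    [[3.0, 1.0, 0.0], [4.0, 2.0, 2.0]],
    columns=["UNCORR", "CORR", "THEORYUNCORR"],
)
A = systematics_matrix_sketch(stat, sys_df)
C = A @ A.T
```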
validphys.dataplots module
Plots of relations between data PDFs and fits.
- validphys.dataplots.kde_chi2dist_experiments(total_chi2_data, experiments_chi2_stats, pdf)[source]
KDE plot for experiments chi2.
- validphys.dataplots.plot_chi2dist(dataset, abs_chi2_data, chi2_stats, pdf)[source]
Plot the distribution of chi²s of the members of the pdfset.
- validphys.dataplots.plot_chi2dist_experiments(total_chi2_data, experiments_chi2_stats, pdf)[source]
Plot the distribution of chi²s of the members of the pdfset.
- validphys.dataplots.plot_chi2dist_sv(dataset, abs_chi2_data_thcovmat, pdf)[source]
Same as plot_chi2dist considering also the theory covmat in the calculation.
- validphys.dataplots.plot_dataset_inputs_phi_dist(data, dataset_inputs_bootstrap_phi_data)[source]
Generates a bootstrap distribution of phi and then plots a histogram of the individual bootstrap samples for dataset_inputs. By default the number of bootstrap samples is set to a sensible number (500) however this number can be changed by specifying bootstrap_samples in the runcard
- validphys.dataplots.plot_datasets_chi2(groups_data, groups_chi2)[source]
Plot the chi² of all datasets with bars.
- validphys.dataplots.plot_datasets_chi2_spider(groups_data, groups_chi2)[source]
Plot the chi² of all datasets as a spider plot.
- validphys.dataplots.plot_datasets_pdfs_chi2(data, each_dataset_chi2_pdfs, pdfs)[source]
Plot the chi² of all datasets with bars, and for different pdfs.
- validphys.dataplots.plot_datasets_pdfs_chi2_sv(data, each_dataset_chi2_pdfs_sv, pdfs)[source]
Same as plot_datasets_pdfs_chi2 with the chi²s computed including scale variations.
- validphys.dataplots.plot_dataspecs_datasets_chi2(dataspecs_datasets_chi2_table)[source]
Same as plot_fits_datasets_chi2 but for arbitrary dataspecs
- validphys.dataplots.plot_dataspecs_datasets_chi2_spider(dataspecs_datasets_chi2_table)[source]
Same as plot_fits_datasets_chi2_spider but for arbitrary dataspecs
- validphys.dataplots.plot_dataspecs_groups_chi2(dataspecs_groups_chi2_table, processed_metadata_group)[source]
Same as plot_fits_groups_data_chi2 but for arbitrary dataspecs
- validphys.dataplots.plot_dataspecs_positivity(dataspecs_speclabel, dataspecs_positivity_predictions, dataspecs_posdataset, pos_use_kin=False)[source]
Like plot_positivity() except plots positivity for each element of dataspecs, allowing positivity predictions to be generated with different theory_ids as well as pdfs.
- validphys.dataplots.plot_fancy(one_or_more_results, commondata, cuts, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, use_pdferr: bool = False)[source]
Read the PLOTTING configuration for the dataset and generate the corresponding data theory plot.
The input results are assumed to be such that the first one is the data, and the subsequent ones are the predictions for the PDFs. See one_or_more_results. The labelling of the predictions can be influenced by setting the label attribute of theories and pdfs.
normalize_to: should be either ‘data’, a pdf id or an index of the result (0 for the data, and i for the ith pdf). None means plotting absolute values.
See docs/plotting_format.md for details on the format of the PLOTTING files.
- validphys.dataplots.plot_fancy_dataspecs(dataspecs_results, dataspecs_commondata, dataspecs_cuts, dataspecs_speclabel, normalize_to: (<class 'str'>, <class 'int'>, <class 'NoneType'>) = None, use_pdferr: bool = False)[source]
General interface for data-theory comparison plots.
The user should define an arbitrary list of mappings called “dataspecs”. In each of these, dataset must resolve to a dataset with the same name (but could be e.g. different theories). The production rule matched_datasets_from_dataspecs may be used for this purpose.
The result will be a plot combining all the predictions from the dataspecs mapping (which could vary in theory, pdf, cuts, etc).
The user can define a “speclabel” key in each dataspec (or only on some). By default, the PDF label will be used in the legend (like in plot_fancy).
normalize_to must be either:
- The string ‘data’ or the integer 0 to plot the ratio to data,
- or the 1-based index of the dataspec to normalize to the corresponding prediction,
- or None (default) to plot absolute values.
A limitation at the moment is that the data cuts and errors will be taken from the first specification.
- validphys.dataplots.plot_fancy_sv_dataspecs(dataspecs_results_with_scale_variations, dataspecs_commondata, dataspecs_cuts, dataspecs_speclabel, normalize_to: (<class 'str'>, <class 'int'>, <class 'NoneType'>) = None)[source]
Exactly the same as plot_fancy_dataspecs but the theoretical results passed down are modified so that the 1-sigma error bands correspond to a combination of the PDF error and the scale variations collected over theoryids.
- validphys.dataplots.plot_fits_chi2_spider(fits, fits_groups_chi2, fits_groups_data, processed_metadata_group)[source]
Plots the chi²s of all groups of datasets on a spider/radar diagram.
- validphys.dataplots.plot_fits_datasets_chi2(fits_datasets_chi2_table)[source]
Generate a plot equivalent to plot_datasets_chi2 using all the fitted datasets as input.
- validphys.dataplots.plot_fits_datasets_chi2_spider(fits_datasets_chi2_table)[source]
Generate a plot equivalent to plot_datasets_chi2_spider using all the fitted datasets as input.
- validphys.dataplots.plot_fits_datasets_chi2_spider_bygroup(fits_datasets_chi2_table)[source]
Same as plot_fits_datasets_chi2_spider but one plot for each group.
- validphys.dataplots.plot_fits_groups_data_chi2(fits_groups_chi2_table, processed_metadata_group)[source]
Generate a plot equivalent to plot_groups_data_chi2 using all the fitted groups of data as input.
- validphys.dataplots.plot_fits_groups_data_phi(fits_groups_phi_table, processed_metadata_group)[source]
Plots a set of bars for each fit, each bar represents the value of phi for the corresponding group of datasets, which is defined according to the keys in the PLOTTING info file
- validphys.dataplots.plot_fits_phi_spider(fits, fits_groups_data, fits_groups_data_phi, processed_metadata_group)[source]
Like plot_fits_chi2_spider but for phi.
- validphys.dataplots.plot_groups_data_chi2(groups_data, groups_chi2, processed_metadata_group)[source]
Plot the chi² of all groups of datasets with bars.
- validphys.dataplots.plot_groups_data_chi2_spider(groups_data, groups_chi2, processed_metadata_group, pdf)[source]
Plot the chi² of all groups of datasets as a spider plot.
- validphys.dataplots.plot_groups_data_phi_spider(groups_data, groups_data_phi, processed_metadata_group, pdf)[source]
Plot the phi of all groups of datasets as a spider plot.
- validphys.dataplots.plot_obscorrs(corrpair_datasets, obs_obs_correlations, pdf)[source]
NOTE: EXPERIMENTAL. Plot the correlation matrix between a pair of datasets.
- validphys.dataplots.plot_orbital_momentum(pdf, Q, partial_polarized_sum_rules)[source]
In addition to plotting the correlated spin moments as in plot_polarized_momentum, it also plots the contributions from the Orbital Angular Momentum.
- validphys.dataplots.plot_phi(groups_data, groups_data_phi, processed_metadata_group)[source]
Plots phi for each group of data as a bar for a single PDF input.
See phi_data for information on how phi is calculated.
- validphys.dataplots.plot_phi_scatter_dataspecs(dataspecs_groups, dataspecs_speclabel, dataspecs_groups_bootstrap_phi)[source]
For each of the dataspecs, a bootstrap distribution of phi is generated for all specified groups of datasets. The distribution is then represented as a scatter point which is the median of the bootstrap distribution and an errorbar which spans the 68% confidence interval. By default the number of bootstrap samples is set to a sensible value, however it can be controlled by specifying bootstrap_samples in the runcard.
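The summary drawn for each group (a median point with a 68% errorbar) can be sketched from any array of bootstrap samples. This is a hedged illustration: the function name is hypothetical, the samples are placeholders, and taking the 16th and 84th percentiles as the 68% interval is an assumption about how the interval is defined.

```python
import numpy as np


def median_and_68ci(bootstrap_samples):
    """Hypothetical helper: median and central 68% interval of the samples."""
    # Assumption: the 68% interval runs from the 16th to the 84th percentile.
    lower, median, upper = np.percentile(bootstrap_samples, [16, 50, 84])
    return median, lower, upper


# Placeholder samples; in practice these would be bootstrap values of phi.
samples = np.arange(101, dtype=float)
median, lower, upper = median_and_68ci(samples)

# Asymmetric errorbar lengths in the form matplotlib's errorbar expects.
yerr = np.array([[median - lower], [upper - median]])
```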
- validphys.dataplots.plot_polarized_momentum(pdf, Q, partial_polarized_sum_rules, angular_momentum=False)[source]
Plot the correlated uncertainties for the truncated integrals of the polarized gluon and singlet distributions.
- validphys.dataplots.plot_positivity(pdfs, positivity_predictions_for_pdfs, posdataset, pos_use_kin=False)[source]
Plot an errorbar spanning the central 68% CI of a positivity observable as well as a point indicating the central value (according to the pdf.stats_class.central_value()).
Errorbars and points are plotted on a symlog scale as a function of the data point index (if pos_use_kin==False) or the first kinematic variable (if pos_use_kin==True).
- validphys.dataplots.plot_replica_sum_rules(pdf, sum_rules, Q)[source]
Plot the value of each sum rule as a function of the replica index
- validphys.dataplots.plot_smpdf(pdf, dataset, obs_pdf_correlations, mark_threshold: float = 0.9)[source]
Plot the correlations between the change in the observable and the change in the PDF in (x,fl) space.
mark_threshold is the proportion of the maximum absolute correlation that will be used to mark the corresponding area in x in the background of the plot. The maximum absolute values are used for the comparison.
Examples
>>> from validphys.api import API
>>> data_input = {
...     "dataset_input": {"dataset": "HERACOMBNCEP920"},
...     "theoryid": 200,
...     "use_cuts": "internal",
...     "pdf": "NNPDF40_nnlo_as_01180",
...     "Q": 1.6,
...     "mark_threshold": 0.2,
... }
>>> smpdf_gen = API.plot_smpdf(**data_input)
>>> fig = next(smpdf_gen)
>>> fig.show()
- validphys.dataplots.plot_training_length(replica_data, fit)[source]
Generate a histogram for the distribution of training lengths in a given fit. Each bin is normalised by the total number of replicas.
- validphys.dataplots.plot_training_validation(fit, replica_data, replica_filters=None)[source]
Scatter plot with the training and validation chi² for each replica in the fit. The mean is also displayed as well as a line y=x to easily identify whether training or validation chi² is larger.
- validphys.dataplots.plot_trainvaliddist(fit, replica_data)[source]
KDEs for the training and validation distributions for each replica in the fit.
- validphys.dataplots.plot_xq2(dataset_inputs_by_groups_xq2map, use_cuts, data_input, display_cuts: bool = True, marker_by: str = 'process type', highlight_label: str = 'highlight', highlight_datasets: (<class 'collections.abc.Sequence'>, <class 'NoneType'>) = None, aspect: str = 'landscape')[source]
Plot the (x,Q²) coverage of the data based on some LO approximations. These are governed by the relevant kintransform.
The representation of the filtered data depends on the display_cuts and use_cuts options:
- If cuts are disabled (use_cuts is CutsPolicy.NOCUTS), all the data will be plotted (and setting display_cuts to True is an error).
- If cuts are enabled (use_cuts is either CutsPolicy.FROMFIT or CutsPolicy.INTERNAL) and display_cuts is False, the masked points will be ignored.
- If cuts are enabled and display_cuts is True, the filtered points will be displayed and marked.
The points are grouped according to the marker_by option. The possible values are: “process type”, “experiment”, “group” or “dataset”.
Some datasets can be made to appear highlighted in the figure: Define a key called highlight_datasets containing the names of the datasets to be highlighted and a key highlight_label with a string containing the label of the highlight, which will appear in the legend.
Example
Obtain a plot with some reasonable defaults:
from validphys.api import API
inp = {
    'dataset_inputs': [
        {'dataset': 'NMCPD_dw'},
        {'dataset': 'NMC'},
        {'dataset': 'SLACP_dwsh'},
        {'dataset': 'SLACD_dw'},
        {'dataset': 'BCDMSP_dwsh'},
        {'dataset': 'BCDMSD_dw'},
        {'dataset': 'CHORUSNUPb_dw'},
        {'dataset': 'CHORUSNBPb_dw'},
        {'dataset': 'NTVNUDMNFe_dw', 'cfac': ['MAS']},
        {'dataset': 'NTVNBDMNFe_dw', 'cfac': ['MAS']},
        {'dataset': 'HERACOMBNCEM'},
        {'dataset': 'HERACOMBNCEP460'},
        {'dataset': 'HERACOMBNCEP575'},
        {'dataset': 'HERACOMBNCEP820'},
        {'dataset': 'HERACOMBNCEP920'},
        {'dataset': 'HERACOMBCCEM'},
        {'dataset': 'HERACOMBCCEP'},
        {'dataset': 'HERACOMB_SIGMARED_C'},
        {'dataset': 'HERACOMB_SIGMARED_B'},
        {'dataset': 'DYE886R_dw'},
        {'dataset': 'DYE886P', 'cfac': ['QCD']},
        {'dataset': 'DYE605_dw', 'cfac': ['QCD']},
        {'dataset': 'CDFZRAP_NEW', 'cfac': ['QCD']},
        {'dataset': 'D0ZRAP', 'cfac': ['QCD']},
        {'dataset': 'D0WMASY', 'cfac': ['QCD']},
        {'dataset': 'ATLASWZRAP36PB', 'cfac': ['QCD']},
        {'dataset': 'ATLASZHIGHMASS49FB', 'cfac': ['QCD']},
        {'dataset': 'ATLASLOMASSDY11EXT', 'cfac': ['QCD']},
        {'dataset': 'ATLASWZRAP11CC', 'cfac': ['QCD']},
        {'dataset': 'ATLASWZRAP11CF', 'cfac': ['QCD']},
        {'dataset': 'ATLASDY2D8TEV', 'cfac': ['QCDEWK']},
        {'dataset': 'ATLAS_WZ_TOT_13TEV', 'cfac': ['NRM', 'QCD']},
        {'dataset': 'ATLAS_WP_JET_8TEV_PT', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_WM_JET_8TEV_PT', 'cfac': ['QCD']},
        {'dataset': 'ATLASZPT8TEVMDIST', 'cfac': ['QCD'], 'sys': 10},
        {'dataset': 'ATLASZPT8TEVYDIST', 'cfac': ['QCD'], 'sys': 10},
        {'dataset': 'ATLASTTBARTOT', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_TTB_DIFF_8TEV_LJ_TRAPNORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_TTB_DIFF_8TEV_LJ_TTRAPNORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_TOPDIFF_DILEPT_8TEV_TTRAPNORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_1JET_8TEV_R06_DEC', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_2JET_7TEV_R06', 'cfac': ['QCD']},
        {'dataset': 'ATLASPHT15', 'cfac': ['QCD', 'EWK']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_R_7TEV', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_R_13TEV', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP_NORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP_NORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP_NORM', 'cfac': ['QCD']},
        {'dataset': 'ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP_NORM', 'cfac': ['QCD']},
        {'dataset': 'CMSWEASY840PB', 'cfac': ['QCD']},
        {'dataset': 'CMSWMASY47FB', 'cfac': ['QCD']},
        {'dataset': 'CMSDY2D11', 'cfac': ['QCD']},
        {'dataset': 'CMSWMU8TEV', 'cfac': ['QCD']},
        {'dataset': 'CMSZDIFF12', 'cfac': ['QCD', 'NRM'], 'sys': 10},
        {'dataset': 'CMS_2JET_7TEV', 'cfac': ['QCD']},
        {'dataset': 'CMS_2JET_3D_8TEV', 'cfac': ['QCD']},
        {'dataset': 'CMSTTBARTOT', 'cfac': ['QCD']},
        {'dataset': 'CMSTOPDIFF8TEVTTRAPNORM', 'cfac': ['QCD']},
        {'dataset': 'CMSTTBARTOT5TEV', 'cfac': ['QCD']},
        {'dataset': 'CMS_TTBAR_2D_DIFF_MTT_TRAP_NORM', 'cfac': ['QCD']},
        {'dataset': 'CMS_TTB_DIFF_13TEV_2016_2L_TRAP', 'cfac': ['QCD']},
        {'dataset': 'CMS_TTB_DIFF_13TEV_2016_LJ_TRAP', 'cfac': ['QCD']},
        {'dataset': 'CMS_SINGLETOP_TCH_TOT_7TEV', 'cfac': ['QCD']},
        {'dataset': 'CMS_SINGLETOP_TCH_R_8TEV', 'cfac': ['QCD']},
        {'dataset': 'CMS_SINGLETOP_TCH_R_13TEV', 'cfac': ['QCD']},
        {'dataset': 'LHCBZ940PB', 'cfac': ['QCD']},
        {'dataset': 'LHCBZEE2FB', 'cfac': ['QCD']},
        {'dataset': 'LHCBWZMU7TEV', 'cfac': ['NRM', 'QCD']},
        {'dataset': 'LHCBWZMU8TEV', 'cfac': ['NRM', 'QCD']},
        {'dataset': 'LHCB_Z_13TEV_DIMUON', 'cfac': ['QCD']},
        {'dataset': 'LHCB_Z_13TEV_DIELECTRON', 'cfac': ['QCD']},
    ],
    'use_cuts': 'internal',
    'display_cuts': False,
    'theoryid': 162,
    'highlight_label': 'Old',
    'highlight_datasets': ['NMC', 'CHORUSNUPb_dw', 'CHORUSNBPb_dw'],
}
API.plot_xq2(**inp)
validphys.deltachi2 module
deltachi2.py
Plots and data processing that can be used in a delta chi2 analysis
- class validphys.deltachi2.PDFEpsilonPlotter(pdfs, xplotting_grids, xscale, normalize_to, ymin, ymax)[source]
Bases: PDFPlotter
Subclassing PDFPlotter in order to plot epsilon (measure of Gaussianity) for multiple PDFs, yielding a separate figure for each flavour.
- validphys.deltachi2.check_pdf_is_symmhessian(pdf, **kwargs)[source]
Check pdf has error type of symmhessian.
- validphys.deltachi2.check_pdfs_are_montecarlo(pdfs, **kwargs)[source]
Checks that the action is applied only to a pdf consisting of MC replicas.
- validphys.deltachi2.delta_chi2_hessian(pdf, total_chi2_data)[source]
Return delta_chi2 (computed as in plot_delta_chi2_hessian) relative to each eigenvector of the Hessian set.
- validphys.deltachi2.plot_delta_chi2_hessian_distribution(delta_chi2_hessian, pdf, total_chi2_data)[source]
Plot of the chi2 difference between chi2 of each eigenvector of a symmHessian set and the central value for all experiments in a fit. As a function of every eigenvector in a first plot, and as a distribution in a second plot.
- validphys.deltachi2.plot_delta_chi2_hessian_eigenv(delta_chi2_hessian, pdf)[source]
Plot of the chi2 difference between chi2 of each eigenvector of a symmHessian set and the central value for all experiments in a fit. As a function of every eigenvector in a first plot, and as a distribution in a second plot.
- validphys.deltachi2.plot_epsilon(pdfs, xplotting_grids, xscale: (<class 'str'>, <class 'NoneType'>) = None, ymin=None, ymax=None, eps=None)[source]
Plot the discrepancy (epsilon) of the 1-sigma and 68% bands at each grid value for all pdfs for a given Q. See https://arxiv.org/abs/1505.06736 eq. (11)
xscale is read from pdf plotting_grid scale, which is ‘log’ by default.
eps defines the value at which to plot a horizontal line.
- validphys.deltachi2.plot_kullback_leibler(delta_chi2_hessian)[source]
Determines the Kullback–Leibler divergence by comparing the expectation value of Delta chi2 to the cumulative distribution function of chi-square distribution with one degree of freedom (see: https://en.wikipedia.org/wiki/Chi-square_distribution).
The Kullback-Leibler divergence provides a measure of the difference between two distribution functions, here we compare the chi-squared distribution and the cumulative distribution of the expectation value of Delta chi2.
- validphys.deltachi2.plot_pos_neg_pdfs(pdf, pos_neg_xplotting_grids, xscale: (<class 'str'>, <class 'NoneType'>) = None, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, ymin=None, ymax=None, pdfs_noband: (<class 'list'>, <class 'NoneType'>) = None)[source]
Plot the uncertainty of the original hessian pdfs, as well as that of the positive and negative subset.
validphys.eff_exponents module
Tools for computing and plotting effective exponents.
- class validphys.eff_exponents.ExponentBandPlotter(hlines, exponent, *args, **kwargs)[source]
Bases: BandPDFPlotter, PreprocessingPlotter
- draw(pdf, grid, flstate)[source]
Overload BandPDFPlotter.draw() to plot bands of the effective exponent calculated from the replicas and horizontal lines for the effective exponents of the previous/next fits, if possible.
flstate is an element of the flavours for the first pdf specified in pdfs. If this flavour doesn’t exist in the current pdf’s fitbasis or the set of flavours for which the preprocessing exponents exist for the current pdf, no horizontal lines are plotted.
- class validphys.eff_exponents.PreprocessingPlotter(exponent, *args, **kwargs)[source]
Bases: PDFPlotter
Class inheriting from PDFPlotter, changing title and ylabel to reflect the effective exponent being plotted.
- validphys.eff_exponents.alpha_eff(pdf: ~validphys.core.PDF, *, xmin: ~numbers.Real = 1e-06, xmax: ~numbers.Real = 0.001, npoints: int = 200, Q: ~numbers.Real = 1.65, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Return a list of xplotting_grids containing the value of the effective exponent alpha at the specified values of x and flavour. alpha is relevant at small x, hence the logarithmic scale.
basis: Is one of the bases defined in pdfbases.py. This includes ‘flavour’ and ‘evolution’.
flavours: A set of elements from the basis. If None, the defaults for that basis will be selected.
Q: The PDF scale in GeV.
- validphys.eff_exponents.beta_eff(pdf, *, xmin: ~numbers.Real = 0.6, xmax: ~numbers.Real = 0.9, npoints: int = 200, Q: ~numbers.Real = 1.65, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Return a list of xplotting_grids containing the value of the effective exponent beta at the specified values of x and flavour. beta is relevant at large x, hence the linear scale.
basis: Is one of the bases defined in pdfbases.py. This includes ‘flavour’ and ‘evolution’.
flavours: A set of elements from the basis. If None, the defaults for that basis will be selected.
Q: The PDF scale in GeV.
- validphys.eff_exponents.effective_exponents_table_internal(next_effective_exponents_table, *, fit=None, basis)[source]
Returns a table which concatenates previous_effective_exponents_table and next_effective_exponents_table if both tables contain effective exponents in the same basis.
If the previous exponents are in a different basis, or no fit was given to read the previous exponents from, then only the next exponents table is returned, for plotting purposes.
- validphys.eff_exponents.fmt(a)
- validphys.eff_exponents.get_alpha_lines(effective_exponents_table_internal)[source]
Given an effective_exponents_table_internal returns the rows with bounds of the alpha effective exponent for all flavours, used to plot horizontal lines on the alpha effective exponent plots.
- validphys.eff_exponents.get_beta_lines(effective_exponents_table_internal)[source]
Same as get_alpha_lines but for beta
- validphys.eff_exponents.iterate_preprocessing_yaml(fit, next_fit_eff_exps_table, _flmap_np_clip_arg=None)[source]
Using next_effective_exponents_table() update the preprocessing exponents of the input fit. This is part of the usual pipeline referred to as “iterating a fit”, for more information see: How to run an iterated fit. A fully iterated runcard can be obtained from the action iterated_runcard_yaml().
This action can be used in a report but should be wrapped in a code block to be formatted correctly, for example:
`yaml {@iterate_preprocessing_yaml@} `
Alternatively, using the API, the yaml dump returned by this function can be written to a file, e.g.
>>> from validphys.api import API
>>> yaml_output = API.iterate_preprocessing_yaml(fit=<fit name>)
>>> with open("output.yml", "w+") as f:
...     f.write(yaml_output)
- Parameters
fit (validphys.core.FitSpec) – Whose preprocessing range will be iterated, the output runcard will be the same as the one used to run this fit, except with new preprocessing range.
next_fit_eff_exps_table (pd.DataFrame) – Table outputted by next_fit_eff_exps_table() containing the next preprocessing ranges.
_flmap_np_clip_arg (dict) – Internal argument used by vp-nextfitruncard. Dictionary containing a mapping like {<flavour>: {<largex/smallx>: {a_min: <min value>, a_max: <max value>}}}. If a flavour is present in _flmap_np_clip_arg then the preprocessing ranges will be passed through np.clip with the arguments supplied in the mapping.
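The effect of such a mapping on a preprocessing range can be illustrated with np.clip directly. This is a sketch only: the helper name, the flavour labels and the numerical values are hypothetical, and the way validphys looks up and applies the mapping internally may differ.

```python
import numpy as np

# Hypothetical mapping in the documented shape:
# {<flavour>: {<largex/smallx>: {a_min: ..., a_max: ...}}}
clip_map = {"g": {"smallx": {"a_min": 0.0, "a_max": 1.5}}}


def clip_range(flavour, exponent, bounds, clip_map):
    """Hypothetical helper: clip [min, max] if the flavour is in the mapping."""
    args = clip_map.get(flavour, {}).get(exponent)
    if args is None:
        # Flavour (or exponent) not present: leave the range untouched.
        return list(bounds)
    return np.clip(bounds, args["a_min"], args["a_max"]).tolist()


clipped = clip_range("g", "smallx", [-0.3, 2.1], clip_map)
untouched = clip_range("u", "smallx", [-0.3, 2.1], clip_map)
```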
- validphys.eff_exponents.iterated_runcard_yaml(fit, update_runcard_description_yaml)[source]
Takes the runcard with preprocessing iterated and description updated, then:
- Updates the t0 pdf set to be fit
- Modifies the random seeds (to random unsigned long ints)
This should facilitate running a new fit with identical input settings as the specified fit with the t0, seeds and preprocessing iterated. For more information see: How to run an iterated fit.
This action can be used in a report but should be wrapped in a code block to be formatted correctly, for example:
`yaml {@iterated_runcard_yaml@} `
Alternatively, using the API, the yaml dump returned by this function can be written to a file, e.g.
>>> from validphys.api import API
>>> yaml_output = API.iterated_runcard_yaml(
...     fit=<fit name>,
...     _updated_description="My iterated fit"
... )
>>> with open("output.yml", "w+") as f:
...     f.write(yaml_output)
- validphys.eff_exponents.next_effective_exponents_table(pdf: ~validphys.core.PDF, *, fitq0fromfit: (<class 'numbers.Real'>, <class 'NoneType'>) = None, x1_alpha: ~numbers.Real = 1e-06, x2_alpha: ~numbers.Real = 0.001, x1_beta: ~numbers.Real = 0.65, x2_beta: ~numbers.Real = 0.95, basis: (<class 'str'>, <class 'validphys.pdfbases.Basis'>), flavours: (<class 'list'>, <class 'tuple'>, <class 'NoneType'>) = None)[source]
Given a PDF, calculate the next effective exponents
By default x1_alpha = 1e-6, x2_alpha = 1e-3, x1_beta = 0.65, and x2_beta = 0.95, but different values can be specified in the runcard. The values control where the bounds of alpha and beta are evaluated:
- alpha_min:
  - singlet/gluon: the 2x68% c.l. lower value evaluated at x=`x1_alpha`
  - others: min(2x68% c.l. lower value evaluated at x=`x1_alpha` and x=`x2_alpha`)
- alpha_max:
  - singlet/gluon: min(2 and the 2x68% c.l. upper value evaluated at x=`x1_alpha`)
  - others: min(2 and max(2x68% c.l. upper value evaluated at x=`x1_alpha` and x=`x2_alpha`))
- beta_min: max(0 and min(2x68% c.l. lower value evaluated at x=`x1_beta` and x=`x2_beta`))
- beta_max: max(2x68% c.l. upper value evaluated at x=`x1_beta` and x=`x2_beta`)
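The min/max rules above can be written out as two small functions. This sketch only encodes the selection logic: the function names and arguments are hypothetical, and the 2x68% c.l. lower and upper values at each x are taken as given inputs rather than computed from replicas.

```python
def alpha_bounds(lower_x1, lower_x2, upper_x1, upper_x2, singlet_or_gluon):
    """Hypothetical helper implementing the alpha_min/alpha_max rules."""
    if singlet_or_gluon:
        # Singlet and gluon only use the values at x1_alpha.
        alpha_min = lower_x1
        alpha_max = min(2, upper_x1)
    else:
        alpha_min = min(lower_x1, lower_x2)
        alpha_max = min(2, max(upper_x1, upper_x2))
    return alpha_min, alpha_max


def beta_bounds(lower_x1, lower_x2, upper_x1, upper_x2):
    """Hypothetical helper implementing the beta_min/beta_max rules."""
    beta_min = max(0, min(lower_x1, lower_x2))
    beta_max = max(upper_x1, upper_x2)
    return beta_min, beta_max


# Toy inputs standing in for the 2x68% c.l. values at x1 and x2.
gluon_alpha = alpha_bounds(-0.2, -0.5, 1.2, 1.6, singlet_or_gluon=True)
other_alpha = alpha_bounds(-0.2, -0.5, 1.2, 1.6, singlet_or_gluon=False)
capped_alpha = alpha_bounds(-0.2, -0.5, 2.5, 1.6, singlet_or_gluon=False)
beta = beta_bounds(-0.3, 0.4, 3.0, 4.5)
```

Note that alpha_max is capped at 2 and beta_min is floored at 0, whichever x value supplies the bound.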
- validphys.eff_exponents.plot_alpha_eff(fits_pdf, alpha_eff_fits, fits_alpha_lines, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, ybottom=None, ytop=None)[source]
Plot the central value and the uncertainty of a list of effective exponents as a function of x for a given value of Q. If normalize_to is given, plot the ratios to the corresponding alpha effective. Otherwise, plot absolute values. See the help for xplotting_grid for information on how to set basis, flavours and x ranges. Yields one figure per PDF flavour.
normalize_to: Either the name of one of the alpha effective or its corresponding index in the list, starting from one, or None to plot absolute values.
xscale: One of the matplotlib allowed scales. If undefined, it will be set based on the scale in xgrid, which should be used instead.
- validphys.eff_exponents.plot_alpha_eff_internal(pdfs, alpha_eff_pdfs, pdfs_alpha_lines, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, ybottom=None, ytop=None)[source]
Plot the central value and the uncertainty of a list of effective exponents as a function of x for a given value of Q. If normalize_to is given, plot the ratios to the corresponding alpha effective. Otherwise, plot absolute values. See the help for xplotting_grid for information on how to set basis, flavours and x ranges. Yields one figure per PDF flavour.
normalize_to: Either the name of one of the alpha effective or its corresponding index in the list, starting from one, or None to plot absolute values.
- validphys.eff_exponents.plot_beta_eff(fits_pdf, beta_eff_fits, fits_beta_lines, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, ybottom=None, ytop=None)[source]
Same as plot_alpha_eff but for beta effective exponents
- validphys.eff_exponents.plot_beta_eff_internal(pdfs, beta_eff_pdfs, pdfs_beta_lines, normalize_to: (<class 'int'>, <class 'str'>, <class 'NoneType'>) = None, ybottom=None, ytop=None)[source]
Same as plot_alpha_eff_internal but for beta effective exponents.
- validphys.eff_exponents.previous_effective_exponents(basis: str, fit: (<class 'validphys.core.FitSpec'>, <class 'NoneType'>) = None)[source]
If provided with a fit, check that the basis is the basis which was fitted; if so, return the previous effective exponents read from the fit runcard.
- validphys.eff_exponents.previous_effective_exponents_table(fit: FitSpec)[source]
Given a fit, reads the previous exponents from the fit runcard
- validphys.eff_exponents.update_runcard_description_yaml(iterate_preprocessing_yaml, _updated_description=None)[source]
Take the runcard with iterated preprocessing and update the description if _updated_description is provided. As with iterate_preprocessing_yaml(), the result can be used in a report but should be wrapped in a code block to be formatted correctly, for example: `yaml {@update_runcard_description_yaml@} `
validphys.filters module
Filters for NNPDF fits
- class validphys.filters.AddedFilterRule(dataset: Optional[str] = None, process_type: Optional[str] = None, rule: Optional[str] = None, reason: Optional[str] = None, local_variables: Optional[Mapping[str, Union[str, float]]] = None, PTO: Optional[str] = None, FNS: Optional[str] = None, IC: Optional[str] = None)[source]
Bases:
FilterRule
Dataclass which carries an extra filter rule that is added to the default rules.
- exception validphys.filters.BadPerturbativeOrder[source]
Bases:
ValueError
Exception raised when the perturbative order string is not recognized.
- exception validphys.filters.FatalRuleError[source]
Bases:
Exception
Exception raised when a rule application failed at runtime.
- class validphys.filters.FilterDefaults(q2min: Optional[float] = None, w2min: Optional[float] = None, maxTau: Optional[float] = None)[source]
Bases:
object
Dataclass carrying default values for filters (cuts) taking into account the values of q2min, w2min and maxTau.
- class validphys.filters.FilterRule(dataset: Optional[str] = None, process_type: Optional[str] = None, rule: Optional[str] = None, reason: Optional[str] = None, local_variables: Optional[Mapping[str, Union[str, float]]] = None, PTO: Optional[str] = None, FNS: Optional[str] = None, IC: Optional[str] = None)[source]
Bases:
object
Dataclass which carries the filter rule information.
- exception validphys.filters.MissingRuleAttribute[source]
Bases:
RuleProcessingError, AttributeError
Exception raised when a rule is missing required attributes.
- class validphys.filters.PerturbativeOrder(string)[source]
Bases:
object
Class that conveniently handles perturbative order declarations for use within the Rule class filter.
- Parameters
string (str) –
A string in the format of NNLO or equivalently N2LO. This can be followed by one of ! + - or none.
The syntax allows for rules to be executed only if the perturbative order is within a given range. The following enumerates all 4 cases as an example:
NNLO+ only execute the following rule if the pto is 2 or greater
NNLO- only execute the following rule if the pto is strictly less than 2
NNLO! only execute the following rule if the pto is strictly not 2
NNLO only execute the following rule if the pto is exactly 2
Any unrecognized string will raise a BadPerturbativeOrder exception.
Example
>>> from validphys.filters import PerturbativeOrder
>>> pto = PerturbativeOrder("NNLO+")
>>> pto.numeric_pto
2
>>> 1 in pto
False
>>> 2 in pto
True
>>> 3 in pto
True
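The four cases above can be reproduced without validphys installed. The following is a minimal sketch of the described syntax, not the library's implementation: the class name `PtoRangeSketch` and the regular expression are assumptions made for illustration, and a plain `ValueError` stands in for `BadPerturbativeOrder`.

```python
import re

# Hypothetical pattern: either 'N<digits>LO' or a run of N's followed by 'LO',
# optionally followed by one of the modifiers '!', '+' or '-'.
_PTO_PATTERN = re.compile(r"^(?:N(\d+)LO|(N*)LO)([!+-]?)$")


class PtoRangeSketch:
    """Illustrative stand-in for the membership semantics described above."""

    def __init__(self, string):
        match = _PTO_PATTERN.match(string)
        if match is None:
            # The real class is documented to raise BadPerturbativeOrder.
            raise ValueError(f"Unrecognised perturbative order: {string!r}")
        digits, leading_ns, modifier = match.groups()
        # 'N2LO' carries the order as a digit, 'NNLO' as a count of leading N's.
        self.numeric_pto = int(digits) if digits is not None else len(leading_ns)
        self.modifier = modifier

    def __contains__(self, pto):
        if self.modifier == "+":
            return pto >= self.numeric_pto
        if self.modifier == "-":
            return pto < self.numeric_pto
        if self.modifier == "!":
            return pto != self.numeric_pto
        return pto == self.numeric_pto


if __name__ == "__main__":
    for spec in ("NNLO+", "N2LO-", "NNLO!", "NNLO"):
        accepted = [order for order in range(4) if order in PtoRangeSketch(spec)]
        print(spec, accepted)
```

Implementing `__contains__` is what makes the `2 in pto` idiom from the doctest work; each modifier maps to exactly one comparison.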
- class validphys.filters.Rule(initial_data: FilterRule, *, defaults: dict, theory_parameters: dict, loader=None)[source]
Bases:
object
Rule object to be used to generate cuts mask.
A rule object is created for each rule in ./cuts/filters.yaml
Old commondata relied on the order of the kinematical variables to be the same as specified in the KIN_LABEL dictionary set in this module. The new commondata specification instead defines explicitly the name of the variables in the metadata. Therefore, when using a new-format commondata, the KIN_LABEL dictionary will not be used and the variables defined in the metadata will be used instead.
- Parameters
initial_data (dict) –
A dictionary containing all the information regarding the rule. This contains the name of the dataset the rule applies to and/or the process type the rule applies to. Additionally, the rule itself is defined, alongside the reason the rule is used. Finally, the user can optionally define their own custom local variables.
By default these are defined in cuts/filters.yaml
defaults (dict) –
A dictionary containing default values to be used globally in all rules.
By default these are defined in cuts/defaults.yaml
theory_parameters – Dict containing pairs of (theory_parameter, value)
loader (validphys.loader.Loader, optional) – A loader instance used to retrieve the datasets.
- numpy_functions = {'fabs': <ufunc 'fabs'>, 'log': <ufunc 'log'>, 'sqrt': <ufunc 'sqrt'>}
- exception validphys.filters.RuleProcessingError[source]
Bases:
Exception
Exception raised when we couldn’t process a rule.
- validphys.filters.check_additional_errors(additional_errors)[source]
Lux additional errors pdf check
- validphys.filters.check_integrability(integdatasets)[source]
Verify integrability datasets are ready for the fit.
- validphys.filters.check_positivity(posdatasets)[source]
Verify positive datasets are ready for the fit.
- validphys.filters.check_unpolarized_bc(unpolarized_bc)[source]
Check that unpolarized PDF bound can be loaded normally.
- validphys.filters.default_filter_rules_input()[source]
Return a tuple of FilterRule objects. These are defined in filters.yaml in the validphys.cuts module.
- validphys.filters.default_filter_settings_input()[source]
Return a FilterDefaults dataclass with the default hardcoded filter settings. These are defined in defaults.yaml in the validphys.cuts module.
- validphys.filters.filter_closure_data(filter_path, data, fakepdf, fakenoise, filterseed, sep_mult)[source]
Filter closure data. In addition to cutting data points, the data is generated from an underlying fakepdf, applying a shift to the data if fakenoise is True, which emulates the experimental central values being shifted away from the underlying law.
- validphys.filters.filter_closure_data_by_experiment(filter_path, experiments_data, fakepdf, fakenoise, filterseed, data_index, sep_mult)[source]
Like filter_closure_data() except filters data by experiment.
This function just performs a for loop over experiments; the reason we don’t use reportengine.collect is that it can permute the order in which closure data is generated, which means that the pseudodata is not reproducible.
- validphys.filters.filter_real_data(filter_path, data)[source]
Filter real data, cutting any points which do not pass the filter rules.
- validphys.filters.get_cuts_for_dataset(commondata, rules) → list[source]
Function to generate a list containing the index of all experimental points that passed kinematic cut rules stored in ./cuts/filters.yaml
- Parameters
commondata (validphys.coredata.CommonData) –
rules (List[Rule]) – A list of Rule objects specifying the filters.
- Returns
mask – List object containing index of all passed experimental values
- Return type
list
Example
>>> from validphys.filters import (get_cuts_for_dataset, Rule,
...     default_filter_settings, default_filter_rules_input)
>>> from validphys.loader import Loader
>>> l = Loader()
>>> cd = l.check_commondata("NMC")
>>> theory = l.check_theoryID(53)
>>> filter_defaults = default_filter_settings()
>>> params = theory.get_description()
>>> rule_list = [Rule(initial_data=i, defaults=filter_defaults, theory_parameters=params)
...     for i in default_filter_rules_input()]
>>> get_cuts_for_dataset(cd, rules=rule_list)
validphys.fitdata module
Utilities for loading data from fit folders
- class validphys.fitdata.DatasetComp(common, first_only, second_only)
Bases:
tuple
- common
Alias for field number 0
- first_only
Alias for field number 1
- second_only
Alias for field number 2
- class validphys.fitdata.FitInfo(nite, training, validation, chi2, is_positive, arclengths, integnumbers)
Bases:
tuple
- arclengths
Alias for field number 5
- chi2
Alias for field number 3
- integnumbers
Alias for field number 6
- is_positive
Alias for field number 4
- nite
Alias for field number 0
- training
Alias for field number 1
- validation
Alias for field number 2
- validphys.fitdata.check_lhapdf_info(results_dir, fitname)[source]
Check that an LHAPDF info metadata file is present in the fit results
- validphys.fitdata.check_nnfit_results_path(path)[source]
Returns True if the requested path is a valid results directory, i.e. if it is a directory and has a ‘nnfit’ subdirectory
- validphys.fitdata.check_replica_files(replica_path, prefix)[source]
Verification of a replica results directory at replica_path for a fit named prefix. Returns True if the results directory is complete
- validphys.fitdata.datasets_properties_table(data_input)[source]
Return dataset properties for each dataset in
data_input
- validphys.fitdata.fit_code_version(fit)[source]
Returns table with the code version from replica_1/{fitname}.json files. Note that the version for tensorflow distinguishes between the mkl=on and off versions.
- validphys.fitdata.fit_datasets_properties_table(fitinputcontext)[source]
Returns table of dataset properties for each dataset used in a fit.
- validphys.fitdata.fit_summary(fit_name_with_covmat_label, replica_data, total_chi2_data, total_phi_data)[source]
Summary table of fit properties
- Central chi-squared
- Average chi-squared
- Training and Validation error functions
- Training lengths
- Phi
Note: Chi-squared values from the replica_data are not used here (presumably they are fixed to being t0)
This uses a corrected form for the error on phi in comparison to the vp1 value. The error is propagated from the uncertainty on the average chi-squared only.
- validphys.fitdata.fit_theory_covmat_summary(fit, fitthcovmat)[source]
Returns a table with a single column for the fit, with three rows indicating if the theory covariance matrix was used in the ‘sampling’ of the pseudodata, the ‘fitting’, and the ‘validphys statistical estimators’ in the current namespace for that fit.
Return a table with the same columns as replica_data indexed by the replica fit ID. For identical fits, the values across rows should be the same.
If some replica ID is not present for a given fit (e.g. discarded by postfit), the corresponding entries in the table will be null.
- validphys.fitdata.fits_version_table(fits_fit_code_version)[source]
Produces a table of version information for multiple fits.
- validphys.fitdata.load_fitinfo(replica_path, prefix)[source]
Process the data in the .json file for a single replica into a FitInfo object. If the .json file does not exist an old-format fit is assumed and old_load_fitinfo will be called instead.
- validphys.fitdata.match_datasets_by_name(fits, fits_datasets)[source]
Return a tuple with common, first_only and second_only. The elements of the tuple are mappings where the keys are dataset names and the values are the two datasets contained in each fit for common, and the corresponding dataset included only in the first fit and only in the second fit.
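As an illustration of the return structure described above, here is a minimal sketch using plain named tuples. `DatasetCompSketch`, `FakeDataset` and `match_by_name_sketch` are hypothetical names introduced for this example; the real function receives fits and their datasets through the validphys machinery.

```python
from collections import namedtuple

# Hypothetical stand-ins: the field names mirror DatasetComp as documented above.
DatasetCompSketch = namedtuple(
    "DatasetCompSketch", ("common", "first_only", "second_only")
)
FakeDataset = namedtuple("FakeDataset", ("name", "cuts"))


def match_by_name_sketch(first_datasets, second_datasets):
    """Group two collections of datasets by name."""
    first = {ds.name: ds for ds in first_datasets}
    second = {ds.name: ds for ds in second_datasets}
    # Names present in both fits map to a (first, second) pair of datasets.
    common = {
        name: (first[name], second[name])
        for name in sorted(first.keys() & second.keys())
    }
    # Names present in only one fit map to the single dataset found there.
    first_only = {name: first[name] for name in sorted(first.keys() - second.keys())}
    second_only = {name: second[name] for name in sorted(second.keys() - first.keys())}
    return DatasetCompSketch(common, first_only, second_only)


if __name__ == "__main__":
    fit_a = [FakeDataset("NMC", "internal"), FakeDataset("SLACP", "internal")]
    fit_b = [FakeDataset("NMC", "nocuts"), FakeDataset("BCDMSP", "internal")]
    result = match_by_name_sketch(fit_a, fit_b)
    print(sorted(result.common), sorted(result.first_only), sorted(result.second_only))
```

Keeping both datasets for the common names is what allows a later step to compare their cuts pairwise.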
- validphys.fitdata.num_fitted_replicas(fit)[source]
Function to obtain the number of nnfit replicas. That is the number of replicas before postfit was run.
- validphys.fitdata.print_dataset_differences(fits, match_datasets_by_name, print_common: bool = True)[source]
Given exactly two fits, print the datasets that are included in one but not in the other. If print_common is True, also print the datasets that are common.
As a visual aid, everything is ordered by dataset name, which, given the commondata naming convention, means that everything is ordered by:
Experiment name
Process
Energy
- validphys.fitdata.print_different_cuts(fits, test_for_same_cuts)[source]
Print a summary of the datasets that are included in both fits but have different cuts.
- validphys.fitdata.print_systype_overlap(groups_commondata, group_dataset_inputs_by_metadata)[source]
Returns a set of systypes that overlap between groups. Discards the set of systypes which overlap but do not imply correlations
- validphys.fitdata.replica_data(fit, replica_paths)[source]
Load the necessary data from the .json file of each of the replicas. The corresponding PDF set must be installed in the LHAPDF path.
The included information is:
(‘nite’, ‘training’, ‘validation’, ‘chi2’, ‘pos_status’, ‘arclengths’)
- validphys.fitdata.summarise_fits(collected_fit_summaries)[source]
Produces a table of basic comparisons between fits, includes all the fields used in fit_summary
- validphys.fitdata.summarise_theory_covmat_fits(fits_theory_covmat_summary)[source]
Collects the theory covmat summary for all fits and concatenates them into a single table
- validphys.fitdata.t0_chi2_info_table(pdf, dataset_inputs_abs_chi2_data, t0pdfset, use_t0)[source]
Provides table with
- t0pdfset name
- Central t0-chi-squared
- Average t0-chi-squared
- validphys.fitdata.test_for_same_cuts(fits, match_datasets_by_name)[source]
Given two fits, return a list of tuples (first, second) where first and second are DatasetSpecs that correspond to the same dataset but have different cuts, such that first is included in the first fit and second in the second.
validphys.fitveto module
fitveto.py
Module for the determination of passing fit replicas.
- Current active vetoes:
Positivity - Replicas with FitInfo.is_positive == False
ChiSquared - Replicas with ChiSquared > nsigma_discard_chi2*StandardDev + Average
ArclengthX - Replicas with ArcLengthX > nsigma_discard_arclength*StandardDev + Average
Integrability - Replicas with IntegrabilityNumbers < integ_threshold
- validphys.fitveto.determine_vetoes(fitinfos: list, nsigma_discard_chi2: float, nsigma_discard_arclength: float, integ_threshold: float)[source]
Assesses whether replica fitinfo passes standard NNPDF vetoes. Returns a dictionary of vetoes and their passing boolean masks. Included in the dictionary is a ‘Total’ veto.
- validphys.fitveto.distribution_veto(dist, prior_mask, nsigma_threshold)[source]
For a given distribution (a list of floats), returns a boolean mask specifying the passing elements. The result is a new mask of the elements that satisfy:
value <= mean + nsigma_threshold*standard_deviation
Only points passing the prior_mask are considered in the average or standard deviation.
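The veto condition above can be sketched with the standard library alone. This is an illustration under stated assumptions, not the validphys code: the function name is hypothetical, the population standard deviation is assumed, and the early return for fewer than two passing points is a guess at a sensible edge-case behaviour.

```python
from statistics import fmean, pstdev


def distribution_veto_sketch(dist, prior_mask, nsigma_threshold):
    """Return a boolean mask of the elements of dist passing the veto."""
    # Only elements allowed by the prior mask enter the mean and the spread.
    passing = [value for value, keep in zip(dist, prior_mask) if keep]
    if len(passing) <= 1:
        # Assumption: with fewer than two points there is no spread to test against.
        return list(prior_mask)
    mean = fmean(passing)
    std = pstdev(passing)  # assumption: population standard deviation
    # The condition is one-sided: arbitrarily low values always pass.
    return [value <= mean + nsigma_threshold * std for value in dist]


if __name__ == "__main__":
    chi2_values = [1.0, 1.1, 0.9, 1.0, 5.0]
    everything = [True] * len(chi2_values)
    print(distribution_veto_sketch(chi2_values, everything, nsigma_threshold=1))
```

Because the outlier inflates both the mean and the standard deviation, a point that was already excluded by `prior_mask` is tested against a much tighter threshold than it would be if it were included in the statistics.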
validphys.fkparser module
This module implements parsers for FKtable and CFactor files into useful datastructures, contained in the validphys.coredata module, which can be easily pickled and interfaced with common Python libraries.
Most users will be interested in using the high level interface load_fktable(). Given a validphys.core.FKTableSpec object, it returns an instance of validphys.coredata.FKTableData, an object with the required information to compute a convolution, with the CFactors applied.
from validphys.fkparser import load_fktable
from validphys.loader import Loader
l = Loader()
fk = l.check_fktable(setname="ATLASTTBARTOT", theoryID=53, cfac=('QCD',))
res = load_fktable(fk)
- exception validphys.fkparser.BadCFactorError[source]
Bases:
Exception
Exception raised when a CFactor cannot be parsed correctly
- exception validphys.fkparser.BadFKTableError[source]
Bases:
Exception
Exception raised when an FKTable cannot be parsed correctly
- class validphys.fkparser.GridInfo(setname: str, hadronic: bool, ndata: int, nx: int)[source]
Bases:
object
Class containing the basic properties of an FKTable grid.
- validphys.fkparser.load_fktable(spec)[source]
Load the data corresponding to a FKSpec object. The cfactors will be applied to the grid. If we have a new-type fktable, call load() directly; otherwise fall back to the old parser.
- validphys.fkparser.open_fkpath(path)[source]
Return a file-like object from the fktable path, regardless of whether it is compressed
- Parameters
path (Path or str) – Path like file containing a valid FKTable. It can be either inside a tarball or in plain text.
- Returns
f – A file like object for further processing.
- Return type
file
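One way to return a file-like object regardless of compression is sketched below with the standard tarfile module. This is not the validphys implementation: the function name is hypothetical, and the sketch assumes a tarball holds exactly one regular file, whose contents are read into memory.

```python
import io
import tarfile


def open_maybe_tar_sketch(path):
    """Return a binary file-like object for a plain file or a one-member tarball."""
    if tarfile.is_tarfile(path):
        # Default mode 'r' detects gzip, bz2 or xz compression transparently.
        with tarfile.open(path) as archive:
            members = [member for member in archive.getmembers() if member.isfile()]
            if len(members) != 1:
                # Assumption of this sketch: one table per archive.
                raise ValueError(f"Expected exactly one file in {path}")
            extracted = archive.extractfile(members[0])
            # Copy into memory so the result outlives the closed archive.
            return io.BytesIO(extracted.read())
    return open(path, "rb")
```

Both branches return an object supporting `read()` and the context-manager protocol, so a downstream parser does not need to know which case applied.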
- validphys.fkparser.parse_cfactor(f)[source]
Parse an open byte stream into a CFactorData. Raise a BadCFactorError if problems are encountered.
- Parameters
f (file) – Binary file-like object
- Returns
cfac – An object containing the data on the cfactor for each point.
- Return type
- validphys.fkparser.parse_fktable(f)[source]
Parse an open byte stream into an FKTableData. Raise a BadFKTableError if problems are encountered.
- Parameters
f (file) – Open file-like object. See open_fkpath() to obtain it.
- Returns
fktable – An object containing the FKTable data and information.
- Return type
Notes
This function operates at the level of a single file, and therefore it does not apply CFactors (see load_fktable() for that) or handle operations within COMPOUND ensembles.
validphys.gridvalues module
gridvalues.py
Core functionality needed to obtain a set of values from LHAPDF. The tools for representing these grids are in pdfgrids.py (the validphys provider module), and the basis transformations are in pdfbases.py
- validphys.gridvalues.central_grid_values(pdf: PDF, flmat, xmat, qmat)[source]
Same as grid_values() but it returns only the central values. The return value is indexed as:
grid_values[replica][flavour][x][Q]
where the first dimension (corresponding to the central member of the PDF set) is always one.
- validphys.gridvalues.evaluate_luminosity(pdf_set: LHAPDFSet, n: int, s: float, mx: float, x1: float, x2: float, channel)[source]
Returns PDF luminosity at specified values of mx, x1, x2, sqrts**2 for a given channel.
pdf_set: The PDF set of interest.
s: The square of the center of mass energy, in GeV^2.
mx: The invariant mass bin, in GeV.
x1 and x2: The partonic x1 and x2.
channel: The channel tag name from LUMI_CHANNELS.
- validphys.gridvalues.grid_values(pdf: PDF, flmat, xmat, qmat)[source]
Evaluate x*f(x) on a grid of points in flavour, x and Q.
- Parameters
pdf (PDF) – Any PDF set
flmat (iterable) – A list of PDG IDs corresponding to the LHAPDF flavours in the grid.
xmat (iterable) – A list of x values
qmat (iterable) – A list of values in Q, expressed in GeV.
- Returns
A 4-dimension array with the PDF values at the input parameters for each replica. The return value is indexed as follows: grid_values[replica][flavour][x][Q]
See also
validphys.pdfbases.Basis.grid_values(), an interface allowing aliases
Examples
Compute the maximum difference across replicas between the d and dbar PDFs (times x) for x=0.5 and both Q=10 and Q=100:
>>> from validphys.loader import Loader
>>> from validphys.gridvalues import grid_values
>>> import numpy as np
>>> gv = grid_values(Loader().check_pdf('NNPDF31_nnlo_as_0118'), [-1, 1], [0.5], [10, 100])
>>> #Take the difference across the flavour dimension, the max
>>> #across the replica dimension, and leave the Q dimension untouched.
>>> np.diff(gv, axis=1).max(axis=0).ravel()
array([0.07904731, 0.04989902], dtype=float32)
validphys.hyper_algorithm module
This module contains functions dedicated to processing the json dictionaries
- validphys.hyper_algorithm.autofilter_dataframe(dataframe, keys, n_to_combine=1, n_to_kill=1, threshold=-1)[source]
Receives a dataframe and a list of keys. Creates combinations of n_to_combine keys and computes the reward. Finally removes from the dataframe the n_to_kill worst combinations.
Anything under threshold will be removed and will not count towards the n_to_kill (by default threshold = -1, so only things which are really bad will be removed)
- # Arguments:
dataframe: a pandas dataframe
keys: keys to combine
n_to_combine: how many keys do we want to combine
n_to_kill: how many combinations to kill
threshold: anything under this reward will be removed
- # Returns:
dataframe_sliced: a slice of the dataframe with the weakest combinations removed
- validphys.hyper_algorithm.bin_generator(df_values, max_n=10)[source]
Receives a dataframe with a list of unique values. If there are more than max_n of them and they are numeric, create max_n bins. If they are already discrete values or there are fewer than max_n options, output the same input
- # Arguments:
df_values: dataframe with unique values
max_n: maximum number of allowed different values
- # Returns:
new_vals: list of tuples with (initial, end) value of the bin
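The binning rule described above could look roughly as follows. This is a sketch only: it works on a plain list rather than a dataframe, the function name is hypothetical, and equal-width bins between the minimum and the maximum are an assumption, since the text does not say how the bin edges are chosen.

```python
from numbers import Number


def bin_generator_sketch(values, max_n=10):
    """Return the unique values unchanged, or max_n (initial, end) bins."""
    unique = list(dict.fromkeys(values))  # drop duplicates, keep first-seen order
    numeric = all(
        isinstance(value, Number) and not isinstance(value, bool) for value in unique
    )
    if not numeric or len(unique) <= max_n:
        # Discrete options, or few enough values: output the same input.
        return unique
    low, high = min(unique), max(unique)
    width = (high - low) / max_n  # assumption: equal-width bins
    edges = [low + index * width for index in range(max_n)] + [high]
    return list(zip(edges[:-1], edges[1:]))


if __name__ == "__main__":
    print(bin_generator_sketch(["Adam", "RMSprop", "Adam"]))
    print(bin_generator_sketch(range(0, 21), max_n=4))
```

Returning tuples only in the binned case means that callers have to handle two shapes of output, which matches the "(initial, end)" description of `new_vals` above.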
- validphys.hyper_algorithm.compute_reward(mdict, biggest_ntotal)[source]
Given a combination dictionary computes the reward function:
If the fail rate for this combination is above the fail threshold, the reward is -100
- The formula below for the reward takes into account:
The rate of ok fits that have a loss below the loss_threshold
The rate of fits that failed
The std deviation
How far away is the median from the best loss
How far away are median and average
- validphys.hyper_algorithm.dataframe_removal(dataframe, hit_list)[source]
Removes all combinations defined in hit_list from the dataframe. The hit list is a list of dictionaries containing the ‘slice’ key where ‘slice’ must be a slice of ‘dataframe’
- # Arguments:
dataframe: a pandas dataframe
hit_list: the list of element to remove
- # Returns:
new_dataframe: the same dataframe with all elements from hit_list removed
- validphys.hyper_algorithm.get_combinations(key_info, ncomb)[source]
Given a dictionary mapping keys to iterables of possible values (key_info), return a list of the product of all possible mappings of a subset of ncomb keys to single values out of the corresponding possible values, for all such subsets.
For instance, with
key_info = {‘key1’ : [val1-1, val1-2, …], ‘key2’ : [val2-1, val2-2, …]}
ncomb = 2
this will return a list of dictionaries:
[ {‘key1’ : val1-1, ‘key2’ : val2-1 … }, {‘key1’ : val1-1, ‘key2’ : val2-2 … }, {‘key1’ : val1-2, ‘key2’ : val2-1 … }, {‘key1’ : val1-2, ‘key2’ : val2-2 … }, ]
Get all combinations of ncomb elements for the keys and values given in the dictionary key_info.
- # Arguments:
key_info: dictionary with the possible values for each key
ncomb: elements to combine
- # Returns:
all_combinations: A list of dictionaries of parameters
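The behaviour described for get_combinations can be sketched with itertools. The function name below is hypothetical and this is an illustration of the description, not the library source: subsets of ncomb keys come from `combinations`, and the value assignments for each subset come from `product`.

```python
from itertools import combinations, product


def get_combinations_sketch(key_info, ncomb):
    """List every assignment of single values to every subset of ncomb keys."""
    all_combinations = []
    # Every subset of ncomb keys, in the insertion order of key_info.
    for keys in combinations(key_info, ncomb):
        value_lists = [key_info[key] for key in keys]
        # Every way of picking one value per key in this subset.
        for values in product(*value_lists):
            all_combinations.append(dict(zip(keys, values)))
    return all_combinations


if __name__ == "__main__":
    info = {"optimizer": ["Adam", "SGD"], "layers": [2, 3], "dropout": [0.0]}
    for combination in get_combinations_sketch(info, 2):
        print(combination)
```

With two keys of two values each and ncomb = 2 this yields the four dictionaries listed in the example above; with ncomb = 1 it yields one single-key dictionary per possible value.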
- validphys.hyper_algorithm.get_slice(dataframe, query_dict)[source]
Returns a slice of the dataframe where some keys match some values. query_dict must be a dictionary {key1 : value1, key2 : value2 …}
- # Arguments:
dataframe: a pandas dataframe
query_dict: a dictionary of combination as given by get_combinations
- validphys.hyper_algorithm.parse_keys(dataframe, keys)[source]
Receives a dataframe and a set of keys. Looks into the dataframe to read the possible values of the keys.
Returns a dictionary { ‘key’ : [possible values] }.
If the values are not discrete they need to be binned; this is done for any key with too many numerical values.
- # Arguments:
dataframe: a pandas dataframe
keys: keys to combine
- # Returns:
key_info: a dictionary with the possible values for each key
- validphys.hyper_algorithm.process_slice(df_slice)[source]
Function to process a slice into a dictionary with useful stats. If the slice is None it means the combination does not apply.
- # Arguments:
df_slice: a slice of a pandas dataframe
- # Returns:
proc_dict: a dictionary of stats
- validphys.hyper_algorithm.study_combination(dataframe, query_dict)[source]
Given a dataframe and a dictionary of {key1 : value1, key2: value2} returns a dictionary with a number of stats for that combination
- # Arguments:
dataframe: a pandas dataframe
query_dict: a dictionary for a combination as given by get_combinations
- # Returns:
proc_dict: a dictionary of the “statistics” for this combination
validphys.hyperoptplot module
Module for the parsing and plotting of the results and output of previous hyperparameter scans
- class validphys.hyperoptplot.HyperoptTrial(trial_dict, base_params=None, minimum_losses=1, linked_trials=None)[source]
Bases:
object
Hyperopt trial class. Makes the dictionary-like output of hyperopt into an object that can be easily managed
- Parameters
trial_dict (dict) – one single result (a dictionary) from a tries.json file
base_params (dict) – Base parameters of the runcard which can be used to complete the hyperparameter dictionary when not all parameters were scanned
minimum_losses (int) – Minimum number of losses to be found in the trial for it to be considered successful
linked_trials (list) – List of trials coming from the same file as this trial
- property loss
Return the loss of the hyperopt dict
- property params
Parameters for the fit
- property reward
Return and cache the reward value
- property weighted_reward
Return the reward weighted to the mean value of the linked trials
- validphys.hyperoptplot.best_setup(hyperopt_dataframe, hyperscan_config, commandline_args)[source]
Generates a clean table with information on the hyperparameter settings of the best setup.
- validphys.hyperoptplot.evaluate_trial(trial_dict, validation_multiplier, fail_threshold, loss_target)[source]
Read a trial dictionary and compute the true loss and decide whether the run passes or not
- validphys.hyperoptplot.filter_by_string(filter_string)[source]
Receives a data_dict (a parsed trial) and a filter string, returns True if the trial passes the filter
The filter string must have the format: key<operator>string where <operator> can be any of !=, =, >, <
- # Arguments:
filter_string: the expression to evaluate
- # Returns:
filter_function: a function that takes a data_dict and returns true if the condition in filter_string passes
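A closure built from a `key<operator>string` expression might be sketched as follows. The names, the regular expression and the value coercion (numbers are compared as floats, everything else as text) are assumptions made for this example; the validphys function may resolve keys and types differently.

```python
import operator
import re

# Hypothetical mapping from the documented operator symbols to comparisons.
_OPERATORS = {"!=": operator.ne, "=": operator.eq, ">": operator.gt, "<": operator.lt}
_FILTER_PATTERN = re.compile(r"^\s*([^!=<>]+?)\s*(!=|=|>|<)\s*(.+?)\s*$")


def _coerce(value):
    """Assumption: compare as numbers when possible, otherwise as text."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return str(value)


def filter_by_string_sketch(filter_string):
    """Turn 'key<operator>string' into a predicate acting on a data_dict."""
    match = _FILTER_PATTERN.match(filter_string)
    if match is None:
        raise ValueError(f"Cannot parse filter: {filter_string!r}")
    key, symbol, target = match.groups()
    compare = _OPERATORS[symbol]
    target_value = _coerce(target)

    def filter_function(data_dict):
        if key not in data_dict:
            return False
        value = _coerce(data_dict[key])
        if type(value) is not type(target_value):
            # A number never equals text, and ordering them is undefined.
            return symbol == "!="
        return compare(value, target_value)

    return filter_function


if __name__ == "__main__":
    keep = filter_by_string_sketch("loss<2.5")
    print(keep({"loss": 1.7}), keep({"loss": 3.1}))
```

Parsing once and returning a function keeps the per-trial check cheap when the same filter is applied to many parsed trials.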
- validphys.hyperoptplot.generate_dictionary(replica_path, loss_target, json_name='tries.json', starting_index=0, val_multiplier=0.5, fail_threshold=10.0)[source]
Reads a json file and returns a list of dictionaries
- # Arguments:
replica_path: folder in which the tries.json file can be found
starting_index: if the trials are to be added to an already existing set, make sure the id has the correct index!
val_multiplier: validation multiplier
fail_threshold: threshold for the loss to consider a configuration as a failure
- validphys.hyperoptplot.hyperopt_dataframe(commandline_args)[source]
Loads the data generated by running hyperopt and stored in json files into a dataframe, and then filters the data according to the selection criteria provided by the command line arguments. It then returns both the entire dataframe as well as a dataframe object with the hyperopt parameters of the best setup.
- validphys.hyperoptplot.hyperopt_table(hyperopt_dataframe)[source]
Generates a table containing complete information on all the tested setups that passed the filters set in the commandline arguments.
- validphys.hyperoptplot.order_axis(df, bestdf, key)[source]
Helper function for ordering the axis and making sure the best is always first
- validphys.hyperoptplot.parse_architecture(trial)[source]
This function parses the family of parameters which regards the architecture of the NN
number_of_layers
activation_per_layer
nodes_per_layer
l1, l2, l3, l4… max_layers
layer_type
dropout
initializer
- validphys.hyperoptplot.parse_optimizer(trial)[source]
This function parses the parameters that affect the optimization
optimizer
learning_rate (if it exists)
- validphys.hyperoptplot.parse_statistics(trial)[source]
Parse the statistical information of the trial
validation loss
testing loss
status of the run
- validphys.hyperoptplot.parse_stopping(trial)[source]
This function parses the parameters that affect the stopping
epochs
stopping_patience
pos_initial
pos_multiplier
- validphys.hyperoptplot.parse_trial(trial)[source]
Trials are very convoluted objects, heavily branched inside. The goal of this function is to separate said branching so we can create hierarchies.
- validphys.hyperoptplot.plot_activation_per_layer(hyperopt_dataframe)[source]
Generates a violin plot of the loss per activation function.
- validphys.hyperoptplot.plot_clipnorm(hyperopt_dataframe, optimizer_name)[source]
Generates a scatter plot of the loss as a function of the clipnorm for a given optimizer.
- validphys.hyperoptplot.plot_epochs(hyperopt_dataframe)[source]
Generates a scatter plot of the loss as a function of the number of epochs.
- validphys.hyperoptplot.plot_initializer(hyperopt_dataframe)[source]
Generates a violin plot of the loss per initializer.
- validphys.hyperoptplot.plot_iterations(hyperopt_dataframe)[source]
Generates a scatter plot of the loss as a function of the iteration index.
- validphys.hyperoptplot.plot_learning_rate(hyperopt_dataframe, optimizer_name)[source]
Generates a scatter plot of the loss as a function of the learning rate for a given optimizer.
- validphys.hyperoptplot.plot_number_of_layers(hyperopt_dataframe)[source]
Generates a violin plot of the loss as a function of the number of layers of the model.
validphys.kinematics module
Provides information on the kinematics involved in the data.
Uses the PLOTTING file specification.
- class validphys.kinematics.XQ2Map(experiment, commondata, fitted, masked, group)
Bases:
tuple
- commondata
Alias for field number 1
- experiment
Alias for field number 0
- fitted
Alias for field number 2
- group
Alias for field number 4
- masked
Alias for field number 3
- validphys.kinematics.all_commondata_grouping(all_commondata, metadata_group)[source]
Return a table with the grouping specified by metadata_group key for each dataset for all available commondata.
- validphys.kinematics.all_kinlimits_table(all_kinlimits, use_kinoverride: bool = True)[source]
Return a table with the kinematic limits for the datasets given as input in dataset_inputs. If the PLOTTING overrides are not used, the information on sqrt(k2) will be displayed.
- validphys.kinematics.describe_kinematics(commondata, titlelevel: int = 1)[source]
Output a markdown text describing the stored metadata for a given commondata.
titlelevel can be used to control the header level of the title.
- validphys.kinematics.kinematics_table(kinematics_table_notable)[source]
Same as kinematics_table_notable but writing the table to file
- validphys.kinematics.kinematics_table_notable(commondata, cuts, show_extra_labels: bool = False)[source]
Table containing the kinematics of a commondata object, indexed by their datapoint id. The kinematics will be transformed as per the PLOTTING file of the dataset or process type, and the column headers will be the labels of the variables defined in the metadata.
If show_extra_labels is True then extra labels defined in the PLOTTING files will be displayed. Otherwise only the original three kinematics will be shown.
- validphys.kinematics.kinlimits(commondata, cuts, use_cuts, use_kinoverride: bool = True)[source]
Return a mapping containing the number of fitted and used datapoints, as well as the label, minimum and maximum value for each of the three kinematics. If use_kinoverride is set to False, the PLOTTING files will be ignored and the kinematics will be interpreted based on the process type only. If use_cuts is ‘CutsPolicy.NOCUTS’, the information on the total number of points will be displayed, instead of the fitted ones.
validphys.lhaindex module
Created on Fri Jan 23 12:11:23 2015
@author: zah
- validphys.lhaindex.as_from_name(name)[source]
Annoying function needed because this is not in the info files. as(M_z) there is actually as(M_ref).
- validphys.lhaindex.expand_names(globstr)[source]
Return names of installed PDFs. If none is found, return names from index
- validphys.lhaindex.get_lha_datapath()[source]
Return an existing datapath from LHAPDF, starting from the end. If no path is found to exist, recover the old behaviour and return the last path.
The check for existence intends to solve problems where a previously filled LHAPATH or LHAPDF_DATA_PATH environment variable is pointing to a non-existent path or shared systems where LHAPDF might be compiled with hard-coded paths not available to all users.
validphys.lhapdf_compatibility module
Module for LHAPDF compatibility backends
If LHAPDF is installed, the module will transparently hand over everything to LHAPDF. If LHAPDF is not available, it will try to use a combination of the packages lhapdf-management and pdfflow, which cover all the features of LHAPDF used during the fit (and likely most of validphys).
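A generic way to pick a backend depending on what is installed is sketched below. It is not the module's actual implementation: first_available is a hypothetical helper, and the import names "lhapdf" and "pdfflow" used in the example call are assumptions about how the packages are exposed in Python.

```python
import importlib.util


def first_available(module_names):
    """Return the first top-level module name that can be imported, or None.

    Only checks whether the module can be found; nothing is imported here.
    """
    for name in module_names:
        if importlib.util.find_spec(name) is not None:
            return name
    return None


# Prefer LHAPDF, fall back to pdfflow (assumed import names).
# The result depends on the local environment and may be None.
backend = first_available(("lhapdf", "pdfflow"))
```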
- validphys.lhapdf_compatibility.make_pdf(pdf_name, member=None)[source]
Load a PDF. If member is given, load the single member; otherwise, load the entire set as a list.
If LHAPDF is provided, it returns LHAPDF PDF instances; otherwise it returns an object which is _compatible_ with LHAPDF for the lhapdf functions of the selected backend.
Parameters:
- pdf_name: str
name of the PDF to load
- member: int
index of the member of the PDF to load
Returns:
list(pdf_sets)
validphys.lhapdfset module
Module containing an LHAPDF class compatible with validphys using the official lhapdf python interface.
The .members and .central_member of the LHAPDFSet are LHAPDF objects (the typical output from mkPDFs) and can be used normally.
Examples
>>> from validphys.lhapdfset import LHAPDFSet
>>> pdf = LHAPDFSet("NNPDF40_nnlo_as_01180", "replicas")
>>> len(pdf.members)
101
>>> pdf.central_member.alphasQ(91.19)
0.11800
>>> pdf.members[0].xfxQ2(0.5, 15625)
{-5: 6.983360500601136e-05,
-4: 0.0021818063617227604,
-3: 0.00172453472243952,
-2: 0.0010906577230485718,
-1: 0.0022049272225017286,
1: 0.020051104853608722,
2: 0.0954139944889494,
3: 0.004116641378803191,
4: 0.002180124185625795,
5: 6.922722705177504e-05,
21: 0.007604124516892057}
- class validphys.lhapdfset.LHAPDFSet(name, error_type)[source]
Bases: object
Wrapper for the lhapdf python interface.
Once instantiated this class will load the PDF set from LHAPDF. If it is a T0 set only the CV will be loaded.
- property central_member
Returns a reference to member 0 of the PDF list
- property flavors
Returns the list of flavors accepted by the LHAPDF set
- grid_values(flavors: ndarray, xgrid: ndarray, qgrid: ndarray)[source]