Specifying data cuts
--------------------

The experimental ``CommonData`` files contain more data points than we
actually fit. Some data points are excluded for reasons such as the
instability of the perturbative expansion in their corresponding
kinematic regions.

There are five possibilities for handling the experimental cuts
within validphys, which are controlled with the ``use_cuts``
configuration setting:

``use_cuts: 'nocuts'``
  * This causes the content of the data files to be taken unmodified.
    Note that some theory predictions may be ill defined in this
    situation.

``use_cuts: 'fromfit'``
  * The cuts are read from the masks given as input to
    [``n3fit``](../n3fit/index.html), and generated by
    [``vp-setupfit``](scripts.html). An existing fit is required to
    load the cuts, and it must contain the masks for all the datasets
    analyzed in the active namespace.

``use_cuts: 'internal'``
  * Compute the cut masks as ``vp-setupfit`` would do. Currently the
    parameters ``q2min`` and ``w2min`` must be given. These can in turn
    be set to the same values as in the fit by loading the ``datacuts``
    namespace from the fit. In this case, the cuts will normally
    coincide with the ones loaded with the ``fromfit`` setting.

``use_cuts: 'fromintersection'``
  * Compute the internal cuts as per ``use_cuts: 'internal'``
    within each namespace in a [namespace list](#multiple-inputs-and-namespaces)
    called ``cuts_intersection_spec`` and take the intersection of the
    results as the cuts for the given dataset. This is useful, for
    example, for requiring the common subset of points that pass the
    cuts at NLO and NNLO.

``use_cuts: 'fromsimilarpredictions'``
  * Compute the intersection between two namespaces (as for
    ``fromintersection``) but additionally require that the
    predictions computed for each dataset across the namespaces are
    *similar*, specifically that the ratio between the absolute
    difference in the predictions and the total experimental
    uncertainty is smaller than a given value,
    ``cut_similarity_threshold``, which must be provided. Note that for
    this to work with different C-factors across the namespaces, one
    must provide a different ``dataset_inputs`` list for each.
  * This mechanism can be ignored selectively for specific datasets.
    To do that, add their names to a list called
    ``do_not_require_similarity_for``. The datasets in the list do not
    need to appear in the ``cuts_intersection_spec`` namespace and
    will be filtered according to the internal cuts unconditionally.

The following example demonstrates the first four options:

```yaml
meta:
    title: Test the various options for CutsPolicy
    author: Zahari Kassabov
    keywords: [test, debug]

fit: NNPDF40_nlo_as_01180

theory:
    from_: fit

theoryid:
    from_: theory

# Load q2min and w2min from the fit
datacuts:
    from_: fit

# Used for intersection cuts
cuts_intersection_spec:
    - theoryid: 208
    - theoryid: 162

dataset_input: {dataset: ATLASDY2D8TEV}

dataspecs:
  - speclabel: "No cuts"
    use_cuts: "nocuts"

  - speclabel: "Fit cuts"
    use_cuts: "fromfit"

  - speclabel: "Internal cuts"
    use_cuts: "internal"

  - speclabel: "Intersected cuts"
    use_cuts: "fromintersection"

template_text: |
    {@with fitpdf::datacuts@}
    # Plot

    {@fitpdf::datacuts plot_fancy_dataspecs@}

    # χ² plots

    {@with dataspecs@}
    ## {@speclabel@}

    {@plot_chi2dist@}

    {@endwith@}
    {@endwith@}

actions_:
    - report(main=True)
```

Here we put together the results with the different filtering policies
in a [data-theory comparison](data-theory-comp) plot and then plot the
χ² distribution for each one individually.
With these settings the latter three
[dataspecs](#general-data-specification-the-dataspec-api) give the
same result.

The following example demonstrates the use of `fromsimilarpredictions`:

```yaml
meta:
    title: "Test similarity cuts: Threshold 1,2"
    author: Zahari Kassabov
    keywords: [test]

show_total: True

NNLODatasts: &NNLODatasts
- {dataset: ATLAS_SINGLETOP_TCH_R_7TEV, frac: 1.0, cfac: [QCD]}                      # N
- {dataset: ATLAS_SINGLETOP_TCH_R_13TEV, frac: 1.0, cfac: [QCD]}                     # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP_NORM, frac: 1.0, cfac: [QCD]}        # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP_NORM, frac: 1.0, cfac: [QCD]}     # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP_NORM, frac: 0.75, cfac: [QCD]}       # N

NLODatasts: &NLODatasts
- {dataset: ATLAS_SINGLETOP_TCH_R_7TEV, frac: 1.0, cfac: []}                      # N
- {dataset: ATLAS_SINGLETOP_TCH_R_13TEV, frac: 1.0, cfac: []}                     # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_7TEV_T_RAP_NORM, frac: 1.0, cfac: []}        # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_7TEV_TBAR_RAP_NORM, frac: 1.0, cfac: []}     # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_8TEV_T_RAP_NORM, frac: 0.75, cfac: []}       # N
- {dataset: ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP_NORM, frac: 0.75, cfac: []}    # N

do_not_require_similarity_for: [ATLAS_SINGLETOP_TCH_DIFF_8TEV_TBAR_RAP_NORM]

dataset_inputs: *NLODatasts

cuts_intersection_spec:
    - theoryid: 208
      pdf: NNPDF40_nlo_as_01180
      dataset_inputs: *NLODatasts

    - theoryid: 200
      pdf: NNPDF40_nnlo_as_01180
      dataset_inputs: *NNLODatasts

theoryid: 208
pdf: NNPDF40_nlo_as_01180

dataspecs:
    - use_cuts: internal
      speclabel: "No cuts"

    - cut_similarity_threshold: 2
      speclabel: "Threshold 2"
      use_cuts: fromsimilarpredictions

    - cut_similarity_threshold: 1
      speclabel: "Threshold 1"
      use_cuts: fromsimilarpredictions

template_text: |
    {@dataspecs_chi2_table@}

actions_:
    - report(main=True)
```
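
The selection logic behind ``fromintersection`` and
``fromsimilarpredictions`` can be pictured with a minimal Python sketch.
This is not the validphys implementation: the function names
``intersect_cuts`` and ``similar_prediction_cuts`` are hypothetical, the
numbers are made up, and the sketch assumes that the criterion is applied
point by point with a strict inequality.

```python
def intersect_cuts(*index_sets):
    """Return the sorted point indices that survive every set of cuts."""
    common = set(index_sets[0])
    for indices in index_sets[1:]:
        common &= set(indices)
    return sorted(common)


def similar_prediction_cuts(indices, preds_a, preds_b, exp_unc, threshold):
    """Keep the points where |pred_a - pred_b| / exp_unc < threshold.

    Assumption: the ratio is evaluated independently for each data point.
    """
    return [
        i for i in indices
        if abs(preds_a[i] - preds_b[i]) / exp_unc[i] < threshold
    ]


# Toy inputs (illustrative values only, not real data).
nlo_cuts = [0, 1, 2, 3, 5]    # points passing the internal cuts at NLO
nnlo_cuts = [1, 2, 3, 4, 5]   # points passing the internal cuts at NNLO
nlo_preds = [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
nnlo_preds = [10.0, 10.5, 11.5, 13.0, 10.0, 10.2]
exp_unc = [1.0, 1.0, 1.0, 1.0, 1.0, 1.0]

# Analogue of use_cuts: fromintersection
common = intersect_cuts(nlo_cuts, nnlo_cuts)

# Analogue of use_cuts: fromsimilarpredictions with two thresholds
loose = similar_prediction_cuts(common, nlo_preds, nnlo_preds, exp_unc, threshold=2)
tight = similar_prediction_cuts(common, nlo_preds, nnlo_preds, exp_unc, threshold=1)

print(common, loose, tight)
```

Lowering the threshold can only remove points, which is why the
"Threshold 1" dataspec above is expected to keep a subset of the points
kept by "Threshold 2".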