Specifying data cuts
The experimental CommonData files contain more data points than we
actually fit. Some data points are excluded for reasons such as the
instability of the perturbative expansion in their corresponding
kinematic regions.
There are five possibilities for handling the experimental cuts
within validphys, which are controlled with the use_cuts
configuration setting:
use_cuts: 'nocuts'
This causes the content of the data files to be taken unmodified. Note that some theory predictions may be ill-defined in this situation.
use_cuts: 'fromfit'
The cuts are read from the masks generated by vp-setupfit and given as input to n3fit. An existing fit is required in order to load the cuts, and it must contain the masks for all the datasets analyzed in the active namespace.
use_cuts: 'internal'
Compute the cut masks as vp-setupfit would do. Currently the parameters q2min and w2min must be given. These can in turn be set to the same values as in the fit by loading the datacuts namespace from the fit. In this case, the cuts will normally coincide with the ones loaded with the fromfit setting.
use_cuts: 'fromintersection'
Compute the internal cuts as per use_cuts: 'internal' within each namespace in a [namespace list](#multiple-inputs-and-namespaces) called cuts_intersection_spec and take the intersection of the results as the cuts for the given dataset. This is useful, for example, for requiring the common subset of points that pass the cuts at NLO and NNLO.
use_cuts: 'fromsimilarpredictions'
Compute the intersection between two namespaces (as for fromintersection) but additionally require that the predictions computed for each dataset across the namespaces are similar. Specifically, the ratio between the absolute difference in the predictions and the total experimental uncertainty must be smaller than a given value, cut_similarity_threshold, which must be provided. Note that for this to work with different C-factors across the namespaces, one must provide a different dataset_inputs list for each.
This mechanism can be ignored selectively for specific datasets. To do that, add their names to a list called do_not_require_similarity_for. The datasets in the list do not need to appear in the cuts_intersection_spec namespace and will be filtered according to the internal cuts unconditionally.
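The intersection step behind fromintersection can be illustrated with a minimal sketch. It is not the validphys implementation: the function name intersect_cuts and the representation of a mask as a list of integer point indices are assumptions made for illustration only.

```python
def intersect_cuts(masks):
    """Return the sorted indices of the points that pass the cuts everywhere.

    `masks` is a sequence of iterables; each one holds the indices of the
    data points that survive the cuts in one namespace (hypothetical
    representation, chosen for this sketch).
    """
    index_sets = [set(mask) for mask in masks]
    if not index_sets:
        raise ValueError("At least one mask is required")
    return sorted(set.intersection(*index_sets))


# Hypothetical masks for a single dataset, computed with an NLO and an
# NNLO theory respectively.
nlo_mask = [0, 1, 2, 4, 5, 7]
nnlo_mask = [1, 2, 3, 4, 7, 8]

common = intersect_cuts([nlo_mask, nnlo_mask])
print(common)  # [1, 2, 4, 7]
```

Only the points kept in every namespace survive, so adding more namespaces to the list can only shrink the final mask or leave it unchanged.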
The following example demonstrates the first four options:
meta:
    title: Test the various options for CutsPolicy
    author: Zahari Kassabov
    keywords: [test, debug]
fit: NNPDF40_nlo_as_01180
theory:
    from_: fit
theoryid:
    from_: theory
#Load q2min and w2min from the fit
datacuts:
    from_: fit
# Used for intersection cuts
cuts_intersection_spec:
    - theoryid: 208
    - theoryid: 162
dataset_input: {dataset: ATLASDY2D8TEV}
dataspecs:
    - speclabel: "No cuts"
      use_cuts: "nocuts"
    - speclabel: "Fit cuts"
      use_cuts: "fromfit"
    - speclabel: "Internal cuts"
      use_cuts: "internal"
    - speclabel: "Intersected cuts"
      use_cuts: "fromintersection"
template_text: |
    {@with fitpdf::datacuts@}
    # Plot
    {@fitpdf::datacuts plot_fancy_dataspecs@}
    # χ² plots
    {@with dataspecs@}
    ## {@speclabel@}
    {@plot_chi2dist@}
    {@endwith@}
    {@endwith@}
actions_:
    - report(main=True)
Here we put together the results with the different filtering policies in a [data-theory comparison](data-theory-comp) plot and then plot the χ² distribution for each one individually. With these settings the latter three [dataspecs](#general-data-specification-the-dataspec-api) give the same result.
The following example demonstrates the use of fromsimilarpredictions:
meta:
    title: "Test similarity cuts: Threshold 1,2"
    author: Zahari Kassabov
    keywords: [test]
show_total: True
NNLODatasts: &NNLODatasts
    - {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N
NLODatasts: &NLODatasts
    - {dataset: ATLAS_SINGLETOP_7TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_13TEV_TCHANNEL-XSEC, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_7TEV_T-Y-NORM, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_7TEV_TBAR-Y-NORM, frac: 1.0, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_8TEV_T-RAP-NORM, frac: 0.75, variant: legacy} # N
    - {dataset: ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM, frac: 0.75, variant: legacy} # N
do_not_require_similarity_for: [ATLAS_SINGLETOP_8TEV_TBAR-RAP-NORM]
dataset_inputs: *NLODatasts
cuts_intersection_spec:
    - theoryid: 208
      pdf: NNPDF40_nlo_as_01180
      dataset_inputs: *NLODatasts
    - theoryid: 200
      pdf: NNPDF40_nnlo_as_01180
      dataset_inputs: *NNLODatasts
theoryid: 208
pdf: NNPDF40_nlo_as_01180
dataspecs:
    - use_cuts: internal
      speclabel: "No cuts"
    - cut_similarity_threshold: 2
      speclabel: "Threshold 2"
      use_cuts: fromsimilarpredictions
    - cut_similarity_threshold: 1
      speclabel: "Threshold 1"
      use_cuts: fromsimilarpredictions
template_text: |
    {@dataspecs_chi2_table@}
actions_:
    - report(main=True)
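To see how the two thresholds in this example select different points, the similarity criterion can be sketched as follows. This is only an illustration of the rule described for fromsimilarpredictions (the absolute difference in the predictions divided by the total experimental uncertainty must be smaller than cut_similarity_threshold). The function name similar_prediction_mask and all numbers are hypothetical, and how validphys builds the total uncertainty is not shown here.

```python
def similar_prediction_mask(pred_a, pred_b, total_unc, threshold):
    """Return the indices i where |pred_a[i] - pred_b[i]| / total_unc[i] < threshold.

    All three inputs are sequences with one entry per data point
    (hypothetical layout, chosen for this sketch).
    """
    if not (len(pred_a) == len(pred_b) == len(total_unc)):
        raise ValueError("All inputs must have one entry per data point")
    return [
        i
        for i, (a, b, unc) in enumerate(zip(pred_a, pred_b, total_unc))
        if abs(a - b) / unc < threshold
    ]


# Made-up predictions in two namespaces and made-up uncertainties.
nlo_predictions = [10.0, 20.0, 30.0, 40.0]
nnlo_predictions = [10.5, 23.0, 30.0, 46.0]
total_uncertainty = [1.0, 2.0, 3.0, 2.0]
# The ratios are 0.5, 1.5, 0.0 and 3.0 respectively.

loose = similar_prediction_mask(
    nlo_predictions, nnlo_predictions, total_uncertainty, threshold=2
)
tight = similar_prediction_mask(
    nlo_predictions, nnlo_predictions, total_uncertainty, threshold=1
)
print(loose)  # [0, 1, 2]
print(tight)  # [0, 2]
```

A smaller threshold keeps fewer points, so the dataspec with threshold 1 can only retain a subset of the points retained with threshold 2.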