How to include a theory covariance matrix in a fit
This section details how to include scale variation covariance matrices (covmats) in a PDF fit. At the present time this can only be done at next-to-leading order (NLO), for which the central theory is theory 163.
First, decide which theory covmat you want
Choose the desired point prescription listed here.
Each prescription comes with a point_prescription flag to include in the runcard, one of ["3 point", "5 point", "5bar point", "7 point", "9 point"].
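As a rough illustration of what the flag values stand for, the sketch below lists the scale multipliers (k_f, k_r) of the factorisation and renormalisation scales that are commonly associated with each prescription for a single process. It assumes variations by a factor of 2 around the central scales. The dictionary and the scale_points helper are hypothetical names used only for this example and are not part of the NNPDF code; the definitions in the point-prescription documentation take precedence.

```python
from itertools import product

# ASSUMPTION: single-process definitions, scales varied by a factor of 2.
# Each point is (k_f, k_r); (1.0, 1.0) is the central theory.
CENTRAL = (1.0, 1.0)
CORRELATED = {(2.0, 2.0), (0.5, 0.5)}          # both scales up or both down
ANTICORRELATED = {(2.0, 0.5), (0.5, 2.0)}      # one up, the other down
ONE_AT_A_TIME = {(2.0, 1.0), (0.5, 1.0), (1.0, 2.0), (1.0, 0.5)}

# Hypothetical lookup table, keyed by the runcard value of point_prescription.
POINT_PRESCRIPTIONS = {
    "3 point": {CENTRAL} | CORRELATED,
    "5 point": {CENTRAL} | ONE_AT_A_TIME,
    "5bar point": {CENTRAL} | CORRELATED | ANTICORRELATED,
    "7 point": {CENTRAL} | ONE_AT_A_TIME | CORRELATED,
    "9 point": set(product((0.5, 1.0, 2.0), repeat=2)),
}


def scale_points(point_prescription):
    """Return the sorted (k_f, k_r) points for a point_prescription value."""
    try:
        return sorted(POINT_PRESCRIPTIONS[point_prescription])
    except KeyError:
        valid = ", ".join(sorted(POINT_PRESCRIPTIONS))
        raise ValueError(
            f"Unknown point_prescription {point_prescription!r}; expected one of: {valid}"
        ) from None


if __name__ == "__main__":
    for name in POINT_PRESCRIPTIONS:
        print(name, scale_points(name))
```

A misspelled flag value, for example "3point" without the space, raises a ValueError in this sketch, which mirrors why the runcard value must match one of the listed strings exactly.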
Next, add necessary flags to the runcard
Remember to list the required datasets using dataset_inputs (see Data specification).
Add theorycovmatconfig to the runcard. An example is in the following code snippet:
############################################################
theory:
  theoryid: 163  # database id
theorycovmatconfig:
  point_prescription: "3 point"
  theoryids:
    from_: scale_variation_theories
  pdf: NNPDF31_nlo_as_0118
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true
############################################################
pdf is the PDF used to generate the scale-varied predictions from which the theory covmat is constructed. Choose something close to the PDF you are trying to fit, such as a previous iteration if available.
theoryids are necessary for the construction of the theory covmat. To avoid user error in entering them in the correct configuration and order, this is handled by the produce_scale_variation_theories action in config, using the information in the scalevariations module.
The flags use_thcovmat_in_fitting and use_thcovmat_in_sampling specify where to use the theory covmat in the code. There are two possible places: the fitting (i.e. \(\chi^2\) minimiser) and the sampling (i.e. pseudodata generation). The default is True for both.
Warning
Changing either of these to False will affect the fit outcome and should be avoided unless you know what you are doing.
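To make the role of the two flags more concrete, here is a toy sketch with two data points. It assumes the usual treatment in which the theory covmat S is added to the experimental covmat C, so that the \(\chi^2\) uses the inverse of C + S and the pseudodata are drawn from a Gaussian with covariance C + S. All numbers and function names are invented for illustration and do not come from the NNPDF code, which works with full covariance matrices.

```python
import math
import random


def add_matrices(a, b):
    """Element-wise sum of two 2x2 matrices given as nested lists."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def invert_2x2(m):
    """Inverse of a 2x2 matrix via the closed-form adjugate formula."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def chi2(data, theory, covmat):
    """chi2 = (D - T)^T covmat^{-1} (D - T) for two data points."""
    diff = [d - t for d, t in zip(data, theory)]
    inv = invert_2x2(covmat)
    return sum(diff[i] * inv[i][j] * diff[j] for i in range(2) for j in range(2))


def sample_pseudodata(data, covmat, rng):
    """Draw one pseudodata replica from N(data, covmat) using a 2x2 Cholesky factor."""
    l11 = math.sqrt(covmat[0][0])
    l21 = covmat[1][0] / l11
    l22 = math.sqrt(covmat[1][1] - l21 ** 2)
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    return [data[0] + l11 * z1, data[1] + l21 * z1 + l22 * z2]


# Toy inputs (illustrative values only).
data = [1.0, 2.0]
theory = [0.9, 2.2]
C = [[0.01, 0.0], [0.0, 0.04]]    # experimental covmat
S = [[0.01, 0.01], [0.01, 0.01]]  # theory covmat, fully correlated

total = add_matrices(C, S)
chi2_exp_only = chi2(data, theory, C)        # analogue of use_thcovmat_in_fitting: false
chi2_with_theory = chi2(data, theory, total)  # analogue of use_thcovmat_in_fitting: true
replica = sample_pseudodata(data, total, random.Random(0))  # analogue of use_thcovmat_in_sampling: true

if __name__ == "__main__":
    print(chi2_exp_only, chi2_with_theory, replica)
```

In this toy setup, adding S lowers the \(\chi^2\) for the same data and predictions, because the total uncertainty is larger. This is one way to see why switching either flag off changes the fit outcome.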
If you want to compare data to another fit
Sometimes we want to compare data to another fit for validation. For example, we might want to compare predictions for the NLO fit with missing higher order uncertainties (MHOUs) to the known NNLO fit (see Tests).
To make sure the cuts match between these two fits, edit the datacuts section of the runcard to include the following:
use_cuts: fromintersection
cuts_intersection_spec:
- theoryid: 163
- theoryid: 53
This ensures that the cuts on the data are the intersection of the cuts in theory 53 (default NNLO) and theory 163 (central scale variation NLO). See here for theory definitions.
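The idea of intersecting cuts can be sketched as a set intersection over the data points that survive the cuts in each theory. In the example below the surviving point indices are invented purely for illustration, and intersect_cuts is a hypothetical helper, not a function from the NNPDF code.

```python
def intersect_cuts(kept_points_by_theory):
    """Return, per dataset, the point indices that survive the cuts in every theory.

    kept_points_by_theory maps a theory id to a dict of
    {dataset name: set of data-point indices kept after cuts}.
    Datasets that are missing from any theory are dropped.
    """
    per_theory = list(kept_points_by_theory.values())
    common_datasets = set.intersection(*(set(kept) for kept in per_theory))
    return {
        dataset: set.intersection(*(set(kept[dataset]) for kept in per_theory))
        for dataset in sorted(common_datasets)
    }


# Invented indices: which points pass the cuts in theory 163 (NLO) and theory 53 (NNLO).
kept = {
    163: {"NMC": {0, 1, 2, 3}, "CMSJETS11": {0, 2, 4}},
    53: {"NMC": {1, 2, 3, 4}, "CMSJETS11": {0, 1, 2}},
}

common = intersect_cuts(kept)

if __name__ == "__main__":
    for dataset, points in common.items():
        print(dataset, sorted(points))
```

Only points kept in both theories remain, so both fits are evaluated on exactly the same data.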
Example runcard
The following is an example runcard for an NLO NNPDF3.1-style fit with a 3 point theory covmat. It can be found here.
#
# Configuration file for NNPDF++
#
##########################################################################################
description: Example runcard for NLO NNPDF3.1 style fit with 3pt theory covariance matrix
##########################################################################################
# frac: training fraction
# ewk: apply ewk k-factors
# sys: systematics treatment (see systypes)
dataset_inputs:
- {dataset: NMCPD, frac: 0.5}
- {dataset: NMC, frac: 0.5}
- {dataset: SLACP, frac: 0.5}
- {dataset: SLACD, frac: 0.5}
- {dataset: BCDMSP, frac: 0.5}
- {dataset: BCDMSD, frac: 0.5}
- {dataset: CHORUSNU, frac: 0.5}
- {dataset: CHORUSNB, frac: 0.5}
- {dataset: NTVNUDMN, frac: 0.5}
- {dataset: NTVNBDMN, frac: 0.5}
- {dataset: HERACOMBNCEM, frac: 0.5}
- {dataset: HERACOMBNCEP460, frac: 0.5}
- {dataset: HERACOMBNCEP575, frac: 0.5}
- {dataset: HERACOMBNCEP820, frac: 0.5}
- {dataset: HERACOMBNCEP920, frac: 0.5}
- {dataset: HERACOMBCCEM, frac: 0.5}
- {dataset: HERACOMBCCEP, frac: 0.5}
- {dataset: HERAF2CHARM, frac: 0.5}
- {dataset: CDFZRAP, frac: 1.0}
- {dataset: D0ZRAP, frac: 1.0}
- {dataset: D0WEASY, frac: 1.0}
- {dataset: D0WMASY, frac: 1.0}
- {dataset: ATLASWZRAP36PB, frac: 1.0}
- {dataset: ATLASZHIGHMASS49FB, frac: 1.0}
- {dataset: ATLASLOMASSDY11EXT, frac: 1.0}
- {dataset: ATLASWZRAP11, frac: 0.5}
- {dataset: ATLAS1JET11, frac: 0.5}
- {dataset: ATLASZPT8TEVMDIST, frac: 0.5}
- {dataset: ATLASZPT8TEVYDIST, frac: 0.5}
- {dataset: ATLASTTBARTOT, frac: 1.0}
- {dataset: ATLASTOPDIFF8TEVTRAPNORM, frac: 1.0}
- {dataset: CMSWEASY840PB, frac: 1.0}
- {dataset: CMSWMASY47FB, frac: 1.0}
- {dataset: CMSDY2D11, frac: 0.5}
- {dataset: CMSWMU8TEV, frac: 1.0}
- {dataset: CMSZDIFF12, frac: 1.0, cfac: [NRM]}
- {dataset: CMSJETS11, frac: 0.5}
- {dataset: CMSTTBARTOT, frac: 1.0}
- {dataset: CMSTOPDIFF8TEVTTRAPNORM, frac: 1.0}
- {dataset: LHCBZ940PB, frac: 1.0}
- {dataset: LHCBZEE2FB, frac: 1.0}
- {dataset: LHCBWZMU7TEV, frac: 1.0, cfac: [NRM]}
- {dataset: LHCBWZMU8TEV, frac: 1.0, cfac: [NRM]}
############################################################
datacuts:
  t0pdfset: 190310-tg-nlo-global  # PDF set to generate t0 covmat
  q2min: 13.96  # Q2 minimum
  w2min: 12.5  # W2 minimum
  combocuts: NNPDF31  # NNPDF3.0 final kin. cuts
  jetptcut_tev: 0  # jet pt cut for tevatron
  jetptcut_lhc: 0  # jet pt cut for lhc
  wptcut_lhc: 30.0  # Minimum pT for W pT diff distributions
  jetycut_tev: 1e30  # jet rap. cut for tevatron
  jetycut_lhc: 1e30  # jet rap. cut for lhc
  dymasscut_min: 0  # dy inv.mass. min cut
  dymasscut_max: 1e30  # dy inv.mass. max cut
  jetcfactcut: 1e30  # jet cfact. cut
  use_cuts: fromintersection
  cuts_intersection_spec:
    - theoryid: 163
    - theoryid: 53
############################################################
theory:
  theoryid: 163  # database id
theorycovmatconfig:
  point_prescription: "3 point"
  theoryids:
    from_: scale_variation_theories
  fivetheories: None
  pdf: NNPDF31_nlo_as_0118
  use_thcovmat_in_fitting: true
  use_thcovmat_in_sampling: true
############################################################
fitting:
  seed: 65532133530  # set the seed for the random generator
  genrep: on  # on = generate MC replicas, off = use real data
  rngalgo: 0  # 0 = ranlux, 1 = cmrg, see randomgenerator.cc
  fitmethod: NGA  # Minimization algorithm
  ngen: 30000  # Maximum number of generations
  nmutants: 80  # Number of mutants for replica
  paramtype: NN
  nnodes: [2, 5, 3, 1]
  # NN23(QED) = sng=0,g=1,v=2,t3=3,ds=4,sp=5,sm=6,(pht=7)
  # EVOL(QED) = sng=0,g=1,v=2,v3=3,v8=4,t3=5,t8=6,(pht=7)
  # EVOLS(QED)= sng=0,g=1,v=2,v8=4,t3=4,t8=5,ds=6,(pht=7)
  # FLVR(QED) = g=0, u=1, ubar=2, d=3, dbar=4, s=5, sbar=6, (pht=7)
  fitbasis: NN31IC  # EVOL (7), EVOLQED (8), etc.
  basis:
    # remember to change the name of PDF accordingly with fitbasis
    # pos: on for NN squared
    # mutsize: mutation size
    # mutprob: mutation probability
    # smallx, largex: preprocessing ranges
    - {fl: sng, pos: off, mutsize: [15], mutprob: [0.05], smallx: [1.046, 1.188], largex: [
        1.437, 2.716]}
    - {fl: g, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.9604, 1.23], largex: [
        0.08459, 6.137]}
    - {fl: v, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5656, 0.7242], largex: [
        1.153, 2.838]}
    - {fl: v3, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.1521, 0.5611], largex: [
        1.236, 2.976]}
    - {fl: v8, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5264, 0.7246], largex: [
        0.6919, 3.198]}
    - {fl: t3, pos: off, mutsize: [15], mutprob: [0.05], smallx: [-0.3687, 1.459], largex: [
        1.664, 3.373]}
    - {fl: t8, pos: off, mutsize: [15], mutprob: [0.05], smallx: [0.5357, 1.267], largex: [
        1.433, 2.866]}
    - {fl: cp, pos: off, mutsize: [15], mutprob: [0.05], smallx: [-0.09635, 1.204],
        largex: [1.654, 7.456]}
############################################################
stopping:
  stopmethod: LOOKBACK  # Stopping method
  lbdelta: 0  # Delta for look-back stopping
  mingen: 0  # Minimum number of generations
  window: 500  # Window for moving average
  minchi2: 3.5  # Minimum chi2
  minchi2exp: 6.0  # Minimum chi2 for experiments
  nsmear: 200  # Smear for stopping
  deltasm: 200  # Delta smear for stopping
  rv: 2  # Ratio for validation stopping
  rt: 0.5  # Ratio for training stopping
  epsilon: 1e-6  # Gradient epsilon
############################################################
positivity:
  posdatasets:
    - {dataset: POSF2U, poslambda: 1e6}  # Positivity Lagrange Multiplier
    - {dataset: POSF2DW, poslambda: 1e6}
    - {dataset: POSF2S, poslambda: 1e6}
    - {dataset: POSFLL, poslambda: 1e6}
    - {dataset: POSDYU, poslambda: 1e10}
    - {dataset: POSDYD, poslambda: 1e10}
    - {dataset: POSDYS, poslambda: 1e10}
############################################################
closuretest:
  filterseed: 0  # Random seed to be used in filtering data partitions
  fakedata: off  # on = to use FAKEPDF to generate pseudo-data
  fakepdf: MSTW2008nlo68cl  # Theory input for pseudo-data
  errorsize: 1.0  # uncertainties rescaling
  fakenoise: off  # on = to add random fluctuations to pseudo-data
  rancutprob: 1.0  # Fraction of data to be included in the fit
  rancutmethod: 0  # Method to select rancutprob data fraction
  rancuttrnval: off  # 0(1) to output training(validation) chi2 in report
  printpdf4gen: off  # To print info on PDFs during minimization
############################################################
lhagrid:
  nx: 150
  xmin: 1e-9
  xmed: 0.1
  xmax: 1.0
  nq: 50
  qmax: 1e5
############################################################
debug: off