n3fit package

Subpackages

Submodules

n3fit.checks module

This module contains checks to be performed by n3fit on the input

n3fit.checks.can_run_multiple_replicas(replicas, parallel_models)[source]

Warns the user if trying to run just one replica in parallel

n3fit.checks.check_basis_with_layers(basis, parameters)[source]

Check that the last layer matches the number of flavours defined in the runcard

n3fit.checks.check_consistent_basis(sum_rules, fitbasis, basis, theoryid)[source]

Checks the fitbasis setup for inconsistencies:

  • Checks the sum rules can be imposed

  • Correct flavours for the selected basis

  • Correct ranges (min < max) for the small and large-x exponents
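The exponent-range condition can be pictured with a minimal sketch. This is not the n3fit implementation: `sketch_check_exponent_ranges` is a hypothetical helper, and it only assumes that each basis entry carries `fl`, `smallx` and `largex` keys, as in the vpinterface examples on this page.

```python
def sketch_check_exponent_ranges(basis):
    """Hypothetical helper: list the (flavour, key) pairs whose range is not min < max."""
    bad = []
    for entry in basis:
        for key in ("smallx", "largex"):
            low, high = entry[key]
            if not low < high:
                bad.append((entry["fl"], key))
    return bad


basis = [
    {"fl": "u", "smallx": [1, 2], "largex": [0, 1]},
    {"fl": "d", "smallx": [2, 1], "largex": [0, 1]},  # inverted range on purpose
]
problems = sketch_check_exponent_ranges(basis)
```

An empty list means every range is ordered correctly. The real check may report problems differently, for instance by raising an error.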

n3fit.checks.check_consistent_layers(parameters)[source]

Checks that all layers have an activation function defined

n3fit.checks.check_consistent_parallel(parameters, parallel_models, same_trvl_per_replica)[source]

Checks whether the multiple-replica fit options are consistent with each other, i.e., that the trvl seed is fixed and the layer type is correct

n3fit.checks.check_correct_partitions(kfold, data)[source]

Ensures that all experiments in all partitions are included in the fit definition

n3fit.checks.check_deprecated_options(fitting)[source]

Checks whether the runcard is using deprecated options

n3fit.checks.check_dropout(parameters)[source]

Checks the dropout setup (positive and smaller than 1.0)

n3fit.checks.check_existing_parameters(parameters)[source]

Check that non-optional parameters are defined and are not empty

n3fit.checks.check_fiatlux_pdfs_id(replicas, fiatlux, replica_path)[source]
n3fit.checks.check_hyperopt_architecture(architecture)[source]

Checks whether the scanning setup for the NN architecture works:

  • Initializers are valid

  • Dropout setup is valid

  • No ‘min’ is greater than its corresponding ‘max’

n3fit.checks.check_hyperopt_positivity(positivity_dict)[source]

Checks that the positivity multiplier and initial values are sensible and valid

n3fit.checks.check_hyperopt_stopping(stopping_dict)[source]

Checks that the options selected for the stopping are consistent

n3fit.checks.check_initializer(initializer)[source]

Checks whether the initializer is implemented

n3fit.checks.check_kfold_options(kfold)[source]

Warns the user about potential bugs in the kfold setup

n3fit.checks.check_lagrange_multipliers(parameters, key)[source]

Checks the parameters in a lagrange multiplier dictionary are correct, e.g. for positivity and integrability

n3fit.checks.check_model_file(save, load)[source]

Checks whether the model_files given in the runcard are acceptable

n3fit.checks.check_optimizer(optimizer_dict)[source]

Checks whether the optimizer setup is valid

n3fit.checks.check_stopping(parameters)[source]

Checks whether the stopping-related options are sane: stopping patience as a ratio between 0 and 1 and positive number of epochs
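The two conditions named above can be restated as a minimal sketch. The function name `sketch_check_stopping` and the dictionary keys `stopping_patience` and `epochs` are assumptions made for illustration. The real check may use different keys and a different error type.

```python
def sketch_check_stopping(parameters):
    """Hypothetical restatement of the two conditions; raises ValueError on failure."""
    patience = parameters.get("stopping_patience")  # assumed key name
    if patience is not None and not 0.0 <= patience <= 1.0:
        raise ValueError(f"stopping_patience must be a ratio in [0, 1], got {patience}")
    epochs = parameters.get("epochs")  # assumed key name
    if epochs is not None and epochs < 1:
        raise ValueError(f"the number of epochs must be positive, got {epochs}")


sketch_check_stopping({"stopping_patience": 0.1, "epochs": 1000})  # passes silently
```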

n3fit.checks.check_sumrules(sum_rules)[source]

Checks that the chosen option for the sum rules is sensible

n3fit.checks.check_tensorboard(tensorboard)[source]

Check that the tensorboard callback can be enabled correctly

n3fit.checks.wrapper_check_NN(basis, tensorboard, save, load, parameters)[source]

Wrapper function for all NN-related checks

n3fit.checks.wrapper_hyperopt(hyperopt, hyperscan_config, kfold, data)[source]

Wrapper function for all hyperopt-related checks. No check is performed if hyperopt is not active

n3fit.model_gen module

n3fit.model_trainer module

n3fit.msr module

n3fit.n3fit_checks_provider module

This module contains a checks provider to be used by n3fit apps

n3fit.n3fit_checks_provider.n3fit_checks_action(*, genrep, data, theoryid, basis, fitbasis, sum_rules=True, parameters, save=None, load=None, hyperscan_config=None, hyperopt=None, kfold=None, tensorboard=None, parallel_models=False, same_trvl_per_replica=False)[source]

n3fit.performfit module

Fit action controller

n3fit.performfit.performfit(*, n3fit_checks_action, replicas, replicas_nnseed_fitting_data_dict, posdatasets_fitting_pos_dict, integdatasets_fitting_integ_dict, theoryid, fiatlux, basis, fitbasis, sum_rules=True, parameters, replica_path, output_path, save=None, load=None, hyperscanner=None, hyperopt=None, kfold_parameters, tensorboard=None, debug=False, maxcores=None, parallel_models=False)[source]

This action will (upon having read a validcard) process a full PDF fit for a set of replicas.

The input to this function is provided by validphys and/or defined in the runcards or commandline arguments.

This controller is provided with:

  1. Seeds generated using the replica number and the seeds defined in the runcard.

  2. Loaded datasets with replicas generated.

    2.1 Loaded positivity/integrability sets.

The workflow of this controller is as follows:

  1. Generate a ModelTrainer object holding information to create the NN and perform a fit (at this point no NN object has been generated).

    1.1 (if hyperopt) Generate the hyperopt scanning dictionary, taking as a base the fitting dictionary and the runcard’s hyperscanner dictionary.

  2. Pass the dictionary of parameters to ModelTrainer for the NN to be generated and the fit performed.

    2.1 (if hyperopt) Repeat step 2 for the hyperopt number of times.

  3. Once the fit is finished, output the PDF grid and accompanying files.
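The control flow of the workflow can be sketched with a toy stand-in. Everything here is hypothetical (`ToyTrainer`, `sketch_workflow`, the `epochs` key). The sketch shows only the order of the steps. It does not sample hyperparameters per trial and is not how ModelTrainer is actually driven.

```python
class ToyTrainer:
    """Hypothetical stand-in for ModelTrainer; it only records the calls it receives."""

    def __init__(self):
        self.trials = []  # no NN exists yet at construction time

    def fit(self, parameters):
        # A real trainer would generate the NN and run the fit here.
        self.trials.append(dict(parameters))
        return len(self.trials)


def sketch_workflow(parameters, hyperopt=None):
    trainer = ToyTrainer()  # step 1: create the trainer
    n_trials = hyperopt if hyperopt else 1
    for _ in range(n_trials):  # step 2, repeated when hyperopt is set
        trainer.fit(parameters)
    return trainer.trials  # step 3 would write the PDF grid instead


single = sketch_workflow({"epochs": 10})
scanned = sketch_workflow({"epochs": 10}, hyperopt=3)
```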

Parameters:
  • genrep (bool) – Whether or not to generate MC replicas. (Only used for checks)

  • data (validphys.core.DataGroupSpec) – containing the datasets to be included in the fit. (Only used for checks)

  • replicas_nnseed_fitting_data_dict (list[tuple]) – list with element for each replica (typically just one) to be fitted. Each element is a tuple containing the replica number, nnseed and fitted_data_dict containing all of the data, metadata for each group of datasets which is to be fitted.

  • posdatasets_fitting_pos_dict (list[dict]) – list of dictionaries containing all data and metadata for each positivity dataset

  • integdatasets_fitting_integ_dict (list[dict]) – list of dictionaries containing all data and metadata for each integrability dataset

  • theoryid (validphys.core.TheoryIDSpec) – Theory which is used to generate theory predictions from model during fit. Object also contains some metadata on the theory settings.

  • fiatlux (dict) – dictionary containing the params needed from LuxQED

  • basis (list[dict]) – preprocessing information for each flavour to be fitted.

  • fitbasis (str) – Valid basis which the fit is to be run in. Available bases can be found in validphys.pdfbases.

  • sum_rules (bool) – Whether to impose sum rules in fit. By default set to True

  • parameters (dict) – Mapping containing parameters which define the network architecture/fitting methodology.

  • replica_path (pathlib.Path) – path to the output of this run

  • output_path (str) – name of the fit

  • save (None, str) – model file where weights will be saved, used in conjunction with load.

  • load (None, str) – model file from which to load weights.

  • hyperscanner (dict) – dictionary containing the details of the hyperscanner

  • hyperopt (int) – if given, number of hyperopt iterations to run

  • kfold_parameters (None, dict) – dictionary with kfold settings used in hyperopt.

  • tensorboard (None, dict) – mapping containing tensorboard settings if it is to be used. By default it is None and tensorboard is not enabled.

  • debug (bool) – activate some debug options

  • maxcores (int) – maximum number of (logical) cores that the backend should be aware of

  • parallel_models (bool) – whether to run models in parallel

n3fit.scaler module

n3fit.scaler.generate_scaler(input_list: List[ndarray[Any, dtype[ScalarType]]], interpolation_points: Optional[int] = None) Callable[source]

Generate the scaler function that applies feature scaling to the input data.

Parameters:
  • input_list (list of numpy.ndarray) – The list of input data arrays.

  • interpolation_points (int, optional) –

Returns:

_scaler – The scaler function that applies feature scaling to the input data.

Return type:

Callable
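As an illustration of what a scaler factory of this shape can look like, here is a minimal min-max sketch. It is an assumption made for illustration, not the n3fit implementation. `sketch_generate_scaler` is a hypothetical name, and the `interpolation_points` argument is left out because its behaviour is not described above.

```python
import numpy as np


def sketch_generate_scaler(input_list):
    """Hypothetical min-max scaler factory; not the n3fit implementation."""
    flat = np.concatenate([np.asarray(x).ravel() for x in input_list])
    low, high = flat.min(), flat.max()

    def _scaler(x):
        # Map the global range [low, high] linearly onto [-1, 1].
        return 2.0 * (np.asarray(x) - low) / (high - low) - 1.0

    return _scaler


scaler = sketch_generate_scaler([np.array([0.1, 0.5]), np.array([0.3, 0.9])])
scaled = scaler(np.array([0.1, 0.5, 0.9]))
```

Returning a closure keeps the range computed from all inputs fixed, so the same transformation is applied to every array passed in later.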

n3fit.stopping module

Module containing the classes related to the stopping algorithm

This module contains the following classes:

  • FitState: this class contains the information of the fit for a given point in history

  • FitHistory: this class contains the information necessary in order to reset the state of the fit to the point in which the history was saved, i.e., a list of FitStates

  • Stopping: this class monitors the chi2 of the validation and training sets and decides when to stop

  • Positivity: decides whether a given point fulfills the positivity conditions

  • Validation: controls the NNPDF cross-validation algorithm

Note

There are situations in which the validation set is empty; in those cases the training set is used as validation set. This implies several changes in the behaviour of this class as the training chi2 will now be monitored for stability.

In order to parse the set of loss functions coming from the backend::MetaModel, the function parse_losses relies on the fact that they are all suffixed with _loss; the validation case, instead, is suffixed with val_loss. In the particular case in which both training and validation model correspond to the same backend::MetaModel only the _loss suffix can be found. This is taken into account by the class Stopping, which will tell Validation that no validation set was found and that the training is to be used instead.

class n3fit.stopping.FitHistory(tr_ndata, vl_ndata)[source]

Bases: object

Keeps a list of FitState items holding the full chi2 history of the fit.

Parameters:
  • tr_ndata (dict) – dictionary of {dataset: n_points} for the training data

  • vl_ndata (dict) – dictionary of {dataset: n_points} for the validation data

get_state(epoch)[source]

Get the FitState of the system for a given epoch

register(epoch, training_info, validation_info)[source]

Saves a new FitState and updates the current final epoch

Parameters:
  • epoch (int) – the current epoch of the fit

  • training_info (dict) – all losses for the training model

  • validation_info (dict) – all losses for the validation model

Return type:

FitState

class n3fit.stopping.FitState(training_info, validation_info)[source]

Bases: object

Holds the state of the chi2 during the fit, for all replicas and one epoch

Note: the training chi2 is computed before the update of the weights, so it is the chi2 that informed the update corresponding to this state. The validation chi2 instead is computed after the update of the weights.

Parameters:
  • training_info (dict) – all losses for the training model

  • validation_info (dict) – all losses for the validation model

property all_tr_chi2
all_tr_chi2_for_replica(i_replica)[source]

Return the tr chi2 per dataset for a given replica

property all_vl_chi2
all_vl_chi2_for_replica(i_replica)[source]

Return the vl chi2 per dataset for a given replica

total_partial_tr_chi2()[source]

Return the tr chi2 summed over replicas per experiment

total_partial_vl_chi2()[source]

Return the vl chi2 summed over replicas per experiment

total_tr_chi2()[source]

Return the total tr chi2 summed over replicas

total_vl_chi2()[source]

Return the total vl chi2 summed over replicas

property tr_chi2
property tr_loss

Return the total training loss as it comes from the info dictionaries

tr_ndata = None
property vl_chi2
property vl_loss

Return the total validation loss as it comes from the info dictionaries

vl_ndata = None
vl_suffix = None
class n3fit.stopping.Positivity(threshold, positivity_sets)[source]

Bases: object

Controls the positivity requirements.

To check whether positivity passes, this class inspects the history of the fit, since the fit includes the positivity sets. If the sum of all positivity sets losses is above a certain value the model is not accepted and the training continues.

Parameters:
  • threshold (float) – maximum value allowed for the sum of all positivity losses

  • positivity_sets (list) – list of positivity datasets

check_positivity(history_object)[source]

This function receives a history object and loops over the positivity_sets to check the value of the positivity loss.

If the positivity loss is above the threshold, positivity fails; otherwise, it passes. It returns an array of booleans which are True where positivity passed, i.e., where history_object[key_loss] < self.threshold.

Parameters:
  • history_object (dict) – dictionary of entries in the form {‘name’: loss}, output of a MetaModel .fit()
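The per-replica comparison against the threshold can be sketched as follows. `sketch_check_positivity` is a hypothetical helper, the key naming `<set>_loss` is assumed from the `_loss` suffix convention described in this module, and the loss values are made up.

```python
import numpy as np


def sketch_check_positivity(history_object, positivity_sets, threshold):
    """Hypothetical helper: one boolean per replica, True where positivity passes."""
    passes = None
    for name in positivity_sets:
        key_loss = f"{name}_loss"  # assumed naming of the loss entries
        ok = np.asarray(history_object[key_loss]) < threshold
        passes = ok if passes is None else passes & ok
    return passes


history = {
    "POS_A_loss": np.array([1e-8, 5e-3]),  # one entry per replica (made-up values)
    "POS_B_loss": np.array([2e-7, 1e-9]),
}
status = sketch_check_positivity(history, ["POS_A", "POS_B"], threshold=1e-6)
```

In this toy input the first replica passes every positivity set, while the second fails `POS_A` and is therefore vetoed.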

class n3fit.stopping.Stopping(validation_model, all_data_dicts, pdf_models, threshold_positivity=1e-06, total_epochs=0, stopping_patience=7000, threshold_chi2=10.0, dont_stop=False)[source]

Bases: object

Driver of the stopping algorithm

Note, if the total number of points in the validation dictionary is None, it is assumed the validation_model actually corresponds to the training model.

Parameters:
  • validation_model (n3fit.backends.MetaModel) – the model with the validation mask applied (and compiled with the validation data and covmat)

  • all_data_dicts (dict) – list containing all dictionaries with the information about the experiments/validation/regularizers/etc to be parsed by Stopping

  • pdf_models (list(n3fit.backends.MetaModel)) – list of pdf_models being trained

  • threshold_positivity (float) – maximum value allowed for the sum of all positivity losses

  • total_epochs (int) – total number of epochs

  • stopping_patience (int) – how many epochs to wait for the validation loss to improve

  • threshold_chi2 (float) – maximum value allowed for chi2

  • dont_stop (bool) – do not stop early (early stopping is ignored)

chi2exps_json(i_replica=0, log_each=100)[source]

Returns an apt-for-json dictionary with the status of the fit every log_each epochs

Parameters:
  • i_replica (int) – which replica are we writing the log for

  • log_each (int) – every how many epochs to print the log

Returns:

file_list – a list of strings to be printed as chi2exps.log

Return type:

list(str)

property e_best_chi2

Epoch of the best chi2; if there is no best epoch, returns the last one

evaluate_training(training_model)[source]

Given the training model, evaluates the model and parses the chi2 of the training datasets

Parameters:

training_model (n3fit.backends.MetaModel) – an object implementing the evaluate function

Returns:

tr_chi2 – chi2 of the given training_model

Return type:

float

make_stop()[source]

Convenience method to set the stop_now flag and reload the history to the point of the best model if any

monitor_chi2(training_info, epoch, print_stats=False)[source]

Function to be called at the end of every epoch. Stores the total chi2 of the training set as well as the total chi2 of the validation set. If the training chi2 is below a certain threshold, stores the state of the model which gave the minimum chi2 as well as the epoch in which it occurred. If the epoch is a multiple of save_all_each then we also save the per-exp chi2.

Returns True if the run seems ok and False if a NaN is found

Parameters:
  • training_info (dict) – output of a .fit() call, dictionary of the total loss (summed over replicas) for each experiment

  • epoch (int) – index of the epoch

Returns:

pass_ok – true/false according to the status of the run

Return type:

bool
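The interplay between the chi2 threshold, the best validation chi2 and the patience can be illustrated with a generic patience-based early-stopping sketch. `SketchStopping` is hypothetical and much simpler than the Stopping class documented here. It handles a single replica, takes plain floats, and ignores positivity.

```python
import math


class SketchStopping:
    """Hypothetical patience-based early stopping; not the n3fit Stopping class."""

    def __init__(self, stopping_patience, threshold_chi2=10.0):
        self.stopping_patience = stopping_patience
        self.threshold_chi2 = threshold_chi2
        self.best_vl_chi2 = math.inf
        self.e_best_chi2 = None
        self.count = 0
        self.stop_now = False

    def monitor_chi2(self, tr_chi2, vl_chi2, epoch):
        """Return False if a NaN is found, True otherwise."""
        if math.isnan(tr_chi2) or math.isnan(vl_chi2):
            self.stop_now = True
            return False
        if tr_chi2 < self.threshold_chi2 and vl_chi2 < self.best_vl_chi2:
            # New best validation chi2: remember it and reset the patience counter.
            self.best_vl_chi2 = vl_chi2
            self.e_best_chi2 = epoch
            self.count = 0
        else:
            self.count += 1
        if self.count >= self.stopping_patience:
            self.stop_now = True
        return True


stopper = SketchStopping(stopping_patience=2)
chi2_per_epoch = [(5.0, 4.0), (4.0, 3.0), (3.5, 3.2), (3.0, 3.4)]  # (training, validation)
for epoch, (tr, vl) in enumerate(chi2_per_epoch):
    stopper.monitor_chi2(tr, vl, epoch)
    if stopper.stop_now:
        break
```

In this toy run the validation chi2 stops improving after the second epoch, so the patience counter runs out two epochs later while the training chi2 is still decreasing.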

property positivity_status

Returns POS_PASS if positivity passes or veto if it doesn’t for each replica

print_current_stats(epoch, fitstate)[source]

Prints fitstate training and validation chi2s

property stop_epoch

Epoch in which the fit is stopped

stop_here()[source]

Returns the stopping status. If dont_stop is set, always returns False (i.e., never stop)

property vl_chi2

Current validation chi2

n3fit.stopping.parse_losses(history_object, data, suffix='loss')[source]

Receives an object containing the chi2. Usually a history object, but it can come in the form of a dictionary.

It loops over the dictionary and uses the npoints_data dictionary to normalize the chi2 and returns a tuple (total, tr_chi2)

Parameters:
  • history_object (dict) – A history object dictionary

  • data (dict) – dictionary with the name of the experiments to be taken into account and the number of datapoints of the experiments

  • suffix (str (default: loss)) – suffix of the loss layer, Keras default is _loss

Returns:

  • total_loss (float) – Total value for the loss

  • dict_chi2 (dict) – dictionary of {‘expname’ : loss }
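The normalization by number of points can be sketched as below. `sketch_parse_losses` is a hypothetical helper. It assumes that each loss is stored under `<experiment>_<suffix>` and that the total is the summed loss divided by the total number of points, which may differ from what parse_losses actually does.

```python
def sketch_parse_losses(history_object, data, suffix="loss"):
    """Hypothetical helper: normalize each '<name>_<suffix>' entry by its number of points."""
    dict_chi2 = {}
    total_points = 0
    total_loss = 0.0
    for exp_name, npoints in data.items():
        loss = history_object[f"{exp_name}_{suffix}"]
        dict_chi2[exp_name] = loss / npoints
        total_loss += loss
        total_points += npoints
    return total_loss / total_points, dict_chi2


# Made-up losses; the validation entry is ignored because of its different suffix.
history = {"EXP_A_loss": 120.0, "EXP_B_loss": 30.0, "EXP_A_val_loss": 50.0}
total, per_exp = sketch_parse_losses(history, {"EXP_A": 100, "EXP_B": 50})
```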

n3fit.stopping.parse_ndata(all_data)[source]

Parses the list of dictionaries received from ModelTrainer into a dictionary containing only the name of the experiments together with the number of points.

Returns:

  • tr_ndata – dictionary of {‘exp’ : ndata}

  • vl_ndata – dictionary of {‘exp’ : ndata}

  • pos_set – list of the names of the positivity sets

Note: if there is no validation (total number of val points == 0) then vl_ndata will point to tr_ndata
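The fallback described in the note can be sketched as follows. The helper name and the input keys (`name`, `ndata`, `ndata_vl`, `positivity`) are assumptions made for illustration. The dictionaries produced by ModelTrainer may be laid out differently.

```python
def sketch_parse_ndata(all_data):
    """Hypothetical helper; the keys 'name', 'ndata', 'ndata_vl', 'positivity' are assumed."""
    tr_ndata, vl_ndata, pos_set = {}, {}, []
    for entry in all_data:
        name = entry["name"]
        if entry.get("positivity"):
            pos_set.append(name)
            continue
        tr_ndata[name] = entry["ndata"]
        vl_ndata[name] = entry.get("ndata_vl", 0)
    if sum(vl_ndata.values()) == 0:
        # No validation points at all: vl_ndata points to the training counts.
        vl_ndata = tr_ndata
    return tr_ndata, vl_ndata, pos_set


all_data = [
    {"name": "EXP_A", "ndata": 80, "ndata_vl": 0},
    {"name": "POS_A", "positivity": True},
]
tr_ndata, vl_ndata, pos_set = sketch_parse_ndata(all_data)
```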

n3fit.stopwatch module

StopWatch module for computing the time performance of n3fit

class n3fit.stopwatch.StopWatch[source]

Bases: object

This class works as a stopwatch: upon initialization it will register the initialization time as start, and further times can be registered by running the .register_times(tag) method.

When the stopwatch is stopped (with the .stop() method) it will generate two dictionaries with the relative times between every registered time and the starting point.

get_times(tag=None)[source]

Return a tuple with the time registered for the given tag; defaults to the starting time

Parameters:

tag – if none, defaults to start_key

Return type:

(tag cpu time, tag wall time)

register_ref(tag, reference)[source]

Register an event named tag and register a request to compute also the time difference between this event and reference

register_times(tag)[source]

Register an event named tag

start_key = 'start'
stop()[source]

Stops the stopwatch and create the output dictionary

Returns:

dict_out – with all relative cpu and wall times

Return type:

a dictionary containing two dictionaries

n3fit.stopwatch.get_time()[source]

Returns the cputime and walltime. Note: only relative times make sense
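A stopwatch along the lines described in this module can be sketched with the standard library alone. `SketchStopWatch` and `sketch_get_time` are hypothetical names, and the output keys `cputime` and `walltime` are assumptions. The n3fit class may organize its output differently.

```python
import time


def sketch_get_time():
    """Return (cpu time, wall time); only differences between two calls are meaningful."""
    return time.process_time(), time.time()


class SketchStopWatch:
    """Hypothetical stopwatch; not the n3fit StopWatch class."""

    start_key = "start"

    def __init__(self):
        self._times = {self.start_key: sketch_get_time()}

    def register_times(self, tag):
        self._times[tag] = sketch_get_time()

    def stop(self):
        # Express every registered time relative to the starting point.
        start_cpu, start_wall = self._times[self.start_key]
        cpu = {tag: t[0] - start_cpu for tag, t in self._times.items()}
        wall = {tag: t[1] - start_wall for tag, t in self._times.items()}
        return {"cputime": cpu, "walltime": wall}


watch = SketchStopWatch()
watch.register_times("data_loaded")
dict_out = watch.stop()
```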

n3fit.version module

n3fit.vpinterface module

n3fit interface to validphys

Example

>>> import numpy as np
>>> from n3fit.vpinterface import N3PDF
>>> from n3fit.model_gen import pdfNN_layer_generator
>>> from validphys.pdfgrids import xplotting_grid
>>> fake_fl = [{'fl' : i, 'largex' : [0,1], 'smallx': [1,2]} for i in ['u', 'ubar', 'd', 'dbar', 'c', 'cbar', 's', 'sbar']]
>>> fake_x = np.linspace(1e-3,0.8,3)
>>> pdf_model = pdfNN_layer_generator(nodes=[8], activations=['linear'], seed=0, flav_info=fake_fl)
>>> n3pdf = N3PDF(pdf_model)
>>> res = xplotting_grid(n3pdf, 1.6, fake_x)
>>> res.grid_values.error_members().shape
(1, 8, 3)
class n3fit.vpinterface.N3LHAPDFSet(name, pdf_models, Q=1.65)[source]

Bases: LHAPDFSet

Extension of LHAPDFSet using n3fit models

grid_values(flavours, xarr, qmat=None)[source]
Parameters:
  • flavours (numpy.ndarray) – flavours to compute

  • xarr (numpy.ndarray) – x-points to compute, dim: (xgrid_size,)

  • qmat (numpy.ndarray) – q-points to compute (not used by n3fit, used only for shaping purposes)

Returns:

array of shape (replicas, flavours, xgrid_size, qmat) with the values of the pdf_model(s) evaluated in xarr

Return type:

numpy.ndarray
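The output shape can be illustrated with a toy sketch. It assumes each model is a plain callable mapping an x-grid of shape (xgrid_size,) to values of shape (xgrid_size, flavours). That is an illustrative stand-in for a MetaModel. `sketch_grid_values` is a hypothetical helper.

```python
import numpy as np


def sketch_grid_values(pdf_callables, xarr, qmat):
    """Hypothetical helper returning shape (replicas, flavours, xgrid_size, len(qmat))."""
    per_replica = []
    for pdf in pdf_callables:
        values = np.asarray(pdf(xarr))  # assumed shape: (xgrid_size, flavours)
        per_replica.append(values.T)  # -> (flavours, xgrid_size)
    result = np.stack(per_replica)  # -> (replicas, flavours, xgrid_size)
    # The toy PDFs do not depend on Q, so the values are repeated along the last axis.
    return np.repeat(result[..., np.newaxis], len(qmat), axis=-1)


def toy_pdf(x):
    # Three made-up "flavours", each a multiple of x.
    return np.outer(x, np.arange(1, 4))


xarr = np.linspace(1e-3, 0.8, 5)
grid = sketch_grid_values([toy_pdf, toy_pdf], xarr, qmat=np.array([1.65]))
```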

xfxQ(x, Q, n, fl)[source]

Return the value of the PDF member for the given value in x

class n3fit.vpinterface.N3PDF(pdf_models, fit_basis=None, name='n3fit', Q=1.65)[source]

Bases: PDF

Creates an N3PDF object, an extension of the validphys PDF object, to perform calculations with an n3fit-generated model.

Parameters:
  • pdf_models (n3fit.backends.MetaModel (or list thereof)) – PDF trained with n3fit, x -> f(x)_{i} where i are the flavours in the evol basis

  • fit_basis (list(dict)) – basis of the training, used for reporting

  • name (str) – name of the N3PDF object

get_nn_weights()[source]

Outputs all weights of the NN as numpy.ndarrays

get_preprocessing_factors(replica=None)[source]

Loads the preprocessing alpha and beta arrays from the PDF trained model. If a fit_basis in the format of n3fit runcards is given, it will be used to generate a new dictionary with the names, the exponents and whether they are trainable; otherwise it outputs an Nx2 array where [:,0] are the alphas and [:,1] the betas

load()[source]

If the function needs an LHAPDF object, return a N3LHAPDFSet

class n3fit.vpinterface.N3Stats(data)[source]

Bases: MCStats

The PDFs from n3fit are MC PDFs; however, since there is no grid, the CV has to be computed manually

central_value()[source]
error_members()[source]
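Computing the central value manually for a Monte Carlo set amounts to averaging over the replicas. The sketch below shows that idea on made-up data, assuming the replica index is the first axis. It does not use the N3Stats class itself.

```python
import numpy as np

# Toy data with shape (replicas, flavours, xgrid_size); the values are made up.
data = np.array([[[1.0, 2.0]], [[3.0, 4.0]]])

central_value = data.mean(axis=0)  # average over the replica axis
error_members = data  # every replica is an error member
```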
n3fit.vpinterface.compute_arclength(self, q0=1.65, basis='evolution', flavours=None)[source]

Given the layer with the fit basis computes the arc length using the corresponding validphys action

Parameters:
  • pdf_function (function) – pdf function as received by the writer or pdf_model

  • q0 (float) – energy at which the arc length is computed

  • basis (str) – basis in which to compute the arc length

  • flavours (list) – output flavours

Example

>>> from n3fit.vpinterface import N3PDF, compute_arclength
>>> from n3fit.model_gen import pdfNN_layer_generator
>>> fake_fl = [{'fl' : i, 'largex' : [0,1], 'smallx': [1,2]} for i in ['u', 'ubar', 'd', 'dbar', 'c', 'g', 's', 'sbar']]
>>> pdf_model = pdfNN_layer_generator(nodes=[8], activations=['linear'], seed=0, flav_info=fake_fl, fitbasis="FLAVOUR")
>>> n3pdf = N3PDF(pdf_model)
>>> res = compute_arclength(n3pdf)
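For reference, the arc length of a curve f(x) between a and b is L = ∫ sqrt(1 + (df/dx)^2) dx. The sketch below approximates it with a polyline on a sampled grid. It is a generic numerical illustration with a hypothetical helper name, and the validphys action may use a different grid, integration scheme or normalization.

```python
import numpy as np


def sketch_arc_length(x, f):
    """Hypothetical helper: polyline approximation of the arc length of f sampled on x."""
    dx = np.diff(x)
    df = np.diff(f)
    return float(np.sum(np.sqrt(dx**2 + df**2)))


x = np.linspace(0.0, 1.0, 101)
length = sketch_arc_length(x, x)  # straight line from (0, 0) to (1, 1)
```

For the straight line used here the polyline is exact up to rounding, so the result can be compared against the square root of 2.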
n3fit.vpinterface.integrability_numbers(n3pdf, q0=1.65, flavours=None)[source]

Compute the integrability numbers for the current PDF using the corresponding validphys action

Parameters:
  • q0 (float) – energy at which the integrability is computed

  • flavours (list) – flavours for which the integrability is computed

Returns:

Value for the integrability for each of the flavours

Return type:

np.array(float)

Example

>>> from n3fit.vpinterface import N3PDF, integrability_numbers
>>> from n3fit.model_gen import pdfNN_layer_generator
>>> fake_fl = [{'fl' : i, 'largex' : [0,1], 'smallx': [1,2]} for i in ['u', 'ubar', 'd', 'dbar', 'c', 'g', 's', 'sbar']]
>>> pdf_model = pdfNN_layer_generator(nodes=[8], activations=['linear'], seed=0, flav_info=fake_fl, fitbasis="FLAVOUR")
>>> n3pdf = N3PDF(pdf_model)
>>> res = integrability_numbers(n3pdf)

Module contents