.. _n3fit-usage: How to run a PDF fit ==================== The user should perform the steps documented below in order to obtain a complete PDF fit using the latest release of the NNPDF fitting code: ``n3fit``. The fitting methodology is detailed in :ref:`methodology`. The three main points in this tutorial are: - :ref:`Preparing a fit runcard <prepare-fits>` - :ref:`Running the fitting code <run-n3fit-fit>` - :ref:`Upload and analyse the fit <upload-fit>` - :ref:`advance-run-fit` .. _prepare-fits: Preparing a fit runcard ----------------------- The runcard is written in YAML. The runcard is the unique identifier of a fit and contains all required information to perform and reproduce a fit, which includes the experimental data, the theory setup and the fitting setup. A detailed explanation on the parameters accepted by the ``n3fit`` runcards can be found in the :ref:`n3fit detailed guide <runcard-detailed>`. For newcomers, it is recommended to start from an already existing runcard, example runcards (and runcard used in NNPDF releases) are available at `n3fit/runcards <https://github.com/NNPDF/nnpdf/tree/master/n3fit/runcards>`_. .. note:: While we aim for the code to be both backwards and forwards compatible with respect to runcards, by setting sensible defaults when introducing new features, make sure that you are using a runcard tagged with the same version of the code you are using to avoid any surprises. The runcards are mostly self explanatory, see for instance below an example of the ``parameter`` dictionary that defines the Machine Learning methodology. .. code:: yaml # runcard example ... parameters: nodes_per_layer: [15, 10, 8] activation_per_layer: ['sigmoid', 'sigmoid', 'linear'] initializer: 'glorot_normal' optimizer: optimizer_name: 'RMSprop' learning_rate: 0.01 clipnorm: 1.0 epochs: 900 positivity: multiplier: 1.05 threshold: 1e-5 stopping_patience: 0.30 # Ratio of the number of epochs layer_type: 'dense' dropout: 0.0 ... The runcard system is designed such that the user can utilize the program without having to tinker with the codebase. One can simply modify the options in ``parameters`` to specify the desired architecture of the Neural Network as well as the settings for the optimization algorithm. .. _run-n3fit-fit: Running the fitting code ------------------------ After successfully installing the ``n3fit`` package and preparing a runcard following the points presented above you can proceed with a fit. 1. Prepare the fit using ``vp-setupfit``. This command will generate a folder with the same name as the runcard (minus the file extension) in the current directory, which will contain a copy of the original YAML runcard. The required resources will be downloaded, which includes: - The t0 PDF set (an LHAPDF object). - The FastKernel tables in the form of a ``theory_xxx.tgz`` file - The postfit evolution operator ``EKO.tar``. If the runcard requires to precompute some heavy objects shared among replicas, such as the theory covariance matrix, it will be done during this step. :: vp-setupfit <runcard>.yml 2. Run the fit using ``n3fit``. The ``n3fit`` program takes a ``runcard.yml`` as input and a replica number, e.g. :code:`n3fit runcard.yml replica` where ``replica`` goes from 1-n where n is the maximum number of desired replicas. Note that if you desire, for example, a 100 replica fit you should launch more than 100 replicas (e.g. 120) since not all of them will necessarily converge. While by default the code runs each replica separately, it is possible to run many replicas in parallel, see :ref:`parallel-label`. :: for i in {1..120} ; do n3fit <runcard>.yml $i done 3. Once all replicas have finished, you need to run the ``evolven3fit`` program in order to evolve the PDF from the fitting scale to the whole range of scales needed to create an LHAPDF grid. This is done using the EKO library to perform DGLAP evolution. :: evolven3fit evolve <runcard> 4. Finally, use ``postfit`` to finalize the PDF set by applying post selection criteria and compute the central replica. This will produce a set of ``number_of_replicas`` error replicas and one mean replica for a total of ``number_of_replicas+1``. The number of replicas should be that which you desire in the final fit (e.g., 100). Note that the standard behaviour of ``postfit`` can be modified by using various flags. More information can be found at :ref:`Processing a fit <postfit>`. :: postfit <number of desired replicas> <runcard> Output of the fit ~~~~~~~~~~~~~~~~~ Every time a replica is finalized, the output is saved to the ```runcard/nnfit/replica_$replica``` folder, which contains a number of files: - ``chi2exps.log``: a json log file with the χ² of the training every 100 epochs. - ``runcard.exportgrid``: a file containing the PDF grid. - ``runcard.json``: Includes information about the fit (metadata, parameters, times) in json format. .. note:: The reported χ² refers always to the actual χ², i.e., without positivity loss or other penalty terms. .. _upload-fit: Upload and analyse the fit -------------------------- After obtaining the fit you can proceed with the fit upload and analysis by: 1. *For members of NNPDF*, it is possible to upload the results to the nnpdf server using ``vp-upload runcard_folder`` then install the fitted set with ``vp-get fit fit_name``. Otherwise, copy or link to the results to the ``share/NNPDF/results/`` folder (usually under ``~/.local/share`` or ``${CONDA_PREFIX}/share/``. 2. It is recommended to iterate a fit to achieve a higher degree of convergence/stability in the fit. To read more about this, see :ref:`How to run an iterated fit <run-iterated-fit>`. 3. Analysing the results with ``validphys``, see the :ref:`vp-guide <vp-index>`. Consider using the ``vp-comparefits`` tool. .. _advance-run-fit: Advanced topics --------------- Fit performance ~~~~~~~~~~~~~~~ The ``n3fit`` framework is currently based on `Keras <https://keras.io/>`_ and it is tested to run with the `Tensorflow <https://www.tensorflow.org/>`_ and `pytorch <https://pytorch.org>`_ backends. This also means that anything that make any of these packages faster will also make ``n3fit`` faster. Note that at the time of writing, ``TensorFlow`` is approximately 4 times faster than ``pytorch``. The default backend for ``keras`` is ``tensorflow``. In order to change the backend, the environment variable ``KERAS_BACKENDD`` need to be set (e.g., ``KERAS_BACKEND=torch``). The best results are obtained with ``tensorflow[and-cuda]`` installed from pip and running ``n3fit`` in GPU, see :ref:`parallel-label`. QED fit ~~~~~~~ In order to run a QED fit see :ref:`How to run a QED fit <run-qed-fit>`. Hyperparameter optimization ~~~~~~~~~~~~~~~~~~~~~~~~~~~ An important feature of ``n3fit`` is the ability to perform :ref:`hyperparameter scans <hyperoptimization>`, for this we have also introduced a ``hyperscan_config`` key which specifies the trial ranges for the hyperparameter scan procedure. See the following self-explanatory example: .. code:: yaml hyperscan_config: stopping: # setup for stopping scan min_epochs: 5e2 # minimum number of epochs max_epochs: 40e2 # maximum number of epochs min_patience: 0.10 # minimum stop patience max_patience: 0.40 # maximum stop patience positivity: # setup for the positivity scan min_multiplier: 1.04 # minimum lagrange multiplier coeff. max_multiplier: 1.1 # maximum lagrange multiplier coeff. min_initial: 1.0 # minimum initial penalty max_initial: 5.0 # maximum initial penalty optimizer: # setup for the optimizer scan - optimizer_name: 'Adadelta' learning_rate: min: 0.5 max: 1.5 - optimizer_name: 'Adam' learning_rate: min: 0.5 max: 1.5 architecture: # setup for the architecture scan initializers: 'ALL' # Use all implemented initializers from keras max_drop: 0.15 # maximum dropout probability n_layers: [2,3,4] # number of layers min_units: 5 # minimum number of nodes max_units: 50 # maximum number of nodes activations: ['sigmoid', 'tanh'] # list of activation functions It is also possible to take the configuration of the hyperparameter scan from a previous run in the NNPDF server by using the key ``from_hyperscan``: .. code:: yaml hyperscan_config: from_hyperscan: 'some_previous_hyperscan' or to directly take the trials from said hyperscan: .. code:: yaml hyperscan_config: use_tries_from: 'some_previous_hyperscan' If you are planning to perform a hyperparameter scan just perform exactly the same steps as in :ref:`run-n3fit-fit` by adding the ``--hyperopt number_of_trials`` argument to ``n3fit``, where ``number_of_trials`` is the maximum allowed value of trials required by the fit. Usually when running hyperparameter scan we switch-off the MC replica generation so different replicas will correspond to different initial points for the scan, this approach provides faster results. We provide the ``vp-hyperoptplot`` script to analyse the output of the hyperparameter scan.