/MLP

Multi-Layer Perceptron (MLP).

Neural networks in general, and the Multi-Layer Perceptron (MLP) in particular, are now widely used in several fields, for example:

- in industry, for automatic process control, quality control and optimisation of resource allocation.
- in medicine, for image analysis and diagnostic support.
- in meteorology, for weather forecasting.
- ...

In Particle Physics, they are commonly used, mainly for off-line classification tasks (particle identification, event classification, search for new physics). They are also used for track reconstruction or for online triggering.

The Multi-Layer Perceptron is the most widely used type of neural network. It is both simple and based on solid mathematical grounds. Input quantities are processed through successive layers of "neurons". There is always an input layer, with a number of neurons equal to the number of variables of the problem, and an output layer, where the Perceptron response is made available, with a number of neurons equal to the desired number of quantities computed from the inputs (very often only one). The layers in between are called "hidden" layers. With no hidden layer, the perceptron can only perform linear tasks (for example a linear discriminant analysis, which is already useful). All problems which can be solved by a Perceptron can be solved with only one hidden layer, but it is sometimes more efficient to use 2 hidden layers.

Each neuron of a layer other than the input layer first computes a linear combination of the outputs of the neurons of the previous layer, plus a bias. The coefficients of the linear combinations plus the biases are called the weights. They are usually determined from examples so as to minimise, over the set of examples, the (Euclidean) norm of the difference between the desired output and the network output.

Neurons in the hidden layer then compute a non-linear function of their input. In MLPfit, the non-linear function is the sigmoid function y(x) = 1/(1+exp(-x)). The output of each output neuron is equal to its linear combination. Thus, a Multi-Layer Perceptron with 1 hidden layer basically computes a linear combination of sigmoid functions of the inputs. A linear combination of sigmoids is useful because of the following two theorems:

- a linear combination of sigmoids can approximate any continuous function of one or more variables. This is useful to obtain a continuous function fitting a finite set of points when no underlying model is available.
- trained with a desired answer = 1 for signal and 0 for background, the approximated function is the probability of signal given the input values. This second theorem is the basis of all classification applications.
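The computation described above (sigmoid hidden neurons followed by linear output neurons) can be sketched in a few lines of Python. This is only an illustration, not the MLPfit code: the function names, the argument layout and the numerical values are assumptions made for the example.

```python
import math

def sigmoid(x):
    """Sigmoid activation y(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def mlp_forward(inputs, hidden_weights, hidden_biases, output_weights, output_biases):
    """Response of a perceptron with one hidden layer.

    hidden_weights[j][i] multiplies input i in hidden neuron j;
    output_weights[k][j] multiplies hidden neuron j in output neuron k.
    (Hypothetical layout, chosen for readability.)
    """
    # Hidden neurons: sigmoid of (linear combination of the inputs + bias)
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    # Output neurons: linear combination of the hidden outputs + bias
    return [
        sum(w * h for w, h in zip(row, hidden)) + b
        for row, b in zip(output_weights, output_biases)
    ]

# Hypothetical 2-2-1 network: hidden weights and biases are all zero,
# so both hidden neurons output sigmoid(0) = 0.5.
out = mlp_forward(
    inputs=[0.5, -1.0],
    hidden_weights=[[0.0, 0.0], [0.0, 0.0]],
    hidden_biases=[0.0, 0.0],
    output_weights=[[1.0, 1.0]],
    output_biases=[0.25],
)
```

Here the single output is 0.5 + 0.5 + 0.25 = 1.25. Training consists of adjusting all the weights and biases so that such outputs approach the desired answers.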

The Multi-Layer Perceptron interface in PAW:

- can be used for both approximation and classification tasks.
- provides efficient minimisation methods to determine the weights.
- allows the neural network to be defined, trained and used interactively.

More precisely, it is possible to:

- define the network structure
- modify the default learning parameters
- read/write weight files
- define the examples from ASCII files, histograms or Ntuples. When examples are defined from Ntuples, selection criteria may be added
- train the network and follow the learning curve while training
- plot the Perceptron function in case of 1d or 2d fits, write out the function for use in any other code.

/MLP/CREATE nin [ nhid1 nhid2 nout ]

NIN I Number of neurons in the input layer R=1:100
NHID1 I Number of neurons in the first hidden layer D=10 R=0:100
NHID2 I Number of neurons in the second hidden layer D=0 R=0:100
NOUT I Number of neurons in the output layer D=1 R=1:100

Creates a Neural Network.

Example:

PAW > mlp/create 2 4

creates a neural network with 2 inputs, 4 hidden neurons and one output.

PAW > mlp/create 2 4 ! 2

creates a neural network with 2 inputs, 4 hidden neurons and two outputs.
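To get a feeling for the size of the minimisation problem, the number of weights (coefficients plus biases) implied by a given structure can be counted as in the sketch below. The function is hypothetical and not part of PAW. It assumes that a hidden layer with 0 neurons is simply absent, as the examples above suggest, and its default values mirror the D= values of the parameter table.

```python
def count_weights(nin, nhid1=10, nhid2=0, nout=1):
    """Number of weights (coefficients plus biases) of the network.

    Assumption: a layer declared with 0 neurons is skipped.
    """
    layers = [n for n in (nin, nhid1, nhid2, nout) if n > 0]
    # Each neuron has one coefficient per neuron of the previous layer, plus a bias
    return sum((prev + 1) * cur for prev, cur in zip(layers, layers[1:]))

n_first = count_weights(2, 4)          # structure of "mlp/create 2 4"
n_second = count_weights(2, 4, 0, 2)   # structure of "mlp/create 2 4 ! 2"
```

With these assumptions the first example has (2+1)*4 + (4+1)*1 = 17 weights and the second has (2+1)*4 + (4+1)*2 = 22.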

/MLP/STATUS

Prints the status of MLP package.

The parameters printed are: size of the network, learning method and parameters, number of examples loaded...

/MLP/LMET lmet [ par1 par2 par3 ]

LMET I Learning Method R=1:7
PAR1 R First parameter D=-999.
PAR2 R Second parameter D=-999.
PAR3 R Third parameter D=-999.

Set learning method and parameters.

The following methods are available:

- 1: stochastic minimization (often wrongly called "standard online back propagation")
- 2: steepest descent with fixed steps ("batch back propagation")
- 3: steepest descent with line search
- 4: conjugate gradients with Polak-Ribiere updating formula
- 5: conjugate gradients with Fletcher-Reeves updating formula
- 6: Broyden-Fletcher-Goldfarb-Shanno (BFGS) method
- 7: Hybrid linear-BFGS method

For methods 1 and 2:

- PAR1 = learning parameter (default 0.1),
- PAR2 = momentum term (default 0.),
- PAR3 = decay factor (default 1.).
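The role of these three parameters can be illustrated with a generic fixed-step descent on a one-dimensional function. This is a sketch under assumptions, not the MLPfit implementation: in particular, it assumes that the decay factor multiplies the learning parameter once per epoch, which is not stated in this help text. The function name and the test function are hypothetical.

```python
def descend(gradient, w, learning_parameter=0.1, momentum=0.0, decay=1.0, nepoch=100):
    """Fixed-step gradient descent with a momentum term.

    Assumption: the learning parameter is multiplied by `decay` after each epoch.
    """
    step = 0.0
    eta = learning_parameter
    for _ in range(nepoch):
        # New step = gradient step + fraction of the previous step
        step = -eta * gradient(w) + momentum * step
        w += step
        eta *= decay
    return w

# Hypothetical error function E(w) = (w - 3)^2, with gradient 2 (w - 3)
w_min = descend(lambda w: 2.0 * (w - 3.0), w=0.0)
```

With the default values (0.1, 0., 1.) the distance to the minimum at w = 3 shrinks by a factor 0.8 at each epoch.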

For methods 3 -> 6:

- PAR1 = reset frequency (default = 1000 epochs),
- PAR2 = tau value for line search (default = 1.5)
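For the conjugate gradient methods (4 and 5), the two updating formulas and the effect of a periodic reset can be sketched as follows. These are the textbook Polak-Ribiere and Fletcher-Reeves expressions. The reset rule (restart along the steepest descent direction every `reset_frequency` epochs) and all function names are assumptions made for the illustration, and the line search itself (PAR2) is not modelled.

```python
def dot(u, v):
    """Scalar product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))

def beta_polak_ribiere(g_new, g_old):
    """Polak-Ribiere coefficient: g_new.(g_new - g_old) / g_old.g_old"""
    return dot(g_new, [a - b for a, b in zip(g_new, g_old)]) / dot(g_old, g_old)

def beta_fletcher_reeves(g_new, g_old):
    """Fletcher-Reeves coefficient: g_new.g_new / g_old.g_old"""
    return dot(g_new, g_new) / dot(g_old, g_old)

def next_direction(g_new, g_old, d_old, epoch,
                   reset_frequency=1000, beta=beta_polak_ribiere):
    """New search direction from the new gradient and the previous direction.

    Assumption: on a reset epoch the direction is simply minus the gradient.
    """
    if epoch % reset_frequency == 0:
        return [-g for g in g_new]
    b = beta(g_new, g_old)
    return [-g + b * d for g, d in zip(g_new, d_old)]

# Hypothetical gradients at two successive epochs
g_old, g_new, d_old = [1.0, 0.0], [1.0, 2.0], [-1.0, 0.0]
d_pr = next_direction(g_new, g_old, d_old, epoch=1)
d_reset = next_direction(g_new, g_old, d_old, epoch=1000)
```

The weights are then moved along the returned direction by the line search.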

For method 7: in addition to the parameters used by methods 3 -> 6, PAR3 = regularisation term (default = 1)

By default, MLPfit uses the BFGS learning method, which is stable and probably efficient enough for most applications.

/MLP/RESET

Reset the neural network.

Resets everything concerning the neural net to 0 and frees the memory.

/MLP/LEARN nepoch [ chopt filename ]

NEPOCH I Number of epochs
CHOPT C Options D=' '
FILENAME C Name of the MLP function D='pawmlp.f'

CHOPT:

'+' Start from previous weights (by default start from random weights)
'I' Use a different set of random starting weights
'Q' Quiet mode
'N' No drawing: the learning curve is not displayed
'B' Keep the weights that gave the smallest error on the test sample

Train the Neural Network for NEPOCH epochs.

The learning curve is saved in histogram 2000000 (which is reset if it already exists). If a test file is also used, the error curve on the test examples is stored in histogram 2000001.
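The idea behind option 'B' can be illustrated on a test-sample error curve: the weights retained are those of the epoch where that curve is lowest, rather than those of the last epoch. The sketch below only shows this selection. The function name and the error values are hypothetical, and it does not read the PAW histograms.

```python
def best_epoch(test_errors):
    """Index (starting at 0) of the epoch with the smallest test-sample error."""
    return min(range(len(test_errors)), key=test_errors.__getitem__)

# Hypothetical error on the test examples after each epoch
test_errors = [0.40, 0.31, 0.27, 0.29, 0.33]
best = best_epoch(test_errors)
```

In this made-up curve the error starts to rise again after the third epoch, so the weights of that epoch would be the ones kept.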