Neural network training

Neural networks, Machine learning, Optimization

Neural network training as development in program space

A neural network as a whole can be seen as a dynamical system. Its state is the collection of its parameters, and its evolution function is the optimization step taken when training the network.

At training step \(t\), the network has parameters \(\theta_t\), which form its state. In standard supervised learning, the chosen optimization algorithm computes \(\theta_{t+1}\) from \(\theta_t\) and a batch of training pairs \((\bm{X}_t, \bm{Y}_t)\). This update rule is the evolution function of the dynamical system: it changes the state at each training step.
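The update rule can be sketched as a function from one state to the next. The example below is a minimal illustration, not a reference implementation: it assumes a two-parameter linear model, a mean squared error cost, plain stochastic gradient descent, and synthetic noiseless data. The names `sgd_step`, `sample_batch`, and `train` are hypothetical.

```python
import random


def sgd_step(theta, batch, lr=0.1):
    """Evolution function: map the state theta_t to theta_{t+1} using one batch."""
    w, b = theta
    xs, ys = batch
    n = len(xs)
    # Gradients of the mean squared error for the assumed model y = w * x + b.
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return (w - lr * grad_w, b - lr * grad_b)


def sample_batch(rng, size=16):
    """Draw training pairs (X_t, Y_t) from an assumed target y = 3x + 1."""
    xs = [rng.uniform(-1.0, 1.0) for _ in range(size)]
    ys = [3.0 * x + 1.0 for x in xs]
    return xs, ys


def train(theta0, steps=500, seed=0):
    """Iterate the evolution function and record the trajectory of states."""
    rng = random.Random(seed)
    trajectory = [theta0]
    for _ in range(steps):
        trajectory.append(sgd_step(trajectory[-1], sample_batch(rng)))
    return trajectory


trajectory = train((0.0, 0.0))
final_w, final_b = trajectory[-1]
print(final_w, final_b)
```

Here `trajectory` is the orbit of the dynamical system: a sequence of states, each obtained by applying the same update rule to the previous one with a fresh batch.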

In this framework, the goal of training is to reach a form of attractor: a state (set of parameters) that further optimization steps no longer change.
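A short worked condition makes this concrete. Write the update rule as \(\theta_{t+1} = F(\theta_t, \bm{X}_t, \bm{Y}_t)\), where the symbol \(F\) is introduced here only for illustration. Assuming plain gradient descent with learning rate \(\eta > 0\) on a cost \(\mathcal{L}\) over a fixed training set \((\bm{X}, \bm{Y})\), a state \(\theta^*\) is left unchanged by further steps exactly when the gradient vanishes:

\[
\theta^* = F(\theta^*, \bm{X}, \bm{Y}) = \theta^* - \eta \, \nabla_\theta \mathcal{L}(\theta^*; \bm{X}, \bm{Y})
\quad \Longleftrightarrow \quad
\nabla_\theta \mathcal{L}(\theta^*; \bm{X}, \bm{Y}) = 0.
\]

Under these assumptions, the fixed points are the stationary points of the cost. Only the stable ones, such as local minima, act as attractors. With batches that change at every step, the condition would generally hold only on average.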

This attractor should correspond to useful functional properties of the network, as measured by a cost function. Meta-learning can be used to learn the evolution function itself, so that the dynamical system converges to better attractors in the fewest steps.
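As a deliberately crude stand-in for learning the evolution function, the sketch below searches over a single parameter of the update rule, the learning rate, and keeps the value that reaches the attractor in the fewest steps. It assumes a one-dimensional cost \(\theta^2\) and a small hand-picked candidate list. Actual meta-learning methods are far richer than this grid search, and the name `steps_to_converge` is hypothetical.

```python
def steps_to_converge(lr, theta0=5.0, tol=1e-6, max_steps=10_000):
    """Count gradient steps on the cost theta**2 until |theta| < tol."""
    theta = theta0
    for step in range(max_steps):
        if abs(theta) < tol:
            return step
        theta = theta - lr * 2.0 * theta  # the gradient of theta**2 is 2 * theta
    return max_steps


# Each learning rate defines a different evolution function on the same state space.
candidate_lrs = [0.01, 0.05, 0.1, 0.3, 0.5, 0.9]
steps = {lr: steps_to_converge(lr) for lr in candidate_lrs}

# Keep the evolution function that reaches the attractor in the fewest steps.
best_lr = min(steps, key=steps.get)
print(best_lr, steps[best_lr])
```

All candidates converge to the same attractor, \(\theta = 0\), so in this toy setting the search only improves the number of steps, not the quality of the attractor.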

Program evolution

A neural network is a program: its parameters specify a sequence of steps from input data to output prediction. Training a neural network is like moving through the space of algorithms toward programs that perform better according to a given cost function.


