Neural network training

tags: Neural networks, Machine learning, Optimization

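Neural networks are commonly trained with backpropagation, which computes the gradient of the cost function with respect to the parameters, combined with a gradient-based optimizer such as stochastic gradient descent.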

Neural network training as development in program space

A neural network as a whole can be seen as a dynamical system. Its state is the collection of its parameters, and its evolution function is the optimization step taken when training the network.

Concretely, the network has parameters \(\theta_t\) at training step \(t\), which form its state. In standard supervised learning, the chosen optimization algorithm updates the parameters using a batch of training pairs \((\bm{X}_t, \bm{Y}_t)\): \(\theta_{t+1} = F(\theta_t, \bm{X}_t, \bm{Y}_t)\), where \(F\) denotes the optimizer's update. This update rule is the evolution function that changes the state of the dynamical system at each training step.
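The update rule can be made concrete with a minimal sketch. It assumes a one-dimensional linear model \(y = wx + b\), a mean squared error cost and plain gradient descent; the names `step` and `train`, the learning rate and the toy data are all illustrative choices, not part of the original note.

```python
def step(theta, batch, lr=0.1):
    """Evolution function: map the state theta_t to theta_{t+1} using one batch."""
    w, b = theta
    xs, ys = batch
    n = len(xs)
    # Gradients of the mean squared error of the model y = w * x + b.
    grad_w = sum(2.0 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    grad_b = sum(2.0 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return (w - lr * grad_w, b - lr * grad_b)


def train(theta0, batches, lr=0.1):
    """Iterate the evolution function and record the trajectory of states."""
    states = [theta0]
    for batch in batches:
        states.append(step(states[-1], batch, lr))
    return states


# Toy data generated by y = 2x + 1 (assumed for illustration only).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [2.0 * x + 1.0 for x in xs]
batches = [(xs, ys)] * 500  # the same full batch at every step

states = train((0.0, 0.0), batches)
w_final, b_final = states[-1]
print(w_final, b_final)
```

The list `states` is the trajectory of the dynamical system: each entry is the parameter state after one more application of `step`.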

In such a framework, the goal of training the neural network is to reach a form of attractor: a state (set of parameters) that further optimization steps no longer change.
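As a rough illustration of this stopping condition, the sketch below iterates an update until the state stops changing. The scalar cost \((\theta - 3)^2\), the tolerance and the helper `run_to_fixed_point` are hypothetical choices; with noisy mini-batches the state would typically only settle approximately.

```python
def step(theta, lr=0.1):
    """Gradient descent step on the hypothetical cost (theta - 3)^2."""
    return theta - lr * 2.0 * (theta - 3.0)


def run_to_fixed_point(step_fn, theta0, tol=1e-10, max_steps=10_000):
    """Apply step_fn until one more step changes the state by less than tol."""
    theta = theta0
    for t in range(max_steps):
        new_theta = step_fn(theta)
        if abs(new_theta - theta) < tol:
            return new_theta, t + 1
        theta = new_theta
    return theta, max_steps


theta_star, n_steps = run_to_fixed_point(step, theta0=0.0)
print(theta_star, n_steps)
```

At `theta_star`, applying `step` again leaves the state essentially unchanged, which is the sense in which it acts as an attractor of the training dynamics.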

This attractor should correspond to useful functional properties for the network, as measured by a cost function. Meta-learning can be used to learn the evolution function itself, or the initial state it starts from, so that the dynamical system converges to better attractors in the fewest steps; (Tancik et al. 2021) take the second route and learn the initialization.
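One way to picture learning the starting state is a first-order, Reptile-style update on toy tasks. This is a sketch under strong assumptions: each task is a scalar quadratic cost \((\theta - c)^2\), the inner loop is a few gradient descent steps, and the names `adapt`, `meta_learn_init` and `post_adaptation_cost` are hypothetical. It is not claimed to reproduce the exact procedure of the cited paper.

```python
def adapt(theta, target, k=3, lr=0.1):
    """Inner loop: k gradient descent steps on the task cost (theta - target)^2."""
    for _ in range(k):
        theta = theta - lr * 2.0 * (theta - target)
    return theta


def meta_learn_init(targets, theta0=0.0, meta_steps=100, meta_lr=0.5):
    """Outer loop: move the initial state toward the adapted states (Reptile-style)."""
    for _ in range(meta_steps):
        adapted = [adapt(theta0, c) for c in targets]
        mean_shift = sum(a - theta0 for a in adapted) / len(targets)
        theta0 = theta0 + meta_lr * mean_shift
    return theta0


def post_adaptation_cost(theta0, targets):
    """Average task cost after running the inner loop from theta0."""
    return sum((adapt(theta0, c) - c) ** 2 for c in targets) / len(targets)


targets = [1.0, 2.0, 3.0]  # toy task family, assumed for illustration
learned_init = meta_learn_init(targets)
cost_learned = post_adaptation_cost(learned_init, targets)
cost_zero = post_adaptation_cost(0.0, targets)
print(learned_init, cost_learned, cost_zero)
```

With the same small budget of inner steps, starting from the learned initial state leaves a lower average cost than starting from zero, which is the "better attractor in fewer steps" effect in miniature.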

Program evolution

A neural network is a program, an algorithm. Its parameters specify a sequence of steps from input data to output prediction. Training a neural network is like moving through the space of algorithms towards programs that perform better according to a given cost function.
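A small sketch of the "parameters as program" view: the same single-neuron architecture computes different Boolean functions depending on its parameters. The threshold neuron and the two hand-picked parameter settings are illustrative assumptions, not taken from the original note.

```python
def neuron(params, x1, x2):
    """A single threshold neuron: the parameters decide which function it computes."""
    w1, w2, b = params
    return 1 if w1 * x1 + w2 * x2 + b > 0 else 0


def truth_table(params):
    """Evaluate the neuron on all four Boolean input pairs."""
    return [neuron(params, a, b) for a in (0, 1) for b in (0, 1)]


# Two points in parameter space, i.e. two different programs.
AND_PARAMS = (1.0, 1.0, -1.5)
OR_PARAMS = (1.0, 1.0, -0.5)

and_table = truth_table(AND_PARAMS)
or_table = truth_table(OR_PARAMS)
print(and_table, or_table)
```

Moving the bias from -1.5 to -0.5 turns the AND program into the OR program, so a training trajectory in parameter space is also a trajectory through a space of programs.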

Bibliography

Matthew Tancik, Ben Mildenhall, Terrance Wang, Divi Schmidt, Pratul P. Srinivasan, Jonathan T. Barron, Ren Ng. March 23, 2021. "Learned Initializations for Optimizing Coordinate-based Neural Representations". arXiv:2012.02189 [cs]. http://arxiv.org/abs/2012.02189.