Mean Field Analysis of Neural Networks

Justin Sirignano, Konstantinos Spiliopoulos

Year: 2018
Citations: 32

Abstract

Machine learning has revolutionized fields such as image, text, and speech recognition. There is also growing interest in applying machine learning and deep learning ideas in engineering, robotics, biotechnology, and finance. Despite the immense practical success of neural networks, mathematical understanding of them remains limited. We mathematically study neural networks in the asymptotic regime of simultaneously (A) large network sizes and (B) large numbers of stochastic gradient descent training iterations. We rigorously prove that the empirical distribution of the neural network parameters converges to the solution of a nonlinear partial differential equation. This result can be considered a law of large numbers for neural networks. In addition, a consequence of our analysis is that the trained parameters of the neural network asymptotically become independent, a property commonly called propagation of chaos.
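
To make the "empirical distribution of the parameters" concrete, here is a minimal numerical sketch in Python. It is illustrative only and not taken from the abstract: it assumes a single-hidden-layer network with a 1/N normalization, a tanh activation, a squared loss, a per-parameter step size of order 1/N, and roughly N SGD steps per unit of rescaled time. The target function `np.sin` and the names `train_mean_field_network` and `network_output` are hypothetical choices made for this example.

```python
import numpy as np


def network_output(c, w, x):
    """Output g(x) = (1/N) * sum_i c_i * tanh(w_i * x).

    This equals the integral of c * tanh(w * x) against the empirical
    measure that puts mass 1/N on each parameter pair (c_i, w_i).
    """
    return np.mean(c * np.tanh(w * x))


def train_mean_field_network(n_hidden, horizon=1.0, lr=1.0, seed=0):
    """Run SGD for about horizon * n_hidden steps (assumed time scaling).

    Returns the arrays (c, w); each pair (c_i, w_i) is one "particle"
    of the empirical distribution of the parameters.
    """
    rng = np.random.default_rng(seed)
    c = rng.uniform(-1.0, 1.0, size=n_hidden)
    w = rng.normal(size=n_hidden)
    n_steps = int(horizon * n_hidden)
    for _ in range(n_steps):
        x = rng.normal()          # one scalar training input
        y = np.sin(x)             # hypothetical target, for illustration
        h = np.tanh(w * x)
        err = y - np.mean(c * h)  # residual of the current network
        # Gradients of 0.5 * err**2 carry a 1/N factor from the
        # normalization, so each particle moves by O(lr / N) per step.
        grad_c = -err * h / n_hidden
        grad_w = -err * c * (1.0 - h ** 2) * x / n_hidden
        c = c - lr * grad_c
        w = w - lr * grad_w
    return c, w


if __name__ == "__main__":
    # Summarize the empirical parameter distribution for growing widths.
    for n in (100, 1000, 10000):
        c, w = train_mean_field_network(n, horizon=2.0)
        print(n, float(np.mean(c)), float(np.mean(w ** 2)),
              float(network_output(c, w, 1.0)))
```

If the law-of-large-numbers picture described in the abstract applies to this toy setup, summary statistics of the particles such as the mean of `c` would be expected to stabilize as `n_hidden` grows. The sketch does not verify the theorem.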

Keywords

Artificial neural network, Artificial intelligence, Stochastic gradient descent, Computer science, Field (mathematics), Nonlinear system, Stochastic neural network, Property (philosophy), Deep learning, Partial differential equation
