Mean Field Analysis of Neural Networks

Justin Sirignano, Konstantinos Spiliopoulos

Year of publication: 2018
Citations: 32

Abstract

Machine learning has revolutionized fields such as image, text, and speech recognition. There's also growing interest in applying machine and deep learning ideas in engineering, robotics, biotechnology, and finance. Despite their immense success in practice, there is limited mathematical understanding of neural networks. We mathematically study neural networks in the asymptotic regime of simultaneously (A) large network sizes and (B) large numbers of stochastic gradient descent training iterations. We rigorously prove that the empirical distribution of the neural network parameters converges to the solution of a nonlinear partial differential equation. This result can be considered a law of large numbers for neural networks. In addition, a consequence of our analysis is that the trained parameters of the neural network asymptotically become independent, a property which is commonly called propagation of chaos.
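The joint limit described in the abstract can be explored numerically. The sketch below is illustrative only and is not taken from the paper: it assumes a one-hidden-layer network with scalar input, g(x) = (1/N) * sum_i c_i * tanh(w_i * x), a step size of alpha / N, and a number of SGD iterations proportional to N. The target function sin(2x), the initial distributions, and the names `train_mean_field_net` and `empirical_moment` are all hypothetical choices made for the example.

```python
import math
import random


def train_mean_field_net(n_units, time_horizon=2.0, alpha=1.0, seed=0):
    """Run SGD on g(x) = (1/N) * sum_i c_i * tanh(w_i * x).

    Assumptions (not from the source): scalar inputs, squared loss,
    step size alpha / N, and time_horizon * N iterations, so that the
    network size and the iteration count grow together.
    """
    rng = random.Random(seed)
    # Parameters (c_i, w_i) are drawn i.i.d. at initialization.
    c = [rng.uniform(-1.0, 1.0) for _ in range(n_units)]
    w = [rng.gauss(0.0, 1.0) for _ in range(n_units)]

    n_steps = int(time_horizon * n_units)
    step = alpha / n_units

    for _ in range(n_steps):
        x = rng.uniform(-1.0, 1.0)
        y = math.sin(2.0 * x)  # hypothetical regression target
        h = [math.tanh(w_i * x) for w_i in w]
        g = sum(c_i * h_i for c_i, h_i in zip(c, h)) / n_units
        err = y - g
        for i in range(n_units):
            # Direction for w_i uses the pre-update value of c_i.
            grad_w = c[i] * (1.0 - h[i] ** 2) * x
            c[i] += step * err * h[i]
            w[i] += step * err * grad_w
    return c, w


def empirical_moment(values, order=1):
    """Moment of the empirical distribution (1/N) * sum_i delta_{v_i}."""
    return sum(v ** order for v in values) / len(values)


if __name__ == "__main__":
    # Compare summary statistics of the trained parameters as N grows.
    for n in (10, 100, 400):
        c, w = train_mean_field_net(n)
        print(n, empirical_moment(c, 2), empirical_moment(w, 2))
```

If a law of large numbers holds for this setup, statistics of the empirical distribution such as these moments should fluctuate less from seed to seed as N increases. The script only makes that easy to inspect; it does not prove or quantify the convergence.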

Keywords

Artificial neural network, Artificial intelligence, Stochastic gradient descent, Computer science, Field (mathematics), Nonlinear system, Stochastic neural network, Property (philosophy), Deep learning, Partial differential equation

Related papers

Statistical Learning Theory

Artificial intelligence: a modern approach

Applied Nonlinear Control

A new optimizer using particle swarm theory