Embracing Change: Continual Learning in Deep Neural Networks
Raia Hadsell, Dushyant Rao, Andrei A. Rusu, Razvan Pascanu
Year: 2020
Citations: 451
Access: Open access
Abstract
Modern machine learning excels at training powerful models from fixed datasets and stationary environments, often exceeding human-level ability. Yet these models fail to emulate the process of human learning, which is efficient, robust, and able to learn incrementally, from sequential experience in a non-stationary world. Insights into this limitation can be gleaned from the nature of neural network optimization, which implies that continual learning techniques could radically improve deep learning as well as open the door to new application areas. Promising approaches for continual learning can be found at the most granular level, with gradient-based methods, as well as at the architectural level, with modular and memory-based approaches. We also consider meta-learning as a potentially important direction.
Artificial intelligence research has seen enormous progress over the past few decades, but it predominantly relies on fixed datasets and stationary environments. Continual learning is an increasingly relevant area of study that asks how artificial systems might learn sequentially, as biological systems do, from a continuous stream of correlated data. In the present review, we relate continual learning to the learning dynamics of neural networks, highlighting the potential it has to considerably improve data efficiency. We further consider the many new biologically inspired approaches that have emerged in recent years, focusing on those that utilize regularization, modularity, memory, and meta-learning, and highlight some of the most promising and impactful directions.
A common benchmark for success in artificial intelligence is the ability to emulate human learning. We measure the abilities of humans to recognize images, play games, and drive a car, to name a few, and then develop machine learning models that can match or exceed these given enough training data. This paradigm puts the emphasis on the end result, rather than the learning process, and overlooks a critical characteristic of human learning: that it is robust to changing tasks and sequential experience. It is perhaps unsurprising that humans can learn this way: after all, time is irreversible and the world is non-stationary (see Glossary), so human learning has evolved to thrive in dynamic learning settings. However, this robustness is in stark contrast to the most powerful modern machine learning methods, which perform well only when presented with data that are carefully shuffled, balanced, and homogenized. Not only do these models underperform when presented with changing or incremental data regimes, but in some cases they fail completely or suffer from rapid performance degradation on earlier learned tasks, known as catastrophic forgetting.
What might be gained by developing neural network models that learn sequentially like humans? First of all, many applications could benefit from continual adaptation to a changing target specification: for example, visual recognition algorithms that need to learn a diverse, growing set of image classes; or household robots that need to incrementally add skills to their repertoire.
Continual learning techniques could enable models to acquire specialized solutions w