An Embodied Model of Learning, Plasticity, and Reward
William H. Alexander, Olaf Sporns
- Year: 2002
- Citations: 32
Abstract
We describe and discuss a neural network model of the dopaminergic system based on observed anatomical and physiological properties of the primate midbrain. The model relies on value-dependent synaptic modification to acquire temporal information regarding reward-related events and the stimuli with which such events are paired. Experience-dependent changes in synaptic plasticity allow the model to generate neuromodulatory responses corresponding to prediction errors. These phasic neural responses act as a value signal with positive and negative components, representing the unpredicted occurrence of rewarding stimuli and the omission of an expected reward, respectively. The value signal modulates widespread synaptic changes, including afferent connections of the value system itself. The model is embedded in an autonomous robot, and its behavior is tested as changes are applied to the robot's motor characteristics and as the stimulus content of the environment is varied. We observe the development of the system as a consequence of environmental stimuli and autonomous movement, leading to the conditioning of reward-related behaviors through the interaction between the robot and its surroundings.
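As a rough illustration of the prediction-error idea described in the abstract, the sketch below uses a simple delta rule (Rescorla-Wagner style), which is a stand-in and not the authors' network model. The scalar weight `w`, the learning rate, and the function names are all assumptions made for this example. It shows a value signal that is positive when a reward is unpredicted, shrinks toward zero as the stimulus comes to predict the reward, and turns negative when an expected reward is omitted.

```python
# Illustrative only: a scalar delta-rule stand-in, not the model from the paper.

def value_signal(reward, prediction):
    """Signed prediction error: > 0 for unpredicted reward, < 0 for omitted reward."""
    return reward - prediction


def run_trials(n_trials=50, learning_rate=0.2):
    """Pair a stimulus with a reward on every trial and record the value signal."""
    w = 0.0  # hypothetical strength of the stimulus -> value-system connection
    signals = []
    for _ in range(n_trials):
        stimulus = 1.0                 # stimulus is present on every trial
        prediction = w * stimulus      # current reward prediction
        delta = value_signal(1.0, prediction)
        # Value-dependent modification: the change in w is gated by the value signal.
        w += learning_rate * delta * stimulus
        signals.append(delta)
    return w, signals


w, signals = run_trials()

# After conditioning, present the stimulus but withhold the reward.
omission = value_signal(0.0, w * 1.0)

print("first-trial signal:", signals[0])
print("last-trial signal:", signals[-1])
print("signal on omitted reward:", omission)
```

In this toy version the same signed quantity both reports the error and scales the weight change, loosely mirroring the abstract's statement that the value signal modulates the afferent connections of the value system itself.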