Short-term memory ability of reservoir-based temporal difference learning model
Yu Yoshino, Yuichi Katori
- Publication year: 2022
- Citations: 3
- Access: Open access
Abstract
A network model combining temporal difference (TD) learning with reservoir computing (RC) has been proposed for controlling autonomous robots. RC is a framework for constructing recurrent neural networks that process complex time series at low computational cost. TD learning is a reinforcement learning framework in which an agent learns to take actions in an environment so as to maximize the cumulative reward. The control model using TD learning with RC optimizes the agent's actions based on a sensory signal that is continuous-valued and time-varying. The model uses online reinforcement learning to train the connection weights between the reservoir and the output layer so that the output represents the action value. In the present study, we evaluate the model on a task requiring short-term memory and clarify the reservoir's role in memorizing task-relevant sensory information. We show that the reservoir in the RC-based TD learning model enhances performance in the memory-required task. The choice of parameter values that specify the reservoir dynamics is critical to ensuring performance in the task.
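The architecture summarized in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the leaky-integrator tanh reservoir, the spectral-radius scaling, the Q-learning form of the TD update, and every name and default value (ReservoirTD, spectral_radius, leak, gamma, lr) are assumptions introduced here. Only the readout weights w_out are trained; the input and recurrent weights stay fixed, as is usual in RC.

```python
import numpy as np


class ReservoirTD:
    """Hypothetical sketch: fixed random reservoir + linear readout trained by TD."""

    def __init__(self, n_in, n_res, n_actions,
                 spectral_radius=0.9, leak=0.3, gamma=0.9, lr=0.01, seed=0):
        rng = np.random.default_rng(seed)
        # Fixed (untrained) input and recurrent weights.
        self.w_in = rng.uniform(-1.0, 1.0, size=(n_res, n_in))
        w = rng.normal(size=(n_res, n_res))
        # Rescale so the largest eigenvalue magnitude equals spectral_radius.
        w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
        self.w_res = w
        # Trainable readout: one row of weights per action value.
        self.w_out = np.zeros((n_actions, n_res))
        self.x = np.zeros(n_res)
        self.leak, self.gamma, self.lr = leak, gamma, lr

    def step(self, u):
        """Advance the reservoir one time step with sensory input u."""
        pre = self.w_in @ u + self.w_res @ self.x
        self.x = (1.0 - self.leak) * self.x + self.leak * np.tanh(pre)
        return self.x.copy()

    def q_values(self, x):
        """Action values as a linear readout of the reservoir state."""
        return self.w_out @ x

    def td_update(self, x, action, reward, x_next, done):
        """One online TD (Q-learning style) update of the readout weights."""
        target = reward
        if not done:
            target += self.gamma * np.max(self.q_values(x_next))
        delta = target - self.q_values(x)[action]  # TD error
        self.w_out[action] += self.lr * delta * x
        return delta


# Usage: a cue is presented once, then removed; the reservoir state
# still carries a trace of it when the reward arrives.
model = ReservoirTD(n_in=2, n_res=50, n_actions=2)
x_cue = model.step(np.array([1.0, 0.0]))
x_delay = model.step(np.zeros(2))
model.td_update(x_cue, action=0, reward=1.0, x_next=x_delay, done=True)
```

Because the reservoir state after the cue has been removed still depends on the earlier input, the linear readout can assign action values that depend on past sensory signals. The spectral radius and leak rate used above control how quickly that trace fades, which is one way to read the abstract's remark that the reservoir parameters are critical to task performance.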