Fine-tuning Deep Reinforcement Learning Policies with r-STDP for Domain Adaptation

Mahmoud Akl, Yulia Sandamirskaya, Deniz Ergene, Florian Walter, Alois Knoll

Year: 2022
Citations: 10

Abstract

Deploying deep reinforcement learning policies trained in simulation on real robotic platforms requires fine-tuning because of discrepancies between simulated and real environments. Multiple methods, such as domain randomization and system identification, have been suggested to overcome this problem. However, sim-to-real transfer remains an open problem in robotics and deep reinforcement learning. In this paper, we present a spiking neural network (SNN) alternative for dealing with the sim-to-real problem. In particular, we train SNNs with backpropagation using surrogate gradients and the Deep Q-Network (DQN) algorithm to solve two classical control reinforcement learning tasks. The performance of the trained DQNs degrades when evaluated on randomized versions of the environments used during training. To compensate for the drop in performance, we apply the biologically plausible reward-modulated spike-timing-dependent plasticity (r-STDP) learning rule. Our results show that r-STDP can be successfully utilized to restore the network’s ability to solve the task. Furthermore, since r-STDP can be directly implemented on neuromorphic hardware, we believe it provides a promising neuromorphic solution to the sim-to-real problem.
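
To make the r-STDP idea concrete, the following is a minimal single-synapse sketch of a generic reward-modulated STDP rule with an eligibility trace. It is not the formulation used in the paper, which the abstract does not specify. The function names `rstdp_step` and `run_episode`, all parameter names and values (`a_plus`, `a_minus`, the time constants, `lr`), and the spike and reward timings are assumptions chosen for illustration only.

```python
import math


def rstdp_step(w, elig, pre_trace, post_trace, pre_spike, post_spike, reward,
               a_plus=0.01, a_minus=0.012, tau_pre=20.0, tau_post=20.0,
               tau_elig=200.0, lr=0.1, dt=1.0, w_min=0.0, w_max=1.0):
    """Advance one synapse by one time step of a generic r-STDP rule.

    All default values are hypothetical and not taken from the paper.
    """
    # Exponentially decay the spike traces and the eligibility trace.
    pre_trace *= math.exp(-dt / tau_pre)
    post_trace *= math.exp(-dt / tau_post)
    elig *= math.exp(-dt / tau_elig)

    # STDP term: a post spike after recent pre activity potentiates,
    # a pre spike after recent post activity depresses.
    if post_spike:
        elig += a_plus * pre_trace
    if pre_spike:
        elig -= a_minus * post_trace

    # Register the current spikes in the traces for later time steps.
    if pre_spike:
        pre_trace += 1.0
    if post_spike:
        post_trace += 1.0

    # The reward signal gates how the eligibility trace changes the weight.
    w = min(w_max, max(w_min, w + lr * reward * elig * dt))
    return w, elig, pre_trace, post_trace


def run_episode(reward_value, steps=20):
    """Pre spike at t=0, post spike at t=5, reward delivered at t=10."""
    w, elig, pre_trace, post_trace = 0.5, 0.0, 0.0, 0.0
    for t in range(steps):
        reward = reward_value if t == 10 else 0.0
        w, elig, pre_trace, post_trace = rstdp_step(
            w, elig, pre_trace, post_trace,
            pre_spike=(t == 0), post_spike=(t == 5), reward=reward)
    return w


if __name__ == "__main__":
    for r in (1.0, 0.0, -1.0):
        print(f"reward={r:+.1f} -> weight={run_episode(r):.5f}")
```

In this sketch a causal pre-before-post pairing only marks the synapse as eligible; the weight moves up when a positive reward arrives, moves down for a negative reward, and stays unchanged when no reward is delivered. Because the update depends only on locally available spike traces and a global reward signal, rules of this kind are a natural fit for on-chip learning.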

Keywords

Reinforcement learning, Computer science, Neuromorphic engineering, Artificial intelligence, Spiking neural network, Artificial neural network, Task (project management), Adaptation (eye), Machine learning, Domain (mathematical analysis)
