Learning Robust Quadruped Locomotion using Dynamic Reward Shaping and Domain Randomization

Wenshuo Liu, Ning Tan

Year: 2025
Citations: 1

Abstract

Motion control for quadruped robots still faces numerous challenges. One approach to addressing them is deep reinforcement learning: designing robust learning algorithms that can handle uncertain sensor inputs and that optimize and execute motion control in dynamic, unpredictable environments, enabling legged robots to perform more complex and difficult tasks. In this study, we propose a method for training a robot control policy using deep reinforcement learning. Specifically, we model the control policy as a multi-layer perceptron (MLP) and train it in simulation. To improve the policy's robustness to noisy sensor inputs, we apply domain randomization to each sensor input. By combining a parallel simulation framework with a high-performance reinforcement learning algorithm, we developed, within just one hour, a policy that accurately tracks velocity commands, including commands significantly higher than those encountered during training. Furthermore, by incorporating gait-specific reward functions during training, our method can guide the robot into specified gaits, such as trotting.
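
As an illustration only, the two ingredients named in the abstract, per-sensor randomization and a gait-specific reward, might be sketched as follows. The abstract does not give the noise model, noise magnitudes, sensor names, or reward form, so everything below (`NOISE_RANGES`, `randomize_observation`, `trot_reward`, uniform noise, the binary reward) is a hypothetical assumption rather than the authors' implementation.

```python
import random

# Hypothetical per-sensor noise half-widths (assumed values, not from the paper).
NOISE_RANGES = {
    "base_ang_vel": 0.2,
    "joint_pos": 0.01,
    "joint_vel": 1.5,
}


def randomize_observation(obs, rng, noise_ranges=NOISE_RANGES):
    """Add independent uniform noise to every reading of every sensor group.

    obs maps a sensor-group name to a list of floats. Groups without an
    entry in noise_ranges are passed through unchanged.
    """
    noisy = {}
    for name, values in obs.items():
        scale = noise_ranges.get(name, 0.0)
        noisy[name] = [v + rng.uniform(-scale, scale) for v in values]
    return noisy


def trot_reward(contacts):
    """Return 1.0 for a trot-like foot contact pattern, else 0.0.

    contacts is (FL, FR, RL, RR) as booleans. In a trot the diagonal pairs
    (FL, RR) and (FR, RL) each move together, and the two pairs alternate.
    This binary form is an assumption made for illustration.
    """
    fl, fr, rl, rr = contacts
    diagonals_in_sync = (fl == rr) and (fr == rl)
    pairs_alternate = fl != fr
    return 1.0 if diagonals_in_sync and pairs_alternate else 0.0


# Usage: perturb one observation and score one contact pattern.
rng = random.Random(0)
clean = {"joint_pos": [0.1, -0.2, 0.3], "command": [1.0, 0.0]}
noisy = randomize_observation(clean, rng)
reward = trot_reward((True, False, False, True))
```

In a training loop, a term like `trot_reward` would typically be weighted and summed with the velocity-tracking reward, and the noise would be applied to the observation before it reaches the policy; those details are likewise assumptions here.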

Keywords

Computer science, Domain (mathematical analysis), Artificial intelligence, Control theory (sociology), Mathematics