Reward-penalty reinforcement learning scheme for planning and reactive behaviour

A.F.R. Araújo, Andreza Pereira Braga

Year
2002
Citations
8

Abstract

This paper describes a reinforcement learning algorithm that allows a point robot to learn navigation strategies within initially unknown indoor environments with fixed and dynamic obstacles. The knowledge is encoded in two surfaces, called the reward and penalty surfaces: the reward surface is updated when a target is found, and the penalty surface is updated whenever the robot moves. The proposed policy is suitable for both planning and reactive behaviour. The tests involve different kinds of obstacles: a fixed passage, a barrier, a U-shaped obstacle and a simple maze. The results suggest that the model solves the goal-directed exploration problem. Thus, the robot is able to reach a desired goal, starting its movement from any position within the environment, avoiding obstacles, and following a viable trajectory. However, the robot may get stuck in dynamic obstacles, may depend on randomness to avoid them, and generally does not solve the goal-directed reinforcement learning problem.
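
The two-surface idea can be illustrated with a small grid-world sketch. The abstract does not give the update rules, so everything below is an assumption made for illustration: the grid encoding, the function names (neighbours, run_episode), the parameters (penalty_step, reward_gain, decay), and the greedy choice on reward minus penalty with random tie-breaking. It is not the authors' algorithm, and it models fixed obstacles only.

```python
import random

def neighbours(cell, grid):
    """Free 4-connected neighbours of `cell`; '#' marks a fixed obstacle."""
    r, c = cell
    steps = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    return [(i, j) for i, j in steps
            if 0 <= i < len(grid) and 0 <= j < len(grid[0]) and grid[i][j] != "#"]

def run_episode(grid, start, goal, reward, penalty, rng,
                penalty_step=1.0, reward_gain=1.0, decay=0.9, max_steps=10000):
    """One trial: move greedily on (reward - penalty), updating both surfaces.

    `reward` and `penalty` are dicts mapping cells to values (hypothetical
    stand-ins for the two surfaces). Returns the path, or None on failure.
    """
    def score(n):
        return reward.get(n, 0.0) - penalty.get(n, 0.0)

    cell, path = start, [start]
    for _ in range(max_steps):
        if cell == goal:
            # Assumed reward update: only when the target is found, with
            # credit decaying along the path walked backwards from the goal.
            credit = reward_gain
            for visited in reversed(path):
                reward[visited] = max(reward.get(visited, 0.0), credit)
                credit *= decay
            return path
        # Assumed penalty update: on every move, discouraging revisits.
        penalty[cell] = penalty.get(cell, 0.0) + penalty_step
        options = neighbours(cell, grid)
        if not options:
            return None
        best = max(score(n) for n in options)
        # Random tie-breaking among equally attractive neighbours.
        cell = rng.choice([n for n in options if score(n) == best])
        path.append(cell)
    return None

if __name__ == "__main__":
    grid = ["....",
            ".##.",
            "...."]
    reward, penalty = {}, {}
    path = run_episode(grid, (0, 0), (2, 3), reward, penalty, random.Random(0))
    print(len(path) - 1, "moves to reach the goal")
```

In this sketch both surfaces persist between calls, so repeated episodes reuse what earlier trials recorded. Whether the paper accumulates, decays, or resets its surfaces between trials is not stated in the abstract.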

Keywords

Reinforcement learning, Robot, Computer science, Randomness, Trajectory, Obstacle, Obstacle avoidance, Scheme (mathematics), Motion planning, Mobile robot
