Q learning for mobile robot navigation in indoor environment
D. Tamilselvi, S. Mercy Shalinie, G. Nirmala
- Year: 2011
- Citations: 10
Abstract
The proposed reinforcement learning approach supports optimal path selection for mobile robot navigation in a 10×10 indoor grid environment. Without prior knowledge of the environment, the mobile robot calculates Q-values from the current reward and the discounted future reward at each time step. Based on this reinforcement, the robot learns the environment and selects a navigation path to reach the goal. A Markov Decision Process (MDP) supports optimal path selection among the path values calculated through reinforcement learning. Simulation experiments are performed with different positions in the environment. For the path from the start position at grid cell 10 to the goal position at grid cell 100, the Q-value is 70.78 with a learning rate of 0.5. The MDP provides the optimal path based on the highest Q-value among the four directions in a grid cell; for example, with the values 100.00, 125.00, 0.00, and 83.33 in grid cell 21 (one step in the left direction), it chooses 100.00 for a single step. After the entire environment has been learned, a Q-value of 206.780 is achieved with a learning rate of 0.7 for the learning path from grid cell 25 to the goal position 100. The learning rate drives the optimized path selection in the proposed environment. The optimized path cost for the proposed simulation grid environment, excluding obstacle cost, is 1083.19.
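The Q-value update summarized in the abstract can be illustrated with a minimal tabular Q-learning sketch. The start cell (10), goal cell (100), 10×10 grid, four directions, and learning rate (0.5) come from the abstract. Everything else is an assumption made for illustration: the row-by-row cell numbering, the reward values (+100 at the goal, -1 per step), the discount factor of 0.9, the epsilon-greedy exploration, the episode count, and the absence of obstacles. It is not the authors' implementation and is not expected to reproduce the Q-values they report.

```python
import random

N = 10  # grid side; cells are numbered 1..100 row by row (assumed layout)
ACTIONS = {"up": -N, "down": N, "left": -1, "right": 1}


def step(cell, action):
    """Return the next cell, or the same cell if the move would leave the grid."""
    row, col = divmod(cell - 1, N)
    blocked = (
        (action == "up" and row == 0)
        or (action == "down" and row == N - 1)
        or (action == "left" and col == 0)
        or (action == "right" and col == N - 1)
    )
    return cell if blocked else cell + ACTIONS[action]


def train(start=10, goal=100, alpha=0.5, gamma=0.9, epsilon=0.2,
          episodes=2000, seed=0):
    """Tabular Q-learning; rewards, gamma, and epsilon are illustrative assumptions."""
    rng = random.Random(seed)
    q = {(c, a): 0.0 for c in range(1, N * N + 1) for a in ACTIONS}
    for _ in range(episodes):
        cell = start
        for _ in range(1000):  # cap on steps per episode
            if cell == goal:
                break
            if rng.random() < epsilon:
                action = rng.choice(list(ACTIONS))  # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(cell, a)])  # exploit
            nxt = step(cell, action)
            reward = 100.0 if nxt == goal else -1.0  # assumed reward scheme
            # Discounted future reward: best Q-value available from the next cell.
            best_next = 0.0 if nxt == goal else max(q[(nxt, a)] for a in ACTIONS)
            # Move the estimate toward (current reward + discounted future reward).
            q[(cell, action)] += alpha * (reward + gamma * best_next - q[(cell, action)])
            cell = nxt
    return q


def greedy_path(q, start=10, goal=100, max_steps=200):
    """Follow the highest Q-value among the four directions from start toward goal."""
    path = [start]
    cell = start
    while cell != goal and len(path) <= max_steps:
        action = max(ACTIONS, key=lambda a: q[(cell, a)])
        cell = step(cell, action)
        path.append(cell)
    return path
```

Calling `greedy_path(train())` returns the list of cells visited when the robot repeatedly takes the direction with the highest learned Q-value. Passing `alpha=0.7` to `train` gives a rough way to see how the learning rate changes the learned values under these assumptions.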