Towards Self-confidence-based Adaptive Learning for Lunar Exploration
Benjamin Mellinkoff, Nisar Ahmed, Jack O. Burns
- Publication year: 2020
- Citations: 3
Abstract
Future space science exploration missions will require collaboration between humans and autonomous robotic systems in order to increase science return. However, autonomous systems will occasionally require intervention from human assistants when faced with large uncertainties in their environments. Because human operator and astronaut time is a valuable commodity, it is undesirable for autonomous systems to require frequent human assistance during missions. This work examines a simple adaptive learning strategy that assesses and attempts to improve the capability of a simulated autonomous lunar exploration system as it completes navigation tasks under environmental uncertainty. Specifically, the autonomous system’s perceived machine self-confidence (i.e. self-trust in resulting outcomes) is used to close the loop in a model-based reinforcement learning algorithm. This adaptive learning algorithm enables effective and safe online learning, so the autonomous system can proceed with its tasks and thereby delay the need for human intervention. This paper defines a simulated lunar exploration grid-world problem to demonstrate a proof of concept for this ‘competency aware’ online learning algorithm, and evaluates its performance in terms of effectiveness, safety, and speed of learning to reduce the uncertainty in its environment. The results for the simulated lunar grid-world navigation task show that this adaptive learning strategy achieves good performance on a variety of terrain maps when compared to simpler, more traditional learning strategies, because it automatically adjusts exploration hyperparameters on the fly to better balance exploration and exploitation.
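The abstract gives no algorithmic detail, so the following is only a minimal sketch of the general idea of letting a self-confidence score steer an exploration hyperparameter. It assumes, hypothetically, that self-confidence is a scalar in [0, 1] and that the hyperparameter is an epsilon-greedy exploration rate. The names `exploration_rate` and `choose_action`, the linear mapping, and the bounds `eps_min` and `eps_max` are illustrative assumptions, not the authors' method.

```python
import random


def exploration_rate(confidence, eps_min=0.05, eps_max=0.5):
    """Map a self-confidence score to an exploration rate (assumed linear).

    Low confidence gives a rate near eps_max (explore more);
    high confidence gives a rate near eps_min (exploit more).
    """
    # Clamp to [0, 1] so out-of-range scores cannot produce invalid rates.
    c = min(1.0, max(0.0, confidence))
    return eps_max - c * (eps_max - eps_min)


def choose_action(q_values, confidence, rng):
    """Epsilon-greedy action selection with a confidence-dependent epsilon."""
    eps = exploration_rate(confidence)
    if rng.random() < eps:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(q_values))
    # Exploit: pick the action with the highest estimated value.
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In a grid-world loop, `q_values` would hold the current value estimates for the moves available in a cell, and `confidence` would be recomputed as the learned model improves, so the exploration rate shrinks without manual tuning.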