LEARNING

MAGIC: Learning Macro-Actions for Online POMDP Planning using Generator-Critic

Yiyuan Lee, Panpan Cai, David Hsu

Publication year
2020
Citations
2

Abstract

When robots operate in the real world, they need to handle uncertainties in sensing, acting, and the environment. Many tasks also require reasoning about the long-term consequences of robot decisions. The partially observable Markov decision process (POMDP) offers a principled approach for planning under uncertainty. However, its computational complexity grows exponentially with the planning horizon. We propose to use temporally extended macro-actions to cut down the effective planning horizon and thus the exponential factor of the complexity. We propose Macro-Action Generator-Critic (MAGIC), an algorithm that learns a macro-action generator from data and uses the learned macro-actions to perform long-horizon planning. MAGIC learns the generator using experience provided by an online planner, and in turn conditions the planner using the generated macro-actions. We evaluate MAGIC on several long-term planning tasks, showing that it significantly outperforms planning using primitive actions, hand-crafted macro-actions, and naive reinforcement learning, both in simulation and on a real robot.
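
The claim that macro-actions shrink the exponential factor can be illustrated with a rough count of search-tree leaves. The sketch below is not MAGIC itself; it only compares tree sizes under simplifying assumptions that are not from the paper: a full tree that branches on every action and then on a fixed number of observation outcomes per decision step, and macro-actions that each span a fixed number of primitive steps. The function names and the example numbers are hypothetical.

```python
def tree_leaves(num_actions: int, obs_branching: int, depth: int) -> int:
    """Leaves of a full search tree that branches on actions, then on
    observations, at each of `depth` decision steps (assumed model)."""
    return (num_actions * obs_branching) ** depth


def effective_depth(horizon: int, macro_length: int) -> int:
    """Decision steps needed to cover `horizon` primitive steps when each
    macro-action spans `macro_length` primitive steps (ceiling division)."""
    return -(-horizon // macro_length)


if __name__ == "__main__":
    horizon = 12        # primitive time steps to plan over (example value)
    obs_branching = 2   # assumed observation outcomes per decision step

    # Planning with 4 primitive actions: one decision per time step.
    primitive = tree_leaves(4, obs_branching, horizon)

    # Planning with 4 macro-actions of length 4: one decision per 4 steps.
    macro = tree_leaves(4, obs_branching, effective_depth(horizon, 4))

    print(f"primitive-action tree leaves: {primitive}")
    print(f"macro-action tree leaves:     {macro}")
```

With these example numbers, the primitive-action tree has 8^12 leaves while the macro-action tree has 8^3, because the search depth drops from 12 to 3. Holding the observation branching fixed per macro-action is the strongest assumption here; it is a modelling choice of this sketch rather than a property stated in the abstract.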

Keywords

Partially observable Markov decision process, Macro, Computer science, Planner, Reinforcement learning, MAGIC (telescope), Time horizon, Generator (circuit theory), Artificial intelligence, Action (physics)
