LEARNING

Inferring Time-delayed Causal Relations in POMDPs from the Principle of Independence of Cause and Mechanism

Junchi Liang, Abdeslam Boularias

Publication year: 2021
Citations: 4
Access: Open access

Abstract

This paper introduces an algorithm for discovering implicit and delayed causal relations between events observed by a robot at regular or arbitrary times, with the objective of improving data-efficiency and interpretability of model-based reinforcement learning (RL) techniques. The proposed algorithm initially predicts observations with the Markov assumption, and incrementally introduces new hidden variables to explain and reduce the stochasticity of the observations. The hidden variables are memory units that keep track of pertinent past events. Such events are systematically identified by their information gains. A test of independence between inputs and mechanisms is performed to identify cases when there is a causal link between events and those when the information gain is due to confounding variables. The learned transition and reward models are then used in a Monte Carlo tree search for planning. Experiments on simulated and real robotic tasks, and the challenging 3D game Doom show that this method significantly improves over current RL techniques.
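To make the role of information gain concrete, the following is a minimal sketch, not the authors' implementation. It assumes discrete observations and a binary memory flag that records whether some past event has occurred, and it estimates how much that flag reduces the entropy of a later observation. The function names and the toy log (`door_opened`, `key_seen_before`, `light_seen_before`) are hypothetical. The test of independence between inputs and mechanisms that the abstract describes is not shown here.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of the empirical distribution of labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def information_gain(outcomes, memory_flags):
    """Estimate H(outcome) - H(outcome | memory_flag) from paired samples."""
    n = len(outcomes)
    groups = {}
    for outcome, flag in zip(outcomes, memory_flags):
        groups.setdefault(flag, []).append(outcome)
    # Entropy of the outcome within each flag value, weighted by group size.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(outcomes) - conditional


# Hypothetical toy log: the door opens exactly when a key was seen earlier.
door_opened = [1, 1, 0, 0, 1, 0, 1, 0]
key_seen_before = [1, 1, 0, 0, 1, 0, 1, 0]
light_seen_before = [1, 0, 1, 0, 1, 0, 1, 0]

gain_key = information_gain(door_opened, key_seen_before)
gain_light = information_gain(door_opened, light_seen_before)
```

In this toy log, the key flag has a higher gain than the light flag, so it would be the stronger candidate for a new memory variable. A high gain alone does not establish causation, which is why the abstract pairs this step with a separate independence test to rule out confounding.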

Keywords

Interpretability, Computer science, Independence (probability theory), Conditional independence, Artificial intelligence, Reinforcement learning, Machine learning, Monte Carlo tree search, Markov process, Monte Carlo method
