LEARNING

Safe and efficient imitation learning by clarification of experienced latent space

Hidehito Fujiishi, Taisuke Kobayashi, Kenji Sugimoto

Year of publication: 2021
Citations: 6

Abstract

Behavioral cloning from observation (BCO) allows a robot to learn a policy without the expert's action information. However, it requires a few interactions with the environment to infer the expert's actions, which carries a risk of robot failure. In addition, BCO assumes that the inferred actions are accurate, which causes wrong and inefficient policy updates when they are not. Both problems can be resolved by outlier detection that judges whether the state being faced has been experienced. This paper addresses such outlier detection mechanisms using a variational autoencoder (VAE) to improve the safety and efficiency of standard BCO. For the first problem, safety, we suppose that the expert's demonstrations visited only safe states; a VAE is then trained on the expert's state data to detect inexperienced and dangerous scenes. For the second problem, efficiency, another VAE is trained on the state data safely collected by the imitator's policy to detect scenes where the inferred actions are not accurate. In handwriting robot experiments, the proposed mechanisms improved on standard BCO in terms of both safety (roughly 64%) and efficiency (roughly 44%). The high versatility of the proposed mechanisms is verified by learning various letters of the alphabet.
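
The two detectors described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: a diagonal Gaussian fitted to the visited states stands in for the VAE score, and the class name `NoveltyScorer`, the function `trustworthy_mask`, and the 0.99 quantile threshold are assumptions made for illustration. One scorer fitted on expert states flags unfamiliar (potentially unsafe) states; a second scorer fitted on imitator states marks which demonstration states lie in the region where inferred actions can be trusted, which is one reading of the second mechanism.

```python
import numpy as np


class NoveltyScorer:
    """Stand-in for a VAE-based detector: a diagonal Gaussian over visited states."""

    def fit(self, states, quantile=0.99):
        states = np.asarray(states, dtype=float)
        self.mean = states.mean(axis=0)
        self.std = states.std(axis=0) + 1e-6  # avoid division by zero
        # Assumed rule: the threshold is a high quantile of the training scores.
        self.threshold = np.quantile(self.score(states), quantile)
        return self

    def score(self, states):
        # Higher score = less familiar (negative log-likelihood up to a constant).
        z = (np.atleast_2d(states) - self.mean) / self.std
        return 0.5 * np.sum(z ** 2, axis=1)

    def is_experienced(self, state):
        return bool(self.score(state)[0] <= self.threshold)


def trustworthy_mask(expert_states, imitator_scorer):
    """Mark expert states that fall inside the imitator's experienced region."""
    return imitator_scorer.score(expert_states) <= imitator_scorer.threshold


rng = np.random.default_rng(0)
expert_scorer = NoveltyScorer().fit(rng.normal(size=(500, 2)))
imitator_scorer = NoveltyScorer().fit(rng.normal(scale=0.5, size=(500, 2)))

# Safety check: halt or recover when the faced state is unfamiliar to the expert model.
print(expert_scorer.is_experienced([0.0, 0.0]))
print(expert_scorer.is_experienced([10.0, 10.0]))

# Efficiency check: keep only demonstration states whose inferred actions are trusted.
demo_states = np.array([[0.0, 0.0], [8.0, 8.0]])
print(trustworthy_mask(demo_states, imitator_scorer))
```

In a real system the score would come from the trained VAE (for example, its reconstruction error or negative evidence lower bound), and the threshold would be tuned on held-out data.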

Keywords

Autoencoder, Action (physics), Artificial intelligence, Computer science, Machine learning, Robot, Outlier, Anomaly detection, Imitation, Space (punctuation)
