
A-EXP4: Online Social Policy Learning for Adaptive Robot-Pedestrian Interaction

Pengju Jin, Eshed Ohn-Bar, Kris Kitani, Chieko Asakawa

Publication year
2019
Citations
2

Abstract

We study self-supervised adaptation of a robot's policy for social interaction, i.e., a policy for active communication with surrounding pedestrians through audio or visual signals. Inspired by the observation that humans continually adapt their behavior when interacting under varying social contexts, we propose Adaptive EXP4 (A-EXP4), a novel online learning algorithm for adapting the robot-pedestrian interaction policy. To address limitations of bandit algorithms in adaptation to unseen and highly dynamic scenarios, we employ a mixture model over the policy parameter space. Specifically, a Dirichlet Process Gaussian Mixture Model (DPMM) is used to cluster the parameters of sampled policies and maintain a mixture model over the clusters, hence effectively discovering policies that are suitable for the current environmental context in an unsupervised manner. Our simulated and real-world experiments demonstrate the feasibility of A-EXP4 in accommodating interaction with different types of pedestrians while jointly minimizing social disruption through the adaptation process. While the A-EXP4 formulation is kept general for application in a variety of domains requiring continual adaptation of a robot's policy, we specifically evaluate the performance of our algorithm using a suitcase-inspired assistive robotic platform. In this concrete assistive scenario, the algorithm observes how audio signals produced by the navigational system affect the behavior of pedestrians and adapts accordingly. Consequently, we find that A-EXP4 effectively adapts the interaction policy for gently clearing a navigation path in crowded settings, resulting in a significant reduction in empirical regret compared to the EXP4 baseline.
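
For readers unfamiliar with the EXP4 baseline mentioned in the abstract, the following is a minimal sketch of one round of standard EXP4 (exponential weights over expert advice with bandit feedback). It is not the authors' A-EXP4 and does not include the DPMM clustering of policy parameters. The names `exp4_step` and `run_demo`, the three-action setup, and the reward function are hypothetical and chosen only for illustration.

```python
import numpy as np


def exp4_step(weights, advice, reward_fn, gamma, rng):
    """One round of standard EXP4 (the baseline, not A-EXP4).

    weights:   (N,) positive expert weights
    advice:    (N, K) array, row n is expert n's distribution over K actions
    reward_fn: maps a chosen action index to a reward in [0, 1]
    gamma:     exploration rate in (0, 1]
    """
    n_actions = advice.shape[1]
    # Mix the experts' advice by weight, then blend in uniform exploration.
    mix = weights @ advice / weights.sum()
    probs = (1.0 - gamma) * mix + gamma / n_actions
    action = rng.choice(n_actions, p=probs)
    reward = reward_fn(action)
    # Importance-weighted reward estimate: nonzero only for the chosen action.
    est = np.zeros(n_actions)
    est[action] = reward / probs[action]
    # Each expert is credited in proportion to the mass it put on that action.
    expert_gain = advice @ est
    new_weights = weights * np.exp(gamma * expert_gain / n_actions)
    # Normalizing keeps the numbers bounded and does not change the mixture.
    return new_weights / new_weights.sum(), action, reward


def run_demo(rounds=200, gamma=0.1, seed=0):
    rng = np.random.default_rng(seed)
    # Assumed toy setup: 3 audio actions and 2 fixed experts.
    # Expert 0 always recommends action 0, expert 1 always recommends action 2.
    advice = np.array([[1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0]])
    # Assumed reward: only action 2 is rewarded.
    reward_fn = lambda a: 1.0 if a == 2 else 0.0
    weights = np.ones(2) / 2
    for _ in range(rounds):
        weights, _, _ = exp4_step(weights, advice, reward_fn, gamma, rng)
    return weights
```

Calling `run_demo()` returns the normalized expert weights after the final round. In this toy setup the expert that recommends the rewarded action ends up with the larger weight, which is the behavior that the regret comparison in the abstract builds on.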

Keywords

Computer science, Reinforcement learning, Adaptation (eye), Context (archaeology), Robot, Artificial intelligence, Process (computing), Regret, Machine learning, Mixture model
