What you reward is what you learn: Comparing rewards for online speech policy optimization in public HRI
Sichao Song, Yuki Okafuji, Kaito Ariu, Amy Koike
- Year
- 2026
- Access
- Open access
Abstract
Designing policies that are both efficient and acceptable for conversational service robots in open and diverse environments is non-trivial. Unlike fixed, hand-tuned parameters, online learning can adapt to non-stationary conditions. In this paper, we study how to adapt a social robot's speech policy in the wild. During a 12-day in-situ deployment with over 1,400 public encounters, we cast online policy optimization as a multi-armed bandit problem and use Thompson sampling to select among six actions defined by speech rate (slow/normal/fast) and verbosity (concise/detailed). We compare three complementary binary rewards--Ru (user rating), Rc (conversation closure), and Rt (>=2 turns)--and show that each induces distinct arm distributions and interaction behaviors. We complement the online results with offline evaluations that analyze contextual factors (e.g., crowd level, group size) using video-annotated data. Taken together, we distill ready-to-use design lessons for deploying online optimization of speech policies in real public HRI settings.
Keywords
Related papers
Review and perspectives on multimodal perception, mutual cognition, and embodied execution for human–robot collaboration in Industry 5.0
Kai Ding, Qingyuan Mao, Yaqian Zhang +3 more
Robotics and Computer-Integrated Manufacturing · 2026
Towards human-centric manufacturing: Task planning under uncertainties in human–robot collaborative assembly
Yingchao You, Ze Ji, Changyun Wei
Robotics and Computer-Integrated Manufacturing · 2026
Agentic HRC: Achieving context alignment via memory for Human–Robot Collaboration
Jiahui Si, Wenchao Li, Xi Chen +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Adaptive Physics-informed Transformer with Gaussian process residual compensation for inverse dynamics modeling in Human–Robot Collaboration
Rui Qian, Xi Zhang, Dongpeng Li +2 more
Robotics and Computer-Integrated Manufacturing · 2026