Dynamic Regret in Time-varying MDPs with Intermittent Information
Negin Musavi, Melkior Ornik
Year: 2026
Access: Open access
Abstract
We study sequential decision-making in time-varying Markov decision processes (TVMDPs) under limited update rates, where the decision-maker observes the system and updates its model only intermittently. Such settings arise in applications with sensing, communication, or computational constraints that preclude continuous adaptation. Our goal is to understand how the performance of an agent that learns and plans using receding-horizon control under these information constraints degrades as a function of the update rate. We propose a skip-update learning and planning framework that combines likelihood-based estimation of time-varying transition kernels with finite-horizon planning and executes policies between updates using stale information. We analyze its performance via dynamic regret relative to an oracle policy with full knowledge of the dynamics and continuous observations. Our main result establishes a dynamic regret bound that explicitly quantifies the impact of intermittent updates: it decomposes regret into contributions from update times and skip intervals, and reveals the dependence of regret on temporal variation, estimation uncertainty, and the duration of intervals without updates. In particular, the dominant contribution from skip intervals depends linearly on the interval length and the rate of temporal variation, while mixing-induced contraction mitigates this contribution.
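To make the skip-update idea concrete, the following is a minimal Python sketch and not the authors' algorithm. Everything in it is an illustrative assumption: the two-state drifting kernel `drifting_kernel`, the reward table `REWARD`, the add-one-smoothed count estimator standing in for likelihood-based estimation, and the choice that the agent still sees the current state at every step while its model and plan are refreshed only every `skip` steps. The `oracle=True` branch is only an oracle-like baseline that replans every step with the true kernel at that time, treated as frozen over the planning horizon.

```python
import math
import random

def mle_kernel(counts):
    """Add-one-smoothed maximum-likelihood estimate from counts[s][a][s_next]."""
    return [[[(c + 1) / (sum(row) + len(row)) for c in row] for row in per_state]
            for per_state in counts]

def plan(kernel, reward, horizon):
    """Finite-horizon backward induction; returns policy[h][s] for h = 0..horizon-1."""
    n_s, n_a = len(kernel), len(kernel[0])
    value = [0.0] * n_s
    policy = []
    for _ in range(horizon):
        q = [[reward[s][a] + sum(p * v for p, v in zip(kernel[s][a], value))
              for a in range(n_a)] for s in range(n_s)]
        policy.append([max(range(n_a), key=lambda a: q[s][a]) for s in range(n_s)])
        value = [max(q[s]) for s in range(n_s)]
    policy.reverse()  # policy[0] is the decision rule for the first step
    return policy

def run(true_kernel_at, reward, skip, horizon, steps, oracle=False, seed=0):
    """Simulate the agent; returns (total reward, number of model updates)."""
    rng = random.Random(seed)
    n_s, n_a = len(reward), len(reward[0])
    counts = [[[0] * n_s for _ in range(n_a)] for _ in range(n_s)]
    state, total, n_updates, age, policy = 0, 0.0, 0, 0, None
    for t in range(steps):
        is_update = oracle or t % skip == 0
        if is_update:
            # Oracle-like baseline uses the true kernel; the agent uses its estimate.
            model = true_kernel_at(t) if oracle else mle_kernel(counts)
            policy, age = plan(model, reward, horizon), 0
            n_updates += 1
        # Between updates the stale plan is executed, indexed by its age.
        action = policy[min(age, horizon - 1)][state]
        probs = true_kernel_at(t)[state][action]
        nxt = rng.choices(range(n_s), weights=probs)[0]
        if is_update and not oracle:
            counts[state][action][nxt] += 1  # data recorded only at update times
        total += reward[state][action]
        state, age = nxt, age + 1
    return total, n_updates

def drifting_kernel(t, rate=0.05):
    """Hypothetical two-state, two-action kernel that drifts slowly with t."""
    p = 0.5 + 0.4 * math.sin(rate * t)
    return [[[p, 1 - p], [1 - p, p]] for _ in range(2)]

REWARD = [[0.0, 0.0], [1.0, 1.0]]  # reward 1 whenever the agent is in state 1

if __name__ == "__main__":
    agent_total, n_updates = run(drifting_kernel, REWARD, skip=5, horizon=5, steps=200)
    oracle_total, _ = run(drifting_kernel, REWARD, skip=1, horizon=5, steps=200, oracle=True)
    print("updates:", n_updates, "empirical regret:", oracle_total - agent_total)
```

The printed difference is a single-trajectory estimate, so it can be noisy or even negative; averaging over seeds and varying `skip` gives a rough empirical picture of how performance changes with the update rate in this toy setting.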