Performance Bounds for Rollout Policies in Stochastic Shortest Path Problems
Anders Hansson, Bo Wahlberg
- Year
- 2026
- Access
- Open access
Abstract
This paper concerns rollout and certainty-equivalent rollout policies for stochastic shortest path problems with absorbing terminal states. The main result provides a direct non-asymptotic performance certificate for a fixed rollout policy: the loss relative to the optimal value is controlled by the uniform accuracy of the value approximation and by the expected time for which the rollout closed loop remains away from the terminal state. Thus, in the undiscounted transient setting, the expected hitting time plays the role of a discount or finite-horizon parameter in more standard approximate dynamic programming bounds. This paper also gives a performance-difference identity showing that suboptimality is exactly accumulated through the transient occupation measure, and a deterministic sharpness example showing that the hitting-time factor is unavoidable. Finally, consequences under uniform hitting-time and Foster-Lyapunov drift conditions are given, and extend the argument to certainty-equivalent rollout by adding a separate local model-mismatch term.
Keywords
Related papers
A dual-loop framework for manufacturability-aware topology optimization of electric vehicle structures via wire arc additive manufacturing
Qiang Cui, Chuan Yu, Daoqian Yang +2 more
Robotics and Computer-Integrated Manufacturing · 2026
Geometric digital twin: A digital and intelligent model for aero-engine assembly accuracy prediction
Ke Shang, Xin Jin, Teli Xu +4 more
Robotics and Computer-Integrated Manufacturing · 2026
Revolutionizing Industries Through AI-Driven Robotics
Aryan Chaudhary
Recent Advances in Computer Science and Communications · 2026
Design and dynamic performance prediction of a novel large-aperture offset-feed deployable antenna
Chuang Shi, Tianming Liu, Ning Xue +6 more
Aerospace Science and Technology · 2026