Bootstrap Policy Iteration for Stochastic LQ Tracking with Multiplicative Noise

Jiayu Chen, Zhenhui Xu, Xinghu Wang

Year of publication: 2025
Access: Open access

Abstract

This paper studies the optimal tracking control problem for continuous-time stochastic linear systems with multiplicative noise. The solution framework involves solving a stochastic algebraic Riccati equation for the feedback gain and a Sylvester equation for the feedforward gain. To enable model-free optimal tracking, we first develop a two-phase bootstrap policy iteration (B-PI) algorithm, which bootstraps a stabilizing control gain from a trivial zero-value initialization and then proceeds with standard policy iteration. Building on this algorithm, we propose a data-driven, off-policy reinforcement learning approach that ensures convergence to the optimal feedback gain under the interval excitation condition. We further introduce a data-driven method to compute the feedforward gain from the obtained feedback gain. Additionally, for systems with state-dependent noise, we propose a shadow system-based optimal tracking method that eliminates the need for probing noise. The effectiveness of the proposed methods is demonstrated through numerical examples.
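To make the "standard policy iteration" phase mentioned in the abstract concrete, the sketch below runs a model-based policy iteration on a scalar system dx = (a x + b u) dt + (c x + d u) dw with cost rate q x^2 + r u^2. It is only an illustration under stated assumptions, not the paper's algorithm: the scalar setting, the function names, the numerical values, and the assumption that a mean-square stabilizing initial gain k0 is already available are all hypothetical. The bootstrap phase, the data-driven off-policy variant, and the feedforward (Sylvester) step are not reproduced here.

```python
# Scalar model-based policy iteration for a stochastic LQ problem with
# multiplicative noise. All names and numbers are illustrative assumptions.

def evaluate_policy(a, b, c, d, q, r, k):
    """Cost weight p of the policy u = -k x (scalar Lyapunov equation).

    Solves p * (2(a - b k) + (c - d k)^2) + q + r k^2 = 0 for p.
    """
    growth = 2.0 * (a - b * k) + (c - d * k) ** 2  # mean-square growth rate
    if growth >= 0.0:
        raise ValueError("gain k is not mean-square stabilizing")
    return (q + r * k * k) / (-growth)


def improve_policy(b, c, d, r, p):
    """Greedy gain for the value function p x^2."""
    return (b * p + d * c * p) / (r + d * d * p)


def policy_iteration(a, b, c, d, q, r, k0, tol=1e-12, max_iter=100):
    """Alternate evaluation and improvement until the gain stops changing."""
    k = k0
    p = evaluate_policy(a, b, c, d, q, r, k)
    for _ in range(max_iter):
        k_new = improve_policy(b, c, d, r, p)
        p = evaluate_policy(a, b, c, d, q, r, k_new)
        if abs(k_new - k) < tol:
            return p, k_new
        k = k_new
    return p, k


def sare_residual(a, b, c, d, q, r, p):
    """Residual of the scalar stochastic algebraic Riccati equation."""
    return 2.0 * a * p + c * c * p + q - (b * p + c * d * p) ** 2 / (r + d * d * p)


# Hypothetical example: open-loop unstable drift, state- and input-dependent noise.
params = dict(a=1.0, b=1.0, c=0.5, d=0.2, q=1.0, r=1.0)
p_star, k_star = policy_iteration(k0=3.0, **params)
residual = sare_residual(p=p_star, **params)
```

At convergence the residual of the scalar Riccati equation should be close to zero, which gives a quick check that the iteration reached a fixed point. The need for a stabilizing k0 in this sketch is exactly the requirement that the abstract says the bootstrap phase is designed to remove.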

Keywords

eess.SY
