
Global Convergence of Policy Gradient Methods for ReLU Controllers in Linear Quadratic Regulation

Jhojan A. Rodriguez-Gil, César A. Uribe

Publication year
2026
Access
Open access

Abstract

We study the convergence of model-based policy gradient for the deterministic, scalar, discounted linear-quadratic regulator when the controller is an overparameterized one-hidden-layer ReLU network without biases. Although the optimal LQR controller is linear, neural parameterization creates a redundant nonconvex weight space with a possibly asymmetric piecewise-linear controller. We show that this structure can still be analyzed exactly through the two effective gains induced on the positive and negative half-lines. Under suitable random initialization, sufficient width, and a small step size, the model-based policy gradient remains stable, decreases the cost geometrically, and drives the effective gains to the unique optimal scalar LQR gain with high probability.
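
The "two effective gains" mentioned in the abstract can be illustrated with a minimal sketch. It assumes a bias-free controller of the form u(x) = sum_i v_i * relu(w_i * x) and the sign convention u = k x; the names `w`, `v`, `effective_gains`, and `rollout_cost`, the initialization scale, and the system parameters `a`, `b`, `q`, `r`, `gamma` are hypothetical choices for illustration, not values or notation taken from the paper.

```python
import numpy as np

def relu_controller(x, w, v):
    """Bias-free one-hidden-layer ReLU controller: u = sum_i v_i * relu(w_i * x)."""
    return float(v @ np.maximum(w * x, 0.0))

def effective_gains(w, v):
    """Return (k_pos, k_neg) such that u = k_pos * x for x > 0 and u = k_neg * x for x < 0.

    For x > 0 only units with w_i > 0 are active; for x < 0 only units with w_i < 0.
    """
    k_pos = float(np.sum(v[w > 0] * w[w > 0]))
    k_neg = float(np.sum(v[w < 0] * w[w < 0]))
    return k_pos, k_neg

def rollout_cost(x0, w, v, a=0.9, b=0.5, q=1.0, r=0.1, gamma=0.95, horizon=200):
    """Truncated discounted cost of x_{t+1} = a x_t + b u_t (illustrative parameters)."""
    x, cost = x0, 0.0
    for t in range(horizon):
        u = relu_controller(x, w, v)
        cost += gamma**t * (q * x * x + r * u * u)
        x = a * x + b * u
    return cost

# Hypothetical random initialization of an overparameterized network.
rng = np.random.default_rng(0)
width = 64
w = rng.normal(size=width)
v = rng.normal(size=width) / width

k_pos, k_neg = effective_gains(w, v)
print("effective gains:", k_pos, k_neg)
print("truncated cost from x0 = 1:", rollout_cost(1.0, w, v))
```

Although the network has 2 × `width` weights, its input-output behavior in this sketch is fully described by the pair (`k_pos`, `k_neg`), which is why the redundant weight space can be studied through two scalars. The controller is linear exactly when the two gains coincide.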

Keywords

math.OC, eess.SY
