$O(1/k)$ Finite-Time Bound for Non-Linear Two-Time-Scale Stochastic Approximation

Siddharth Chandak

Year: 2025
Access: Open access

Abstract

Two-time-scale stochastic approximation (SA) is an algorithm with coupled iterations that has found broad applications in reinforcement learning, optimization, and game control. In this work, we derive mean squared error bounds for non-linear two-time-scale iterations with contractive mappings. In the setting where both stepsizes are of order $\Theta(1/k)$, commonly referred to as single time-scale SA with multiple coupled sequences, we obtain the first $O(1/k)$ rate without imposing additional smoothness assumptions. In the setting with true time-scale separation, the previous best bound was $O(1/k^{2/3})$. We improve this to $O(1/k^a)$ for any $a<1$, approaching the optimal $O(1/k)$ rate. The key step in our analysis involves rewriting the original iteration in terms of an averaged noise sequence whose variance decays sufficiently fast. Additionally, we use an induction-based approach to show that the iterates are bounded in expectation. Our results apply to Polyak averaging, as well as to algorithms from reinforcement learning and optimization, including gradient descent-ascent and two-time-scale Lagrangian optimization.
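
To make the coupled iteration concrete, the following is a minimal simulation sketch and not the paper's method or experiments. It assumes the common update form x_{k+1} = x_k + alpha_k (f(x_k, y_k) - x_k + noise) and y_{k+1} = y_k + beta_k (g(x_k, y_k) - y_k + noise), with stepsizes alpha_k = c/(k + k0)^fast_exp and beta_k = c/(k + k0)^slow_exp. The linear maps `f` and `g`, the constants `c` and `k0`, the Gaussian noise, and all function names are hypothetical choices made only for illustration.

```python
import random


# Hypothetical linear maps chosen for illustration (not taken from the paper).
# f is a contraction in x and g is a contraction in y, both with factor 0.5.
def f(x, y):
    return 0.5 * x + 0.2 * y + 1.0


def g(x, y):
    return 0.3 * x + 0.5 * y + 0.5


# Joint fixed point of x = f(x, y) and y = g(x, y), solved by hand.
X_STAR, Y_STAR = 60.0 / 19.0, 55.0 / 19.0


def simulate(num_steps, fast_exp=1.0, slow_exp=1.0, c=4.0, k0=10.0,
             noise_std=0.5, seed=0):
    """Run the assumed two-time-scale iteration and return the final (x, y).

    fast_exp == slow_exp == 1.0 gives stepsizes of order 1/k for both
    sequences; fast_exp < slow_exp makes x move on a faster time scale.
    """
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    for k in range(num_steps):
        alpha = c / (k + k0) ** fast_exp  # stepsize for the x iterate
        beta = c / (k + k0) ** slow_exp   # stepsize for the y iterate
        noise_x = rng.gauss(0.0, noise_std)  # zero-mean noise (assumed i.i.d.)
        noise_y = rng.gauss(0.0, noise_std)
        # Both updates use the previous (x, y), so compute them before assigning.
        x_new = x + alpha * (f(x, y) - x + noise_x)
        y_new = y + beta * (g(x, y) - y + noise_y)
        x, y = x_new, y_new
    return x, y
```

Calling `simulate(20000)` runs the setting where both stepsizes are of order 1/k, and `simulate(20000, fast_exp=0.9)` introduces a separation between the two time scales; in either case the returned pair can be compared against `(X_STAR, Y_STAR)` to inspect the squared error for a given run length.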

Keywords

cs.LG, eess.SY, math.OC, stat.ML
