An Error Bound for Aggregation in Approximate Dynamic Programming

Yuchao Li, Dimitri Bertsekas

Year: 2025
Access: Open access

Abstract

We consider a general aggregation framework for discounted finite-state infinite horizon dynamic programming (DP) problems. It defines an aggregate problem whose optimal cost function can be obtained off-line by exact DP and then used as a terminal cost approximation for an on-line reinforcement learning (RL) scheme. We derive a bound on the error between the optimal cost functions of the aggregate problem and the original problem. This bound was first derived by Tsitsiklis and van Roy [TvR96] for the special case of hard aggregation. Our bound is similar but applies far more broadly, including to soft aggregation and feature-based aggregation schemes.
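To make the setup concrete, the following is a minimal sketch of hard aggregation on a toy problem, not the authors' implementation. It assumes one common formulation of the aggregate problem, in which disaggregation probabilities D[x][i] and aggregation probabilities Phi[j][y] link aggregate states to original states. The 4-state, 2-control problem, its numbers, and all names (P, G, D_HARD, PHI_HARD, solve_original, solve_aggregate) are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: the problem data and all names are assumptions.
ALPHA = 0.9  # discount factor

# P[u][i][j]: probability of moving from state i to state j under control u.
P = [
    [[0.7, 0.3, 0.0, 0.0],
     [0.2, 0.6, 0.2, 0.0],
     [0.0, 0.3, 0.5, 0.2],
     [0.0, 0.0, 0.4, 0.6]],
    [[0.1, 0.9, 0.0, 0.0],
     [0.0, 0.1, 0.9, 0.0],
     [0.0, 0.0, 0.1, 0.9],
     [0.5, 0.0, 0.0, 0.5]],
]
# G[u][i]: expected stage cost at state i under control u.
G = [[1.0, 1.2, 3.0, 3.3],
     [2.0, 2.1, 1.0, 0.8]]
N = len(G[0])  # number of original states


def bellman(J, i):
    """One-step lookahead value at state i with terminal cost J."""
    return min(
        G[u][i] + ALPHA * sum(P[u][i][j] * J[j] for j in range(N))
        for u in range(len(P))
    )


def solve_original(iters=2000):
    """Value iteration for the optimal cost function of the original problem."""
    J = [0.0] * N
    for _ in range(iters):
        J = [bellman(J, i) for i in range(N)]
    return J


def solve_aggregate(D, Phi, iters=2000):
    """Value iteration on the aggregate problem.

    D[x][i]  : disaggregation probabilities (each row sums to 1).
    Phi[j][y]: aggregation probabilities (each row sums to 1).
    Returns the aggregate costs r and the induced approximation J_tilde.
    """
    m = len(D)
    r = [0.0] * m
    for _ in range(iters):
        J_tilde = [sum(Phi[j][y] * r[y] for y in range(m)) for j in range(N)]
        r = [sum(D[x][i] * bellman(J_tilde, i) for i in range(N)) for x in range(m)]
    J_tilde = [sum(Phi[j][y] * r[y] for y in range(m)) for j in range(N)]
    return r, J_tilde


# Hard aggregation: states {0, 1} form aggregate state 0, states {2, 3} form 1.
D_HARD = [[0.5, 0.5, 0.0, 0.0],
          [0.0, 0.0, 0.5, 0.5]]
PHI_HARD = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]

if __name__ == "__main__":
    J_star = solve_original()
    r, J_tilde = solve_aggregate(D_HARD, PHI_HARD)
    err = max(abs(a - b) for a, b in zip(J_star, J_tilde))
    print("aggregate costs:", r)
    print("max |J* - J_tilde|:", err)
```

The printed gap is the kind of quantity the error bound controls. In this sketch, grouping states with similar optimal costs should shrink the gap, and with one aggregate state per original state the aggregate problem coincides with the original one.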

Keywords

math.OC, eess.SY
