
Showing Your Offline Reinforcement Learning Work: Online Evaluation Budget Matters

Vladislav Kurenkov, S. V. Kolesnikov

Year: 2021
Citations: 2
Access: Open access

Abstract

In this work, we argue for the importance of an online evaluation budget for a reliable comparison of deep offline RL algorithms. First, we show that the online evaluation budget is problem-dependent: some problems permit fewer online evaluations, others more. Second, we demonstrate that the preference between algorithms is budget-dependent across a diverse range of decision-making domains such as Robotics, Finance, and Energy Management. Following the points above, we suggest reporting the performance of deep offline RL algorithms under varying online evaluation budgets. To facilitate this, we propose to use a reporting tool from the NLP field, Expected Validation Performance. This technique makes it possible to reliably estimate expected maximum performance under different budgets while not requiring any additional computation beyond hyperparameter search. By employing this tool, we also show that Behavioral Cloning is often preferable to offline RL algorithms when working within a limited budget.
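As a rough illustration of how Expected Validation Performance can be computed from an existing hyperparameter search, the sketch below uses the with-replacement estimator commonly associated with this tool in the NLP literature: sort the observed scores and weight each one by the probability that it is the maximum of `budget` independent draws from the empirical distribution. The function name `expected_max_performance`, the interpretation of the budget as the number of trained policies evaluated online, and the use of NumPy are assumptions made for this example; this is not the authors' code.

```python
import numpy as np


def expected_max_performance(scores, budget):
    """Expected best score among `budget` draws (with replacement)
    from the empirical distribution of `scores`.

    `scores` holds one online-evaluation score per hyperparameter
    configuration (an assumed setup for this sketch).
    """
    v = np.sort(np.asarray(scores, dtype=float))  # ascending order
    n = v.size
    cdf_at = np.arange(1, n + 1) / n   # P(draw <= i-th smallest score)
    cdf_below = np.arange(0, n) / n    # P(draw <= (i-1)-th smallest score)
    # Probability that the i-th smallest score is the maximum of `budget` draws.
    prob_is_max = cdf_at ** budget - cdf_below ** budget
    return float(np.sum(v * prob_is_max))


def expected_max_curve(scores, max_budget):
    """Expected maximum performance for every budget from 1 to `max_budget`."""
    return [expected_max_performance(scores, k) for k in range(1, max_budget + 1)]
```

With a budget of 1 the estimate reduces to the mean score, and it approaches the best observed score as the budget grows. Computing `expected_max_curve` for each algorithm from its own search results gives curves that can be compared budget by budget, using only the scores already collected during hyperparameter search.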

Keywords

Hyperparameter, Reinforcement learning, Computer science, Artificial intelligence, Machine learning, Online and offline, Range (aeronautics), Field (mathematics), Preference, Computation
