Deep Reinforcement Learning: A Chronological Overview and Methods
Juan Terven
- 发表年份
- 2025
- 引用次数
- 49
- 访问权限
- 开放获取
摘要
Introduction: Deep reinforcement learning (deep RL) integrates the principles of reinforcement learning with deep neural networks, enabling agents to excel in diverse tasks ranging from playing board games such as Go and Chess to controlling robotic systems and autonomous vehicles. By leveraging foundational concepts of value functions, policy optimization, and temporal difference methods, deep RL has rapidly evolved and found applications in areas such as gaming, robotics, finance, and healthcare. Objective: This paper seeks to provide a comprehensive yet accessible overview of the evolution of deep RL and its leading algorithms. It aims to serve both as an introduction for newcomers to the field and as a practical guide for those seeking to select the most appropriate methods for specific problem domains. Methods: We begin by outlining fundamental reinforcement learning principles, followed by an exploration of early tabular Q-learning methods. We then trace the historical development of deep RL, highlighting key milestones such as the advent of deep Q-networks (DQN). The survey extends to policy gradient methods, actor–critic architectures, and state-of-the-art algorithms such as proximal policy optimization, soft actor–critic, and emerging model-based approaches. Throughout, we discuss the current challenges facing deep RL, including issues of sample efficiency, interpretability, and safety, as well as open research questions involving large-scale training, hierarchical architectures, and multi-task learning. Results: Our analysis demonstrates how critical breakthroughs have driven deep RL into increasingly complex application domains. We highlight existing limitations and ongoing bottlenecks, such as high data requirements and the need for more transparent, ethically aligned systems. Finally, we survey potential future directions, highlighting the importance of reliability and ethical considerations for real-world deployments.
关键词
相关论文
Statistical Learning Theory
Yuhai Wu, Vladimir Vapnik
1999
Artificial intelligence: a modern approach
1995
Applied Nonlinear Control
Jean-Jacques Slotine, Weiping Li
1991
A new optimizer using particle swarm theory
R.C. Eberhart, James Kennedy
2002