Advancements in Deep Reinforcement Learning: A Comprehensive Survey on Policy Optimization Techniques
DOI:
https://doi.org/10.63282/3050-9246.IJETCSIT-V1I2P101

Keywords:
Deep Reinforcement Learning, Policy Optimization, Policy Gradient Methods, Actor-Critic, Trust Region Methods, Proximal Policy Optimization, Model-Based Reinforcement Learning, Sample Efficiency, Exploration Strategies, Generalization in DRL

Abstract
Deep Reinforcement Learning (DRL) has emerged as a powerful paradigm for solving complex decision-making problems across various domains, including robotics, gaming, and autonomous systems. At the core of DRL lies the optimization of policies that map states to actions, enabling agents to learn optimal behaviors through interaction with their environment. This paper provides a comprehensive survey of recent advancements in policy optimization techniques in DRL. We categorize and discuss the key methods, including policy gradient methods, actor-critic algorithms, and model-based approaches. We also explore the challenges and future directions in the field, highlighting the integration of DRL with other machine learning techniques and the application of DRL in real-world scenarios. The paper aims to serve as a valuable resource for researchers and practitioners interested in the latest developments in DRL.
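To make the core idea of policy optimization concrete, the sketch below shows a minimal REINFORCE-style policy gradient update in Python. It is an illustrative toy example rather than code from any method reviewed here; the three-action bandit environment, reward means, and learning rate are assumptions chosen purely for demonstration.

```python
# Minimal REINFORCE-style policy gradient sketch (illustrative toy example).
# A softmax policy over 3 actions is trained on a hypothetical one-step
# environment where action 2 has the highest expected reward.
import numpy as np

rng = np.random.default_rng(0)
theta = np.zeros(3)                       # policy parameters (one logit per action)
true_means = np.array([0.1, 0.5, 0.9])    # assumed per-action reward means (toy setting)
alpha = 0.1                               # learning rate (assumed)

def softmax(x):
    z = x - x.max()                       # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

for step in range(2000):
    probs = softmax(theta)
    a = rng.choice(3, p=probs)                        # sample an action from the policy
    r = true_means[a] + 0.1 * rng.standard_normal()   # observe a noisy reward
    grad_log_pi = -probs                              # grad of log pi(a|theta) for a softmax policy
    grad_log_pi[a] += 1.0
    theta += alpha * r * grad_log_pi                  # REINFORCE update: ascend expected reward

print("learned action probabilities:", softmax(theta).round(3))
```

Actor-critic and trust-region variants discussed in the survey refine this same update, for example by replacing the raw reward with a learned baseline or advantage estimate, or by constraining how far a single step may move the policy.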