Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach

Yildiz, Sedat; ŞAHİNASLAN, ÖNDER; BORANDAĞ, EMİN; YÜCALAR, FATİH

doi:10.38088/jise.1753889

Yildiz S. A., ŞAHİNASLAN Ö., BORANDAĞ E., YÜCALAR F.

Journal of Innovative Science and Engineering (JISE), vol. 10, no. 1, pp. 138-157, 2026 (TRDizin)

Publication Type: Article / Full Article
Volume: 10, Issue: 1
Publication Date: 2026
DOI: 10.38088/jise.1753889
Journal Name: Journal of Innovative Science and Engineering (JISE)
Indexes Covering the Journal: TR DİZİN (ULAKBİM)
Pages: pp. 138-157
Affiliated with Manisa Celal Bayar University: Yes

Abstract

This study systematically compares the performance of two deep reinforcement learning algorithms, Proximal Policy Optimization (PPO) and Deep Q-Network (DQN), across different game environments. Eight test environments from the Gymnasium library (the maintained successor to OpenAI Gym) were used: CartPole-v1, FrozenLake-v1, LunarLander-v3, Taxi-v3, MountainCar-v0, Blackjack-v1, CliffWalking-v0, and Acrobot-v1. Training in each environment ran for 1,000,000 timesteps. For each algorithm, key performance metrics (average reward, training time, standard deviation, success rate, and the highest and lowest reward values) were calculated and plotted, and the strengths and weaknesses of the algorithms in the different environments were analyzed. The results indicate that PPO performs more consistently and effectively in tasks requiring continuous actions, whereas DQN achieves faster and more reliable outcomes in deterministic environments with discrete action spaces. Whereas most prior research has examined these algorithms separately, this study compares PPO and DQN under identical conditions.
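The reward metrics listed in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code: the names `EvalSummary` and `summarize_episodes` are hypothetical, and because the abstract does not define success rate, the sketch assumes an episode counts as a success when its total reward reaches a caller-chosen threshold. It also assumes a population standard deviation. Training time is omitted because it is measured during training, not derived from episode rewards.

```python
from dataclasses import dataclass
from statistics import mean, pstdev


@dataclass
class EvalSummary:
    """Reward statistics for one algorithm in one environment (hypothetical container)."""
    mean_reward: float
    std_reward: float
    min_reward: float
    max_reward: float
    success_rate: float


def summarize_episodes(episode_rewards, success_threshold):
    """Summarize a list of per-episode total rewards.

    Assumption: an episode is a "success" when its total reward is at least
    `success_threshold`; the abstract does not state the actual criterion.
    """
    if not episode_rewards:
        raise ValueError("need at least one episode reward")
    successes = sum(1 for r in episode_rewards if r >= success_threshold)
    return EvalSummary(
        mean_reward=mean(episode_rewards),
        std_reward=pstdev(episode_rewards),  # population standard deviation (assumed)
        min_reward=min(episode_rewards),
        max_reward=max(episode_rewards),
        success_rate=successes / len(episode_rewards),
    )


if __name__ == "__main__":
    # Made-up episode rewards for illustration only; these are not results from the study.
    example_rewards = [200.0, 150.0, 90.0, 200.0]
    print(summarize_episodes(example_rewards, success_threshold=195.0))
```

Applying the same function to the evaluation episodes of both algorithms in every environment would give one comparable summary per algorithm and environment pair, which is the kind of side-by-side comparison the abstract describes.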