Research Article

Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach

Volume: 10 Number: 1 April 11, 2026

Abstract

This study systematically compares the performance of two deep reinforcement learning algorithms – Proximal Policy Optimization (PPO) and Deep Q-Network (DQN) – across different game environments. To achieve this, eight distinct test environments from the OpenAI Gymnasium library (CartPole-v1, FrozenLake-v1, LunarLander-v3, Taxi-v3, MountainCar-v0, Blackjack-v1, CliffWalking-v0, and Acrobot-v1) were utilized. Both algorithms were trained for 1,000,000 timesteps in each environment. For each algorithm, key performance metrics such as average reward, training time, standard deviation, success rate, and the highest and lowest reward values were calculated and visualized through graphs. Additionally, the strengths and weaknesses of the algorithms in different environments were analyzed. The results indicate that PPO performs more consistently and effectively in tasks requiring continuous actions, whereas DQN achieves faster and more reliable outcomes in deterministic environments with discrete action spaces. Whereas most prior research has examined these algorithms separately, this study provides meaningful insights by comparing PPO and DQN under identical conditions.
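The evaluation metrics named in the abstract (average reward, standard deviation, success rate, and the highest and lowest rewards) can be sketched as a small helper over a list of per-episode returns. This is an illustrative sketch only: the function name `summarize_episodes`, the sample rewards, and the success threshold are assumptions for demonstration, not values taken from the study.

```python
import statistics

def summarize_episodes(rewards, success_threshold):
    """Summarize per-episode rewards with the metrics used in the study:
    mean, standard deviation, success rate, and best/worst reward."""
    n = len(rewards)
    return {
        "average_reward": sum(rewards) / n,
        # Population standard deviation over the evaluated episodes.
        "std_dev": statistics.pstdev(rewards),
        # Fraction of episodes whose return reached the threshold.
        "success_rate": sum(r >= success_threshold for r in rewards) / n,
        "highest_reward": max(rewards),
        "lowest_reward": min(rewards),
    }

# Hypothetical CartPole-v1 evaluation returns; the v1 task is commonly
# considered solved at an average return of 475 or more.
episode_rewards = [500.0, 475.0, 430.0, 500.0]
print(summarize_episodes(episode_rewards, success_threshold=475.0))
```

Whether "success" means reaching a reward threshold, as assumed here, or reaching a terminal goal state (as in FrozenLake-v1 or Taxi-v3) depends on the environment, so the threshold would be chosen per task.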


Details

Primary Language

English

Subjects

Reinforcement Learning

Journal Section

Research Article

Publication Date

April 11, 2026

Submission Date

July 29, 2025

Acceptance Date

November 8, 2025

Published in Issue

Year 2026 Volume: 10 Number: 1

APA
Yildiz, S. A., Şahinaslan, Ö., Borandag, E., & Yücalar, F. (2026). Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach. Journal of Innovative Science and Engineering, 10(1), 138-157. https://doi.org/10.38088/jise.1753889
AMA
1. Yildiz SA, Şahinaslan Ö, Borandag E, Yücalar F. Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach. JISE. 2026;10(1):138-157. doi:10.38088/jise.1753889
Chicago
Yildiz, Sedat Atakan, Önder Şahinaslan, Emin Borandag, and Fatih Yücalar. 2026. “Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach”. Journal of Innovative Science and Engineering 10 (1): 138-57. https://doi.org/10.38088/jise.1753889.
EndNote
Yildiz SA, Şahinaslan Ö, Borandag E, Yücalar F (April 1, 2026) Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach. Journal of Innovative Science and Engineering 10 1 138–157.
IEEE
[1] S. A. Yildiz, Ö. Şahinaslan, E. Borandag, and F. Yücalar, “Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach”, JISE, vol. 10, no. 1, pp. 138–157, Apr. 2026, doi: 10.38088/jise.1753889.
ISNAD
Yildiz, Sedat Atakan - Şahinaslan, Önder - Borandag, Emin - Yücalar, Fatih. “Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach”. Journal of Innovative Science and Engineering 10/1 (April 1, 2026): 138-157. https://doi.org/10.38088/jise.1753889.
JAMA
1. Yildiz SA, Şahinaslan Ö, Borandag E, Yücalar F. Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach. JISE. 2026;10:138–157.
MLA
Yildiz, Sedat Atakan, et al. “Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach”. Journal of Innovative Science and Engineering, vol. 10, no. 1, Apr. 2026, pp. 138-57, doi:10.38088/jise.1753889.
Vancouver
1. Sedat Atakan Yildiz, Önder Şahinaslan, Emin Borandag, Fatih Yücalar. Comparing the Performance of PPO and DQN Algorithms in Different Game Environments Using a Reinforcement Learning Approach. JISE. 2026 Apr. 1;10(1):138-57. doi:10.38088/jise.1753889


Creative Commons License

The works published in the Journal of Innovative Science and Engineering (JISE) are licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.