Misha Obukhov
Single and Multi-Agent Reinforcement Learning in the Classic Snake Game Environment
Reinforcement learning has previously been shown to achieve human-level or superior performance on various tasks within well-defined environments. Here, we investigate the results of applying Proximal Policy Optimization (PPO, 2017) to train a network to play the classic Snake game. Additionally, we examine the effect of modifying the environment so that multiple agents can be trained in parallel in zero-sum competition. We demonstrate that the algorithm consistently solves the standard single-agent case and find that multi-agent convergence in a competitive environment can be achieved on a comparable training timescale. Furthermore, our data reproduce the classic machine learning result that fewer neurons can lead to more effective generalization and better performance in some circumstances.