Misha Obukhov

UC Santa Barbara
Computer Engineering

Single and Multi-Agent Reinforcement Learning in the Classic Snake Game Environment

Reinforcement learning has been shown to achieve human-level or superior performance on a variety of tasks within well-defined environments. Here, we apply Proximal Policy Optimization (PPO, 2017) to train a network to play the classic Snake game. Additionally, we modify the environment so that multiple agents can be trained in parallel in zero-sum competition. We demonstrate that the algorithm consistently solves the standard single-agent case and find that multi-agent convergence in a competitive environment can be achieved on a comparable training timescale. Furthermore, we use our data to reiterate the classic machine learning result that fewer neurons can lead to more effective generalization and better performance in some circumstances.

UC Santa Barbara Center for Science and Engineering Partnerships
UCSB California NanoSystems Institute