Reinforcement Learning Explorations

View on GitHub

As mentioned in my post about fiddling with Blender, I spent winter break experimenting with new hobbies while home. Given how important reinforcement learning is for robotics right now, I decided it would be a great area to explore, and I started by reading Sutton and Barto's textbook.

After a little bit of reading, I decided that a project around my favorite casual game, Snake, would work pretty well. It has some history in reinforcement learning, it's easy to use as a platform for experiments, and its game structure lends itself to reward shaping.
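To give a sense of what I mean by reward shaping here, a minimal sketch of a Snake reward scheme might look like the following. The function name and the specific values are illustrative, not the exact ones I ended up using:

```python
# Hypothetical shaped reward for Snake (values are illustrative).
def reward(ate_food: bool, died: bool) -> float:
    """Food is good, dying is bad, and idling costs a little."""
    if died:
        return -1.0   # hitting a wall or the snake's own body ends the episode
    if ate_food:
        return 1.0    # eating the food grows the snake and scores a point
    return -0.01      # small step penalty discourages wandering forever
```

The small per-step penalty is the part that makes the game structure convenient: without it, an agent can happily circle the board forever without ever seeking food.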

I'm currently running training overnight on my ZBook at home, using tmux so that long simulations keep running after I disconnect. The ZBook can handle much more than my Mac, so I mostly use it for simulations and other computation-heavy tasks. I had a basic tabular Q-learning agent set up, but that approach quickly becomes unusable as the state space grows. My next step is looking into a DQN-based agent with a fixed game board size.
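For context, the core of a tabular Q-learning agent is only a few lines; this is a sketch under my own naming assumptions, not the actual project code. The scaling problem is visible in the data structure: the table is keyed by the full board state, and the number of reachable Snake states explodes with board size.

```python
import random
from collections import defaultdict

# Minimal tabular Q-learning sketch (names and hyperparameters are illustrative).
ALPHA, GAMMA, EPSILON = 0.1, 0.9, 0.1
ACTIONS = ["up", "down", "left", "right"]
Q = defaultdict(float)  # maps (state, action) -> estimated return

def choose_action(state):
    """Epsilon-greedy action selection over the Q-table."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One Q-learning backup: move Q(s,a) toward r + gamma * max_a' Q(s',a')."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    Q[(state, action)] += ALPHA * (reward + GAMMA * best_next - Q[(state, action)])
```

A DQN replaces the table with a network that generalizes across similar board states, which is why it's the natural next step once the table stops being practical.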

I'm able to train agents that perform moderately well, but none that come close to finishing the game. I'm currently on my Mac so I don't have access to the logs, but I'll update this later, probably with a GIF.

My next steps include implementing a DDQN agent, changing the internal representation so I can train on larger maps without running a whole new training session, experimenting with the horizon of the value function, and maybe a few other things.
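On the DDQN point: the change from DQN is small but targeted at the short-sighted, self-trapping behavior that comes partly from overestimated values. A sketch of the Double DQN target computation, assuming batched NumPy arrays of next-state Q-values from the online and target networks (all names here are my own, not from any particular library):

```python
import numpy as np

GAMMA = 0.99  # illustrative discount factor

def ddqn_targets(rewards, dones, q_online_next, q_target_next):
    """Double DQN bootstrap targets.

    rewards, dones: shape (batch,); q_*_next: shape (batch, n_actions).
    Vanilla DQN uses max_a Q_target(s', a), which overestimates; DDQN
    selects the action with the online network but evaluates it with
    the target network.
    """
    best_actions = np.argmax(q_online_next, axis=1)                   # select with online net
    evaluated = q_target_next[np.arange(len(rewards)), best_actions]  # evaluate with target net
    return rewards + GAMMA * (1.0 - dones) * evaluated                # no bootstrap at terminals
```

Everything else in the training loop (replay buffer, periodic target-network sync) stays the same as plain DQN.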

RL Model on a 7x7 Board

Above is a DQN agent trained on a 7x7 board for 50,000 episodes. It seems to be a little short-sighted, often trapping itself.