Pac-Man AI



For our final project in my CSE 440 class (Artificial Intelligence), we were required to write an artificial intelligence agent that learned to play a game of our choice. I vacillated for a while. Most people in my class stuck with what they knew: Hunt the Wumpus. As much as I had grown (been forced) to love that game, it had gotten old, and I wanted to do some AI work that didn't involve trying to figure out what I'd written the night before in LISP (if you've ever programmed in LISP, you know what I mean). I wanted to do some sort of "classic" game, and Namco's "Pac-Man" seemed pretty fitting (not to mention that it's one of my favorite games of all time).

I figured I could use the open-sourced MAME (Multiple Arcade Machine Emulator) to suit my needs. By reading the positions of the monsters and Pac-Man in the maze from the Sprite RAM, and getting the maze and dot information from the Character RAM, my agent could acquire all the knowledge it would need to learn to play the game. After searching the post-apocalyptic wasteland of the Internet, I stumbled across Pac-Tape, which proved quite useful in figuring out how to read the aforementioned game state. Pac-Tape is a similar sort of endeavor, but it's basically a brute-force search of the entire gamespace rather than actual AI learning, though later versions did use some sort of learning algorithm. I also found Pac-Man Autoplayer, an attempt to use genetic algorithms to learn to play. This didn't seem like a viable method to me, and indeed, upon reading the author's results, it seems it was unsuccessful.

So, I spent my Thanksgiving break thinking about how to go about creating an AI that would learn to play Pac-Man. I came up with the idea of keeping an array of "conditions," or situations my Pac-Man agent might find itself in while playing the game. For example, Pac-Man might be eating a dot, with a ghost 5 positions ahead of him, heading toward him. This "condition" would be given a value based on how "deadly" it proves to be -- the more often Pac-Man dies when he's in this situation, the higher the value. I eventually came up with a number of binary components that form an index into this array of conditions. These are:

  1. There are dots in front of Pac-Man.
  2. Pac-Man is facing a ghost, and the ghost is heading toward Pac-Man.
  3. Pac-Man is facing a ghost, and the ghost is heading away from Pac-Man.
  4. Pac-Man is not facing a ghost, and the ghost is heading toward Pac-Man.
  5. Pac-Man is not facing a ghost, and the ghost is heading away from Pac-Man.
  6. There is an "exit" (i.e., an alternate path) between Pac-Man and the closest ghost.
  7. An energizer is active.
  8. Pac-Man is in the "tunnel."

In addition to the above 8, a "ghost distance" factor is added in, to differentiate between a ghost being right next to Pac-Man and a ghost being halfway across the screen. So, for every step the AI agent takes, it figures out exactly which of the 8 components are true and ORs them together (each component contributing one bit), and the result is an index into the condition array. If Pac-Man dies before reaching the next intersection in the maze, the value of this condition is incremented; if Pac-Man survives, it is decremented. After going through an exploratory period, the AI agent uses this condition array to decide what move to take at intersections, and whether or not to double back in a potentially dangerous situation. I later learned that this algorithm is called Q-Learning, and that these "condition" values are called "Q-Values." Now I know, and knowing is half the battle. I guess.

The goal of my agent was to learn to successfully complete a maze (i.e., eat all the dots in the maze) without getting a "GAME OVER." As such, it is in no way concerned with score. I feel that my agent was at least partly successful, though I believe that Q-Learning is not a viable method to truly "learn" to play Pac-Man. My agent, given a few hours of runtime, can learn enough to successfully complete a maze at least half the time. The major problem with using Q-Learning like this is that the agent considers only the immediate future when making decisions. Additionally, in order to play Pac-Man well, one must learn to anticipate how the ghosts will move, something my agent is incapable of doing.



Here's the current version of the source code, including all the MAME stuff. It should work, though the algorithm currently isn't performing as well as I'd like (I've been experimenting with various things). My source code is in /src/drivers/pacai.h and /src/drivers/pacai.c. It's rather large. In order to actually run it, you'll need the Pac-Man ROMs, which you'll have to find elsewhere. Note also that this is set to compile under Linux. To compile under another UNIX, edit the Makefile per the documentation in the "doc" directory. Once you've compiled it and put the ROMs where they need to go, you can just run it without any arguments and it'll start up automatically (since MAME's default game is Pac-Man). I recommend using F10 to disable speed-limiting and letting it run while you go out to dinner or something, since it needs to play a few hundred games to get the hang of it. Anyway, to download, click the big ol' link below.

Download Pac-Man AI


Future Plans

I'd like to work on getting the agent to pay attention to the patterns the ghosts follow, based on how Pac-Man moves. This is really the key to the game: knowing exactly how the ghosts react, since Pac-Man is 100% deterministic. This, however, is a very difficult problem to solve. How do I tell where one pattern begins and another ends? Which parts of a pattern are "important" (i.e., actually influence the game)? How do I work pattern recognition into the agent?

Return to Projects