Applying Artificial Intelligence to Famicom Puyo Puyo
[ About | Videos | Download | Run | Details ]
The original Puyo Puyo, released for the Famicom in 1991, did not provide an AI opponent. To fill that gap, this article provides a Java plug-in for the Nintaco NES / Famicom emulator. In 1 Player Endless Mode, the plug-in plays perpetually. In 2 Player Mode, it serves as a digital adversary. Or, it can be configured to launch two AI's to battle against each other.
They call it "Endless Mode", but the score will eventually max out at 99,999,990:
Check out this epic AI vs. AI battle:
The .zip contains:
To set the speed, first, press the Stop button. In the Arguments field, enter a numerical speed value from 0 (slowest) to 10 (fastest). Then, press Run again. Select a mode from the Puyo Puyo main menu to return control to the AI.
To watch it play even faster, select Machine | Speed | Max.
To have 2 AI's battle it out, first, resume normal speed (Machine | Speed | Normal) and reset the game (Machine | Reset). Next, press the Stop button and in the Arguments field, enter the single word: AIvsAI. Finally, press Run and select 2 Player Mode.
A speed value can also be provided by separating the arguments with a space (e.g. 5 AIvsAI).
The plug-in uses the Nintaco API to manipulate CPU Memory and to receive frame rendered events. All memory addresses were discovered through exploration with the Nintaco Hex Editor and the information has been added to the Data Crystal ROMhacking.net wiki. The plug-in uses a particular set of memory addresses determined by the current mode in which it is operating. The Addresses interface represents a potential set and Player1Addresses, Player2Addresses and Player12Addresses are the implementations for 1 Player Endless Mode, 2 Player Mode and player 1 in AI vs. AI mode, respectively.
When a piece spawns, the program examines every possible lock combination of the current piece and the next piece. A valid lock position of a single piece is a position in which one or both of the Puyo in the pair is supported by the playfield floor or by another Puyo. All of the possible lock positions of a single piece are computed using breath-first search (see Searcher).
Locking a piece into the playfield has consequences: unsupported Puyo drop, connected groups of 4 or more identical Puyo are removed, and that process may repeat chain-reaction style. For each valid lock position of the current piece and its associated playfield consequences, the program tries each valid lock position of the next piece, evaluating the combined consequences. This chain of searches is represented by SearchChain.
Each of the combined consequences is fed to an evaluation function, which scores the contents of the playfield. The combo with the highest score wins out and the current piece is moved into position accordingly. The results of the search chain only affect the current piece. When the next piece spawns, it is paired up with the next-next piece and so on.
The evaluation function scores a playfield using a series of metrics:
These metrics are combined into a single number via a weighted sum (see EvaluationFunction).
To optimize the weights used in the evaluation function, a Trainer was created that executes the following steps:
The idea falls somewhere in between Hill Climbing and a Genetic Algorithm. Better competitors have a higher probability of survival. And, the losers are replaced with offspring that mostly resemble the winners with slight mutations.
Here are the results, rounded to nearest integer percents:
|# Colored Puyo||16|
|# Nuisance Puyo||25|
|# Linked Puyo||25|
Each metric is normalized to a number in between 0 and 1 in attempt to make them comparable. And, if they are, then the most significant metrics appear to be the number of Nuisance Puyo and the number of linked Puyo, followed by the number of other Puyo and the total consecutive colors.
Maxing out the score aside, Endless Mode really does appear to be endless. How far can the AI really go?
Similar to the Trainer, it is possible to run the AI at high speed as a single player outside of Nintaco to push it to the limit. However, doing so reveals that the game still never ends.
Instead, as the AI played through 50 million pieces, the number of Puyo in the playfield was recorded after each lock (see SelfPlayer). The data shows some interesting things. First, the AI cleared the playfield 3,326 times, each effectively equivalent to restarting the game. Next, it never had more than 37 Puyo on playfield; meaning, it never even occupied half of the available space; and, it only hit that number once. Most of the time was spent around 10 Puyo on the playfield.
Below, the results are plotted as a probability curve. The x-axis is the number of Puyo in the playfield and the y-axis is the probably of finding the AI in that state at any given moment.
It resembles a bell curve. The Excel Solver worked out constants for a Gaussian fit:
Assuming that the AI won't give up until the playfield is completely full, then the game won't end until x is 84. According to the Gaussian, the probability of that happening is 1 in 1089. Of course, as the playfield runs out of room, its behavior may radically change. Instead, assume that the AI just spontaneously gives up if it were to inadvertently fill up half of the playfield, where x is 42. The Gaussian says the odds of that happening is 1 in 1017. With either assumption, it sounds like Endless Mode truly does deserves that title.
It should also be pointed out that the AI handles the special items. The Big Puyo is capable of clearing 2 full columns. And, the Carbuncle modifies Puyo that it contacts, changing them to identical colors. The algorithm considers all the configurations of those items when they come up during its search. However, the program that collected the data for the plots above did not simulate special items and the advantages they offer. Consequentially, the AI in a real game may be far stronger than those curves suggest.
Copyright © 2017 meatfighter.com
This project is free software; you can redistribute it and/or modify it under the terms of LGPLv2.1.