Applying Artificial Intelligence to

Dr. Mario

[ About | Video | Download | Run | Details ]

About
This article describes a program capable of playing Dr. Mario, the falling block tile-matching video game. The program is written in Java, and it executes within or alongside Nintaco, the NES / Famicom emulator. In a 1 Player Game, the program automatically advances through the stages of increasing difficulty, pausing only during the cutscenes. In a 2 Player Game, the program takes control of the second player, providing an artificial competitor for a human opponent.

Video
Watch the AI plow through levels 20–24 in the video below.

Level 24 is the highest level; it repeats indefinitely.

Download
The .zip contains:

Run
The AI programmatically controls Nintaco as it emulates Dr. Mario. To see it in action:

  1. Start Nintaco and open Dr. Mario (JU).
  2. Extract DrMarioAI.jar from the downloaded .zip.
  3. Open the Run Program window by selecting Tools | Run Program...
  4. Enter the path to the file in the JAR name field or browse to it using the Find JAR... button.
  5. Press Load JAR to load it.
  6. Press Run.
  7. Select 1 PLAYER GAME from the main menu.
  8. Select any VIRUS LEVEL and SPEED from the successive menu.
  9. Finally, press Start to let the AI take over.

To see it play faster, first press the Stop button. In the Arguments field, enter one word: fast. Then, press Run. Press Start from the 1 PLAYER GAME menu to hand control to the faster AI.

To watch it play even faster, select Machine | Speed | Max.

To play against the AI, resume normal speed (Machine | Speed | Normal), reset the game (Machine | Reset) and select 2 PLAYER GAME from the main menu. Then, choose any VIRUS LEVEL and SPEED for the players. Finally, press Start to begin competitive play with the AI in control of the second player.

Details
The program uses the Nintaco API to access CPU Memory, to manipulate controller input, and to receive frame rendered events. Most of the memory addresses were obtained from a published Dr. Mario RAM map; the remainder were discovered via the Nintaco Hex Editor. See the address constants in the source tree for a list of the addresses used. Note that the player 2 addresses are the player 1 addresses offset by 128 bytes, except for the playfield, which is offset by 256 bytes.
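That offset scheme can be sketched as a set of address constants. The names and base values below are placeholders for illustration, not the project's actual constants; only the +128 / +256 relationship between the players is taken from the text.

```java
// Illustrative address constants for the player 1 / player 2 layout
// described above. The base values are assumptions for this sketch;
// the real constants live in the project's source tree.
public final class Addresses {

    public static final int P1_PLAYFIELD = 0x0400; // assumed base address
    public static final int P1_STATE     = 0x0300; // assumed base address

    // Player 2 addresses are the player 1 addresses offset by 128 bytes,
    // except for the playfield, which is offset by 256 bytes.
    public static final int P2_PLAYFIELD = P1_PLAYFIELD + 0x100;
    public static final int P2_STATE     = P1_STATE + 0x80;

    private Addresses() {
    }
}
```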

When a vitamin capsule spawns, the program examines every possible lock combination of the current capsule and the next capsule. A valid lock position of a single capsule is a position in which one or both of the tiles making up the capsule is supported by the playfield floor or by a non-empty tile. All the possible lock positions of a single capsule can be computed via a graph traversal algorithm; in this case, breadth-first search is used. Locking a capsule into the playfield has consequences: rows/columns of 4 or more identical tiles are removed, unsupported tiles drop, and that process potentially repeats chain-reaction style. For each valid lock position of the current piece and its associated playfield consequences, the program tries each valid lock position of the next piece, evaluating the combined consequences.
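The breadth-first enumeration of lock positions can be sketched as follows. The state representation, move set, and dimensions (8×16, the standard Dr. Mario playfield) are assumptions for this sketch; wall kicks, spawn delays, and capsule colors are deliberately omitted, and the names do not come from the project source.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal sketch of lock-position enumeration via breadth-first search.
// A capsule state is (x, y, o): o == 0 means horizontal (tiles at (x, y)
// and (x + 1, y)), o == 1 means vertical (tiles at (x, y) and (x, y + 1)).
// y grows downward; field[y][x] == true marks an occupied playfield tile.
public class LockSearch {

    public static final int WIDTH = 8, HEIGHT = 16;

    public record State(int x, int y, int o) {}

    static boolean inBounds(int x, int y) {
        return x >= 0 && x < WIDTH && y >= 0 && y < HEIGHT;
    }

    static boolean fits(boolean[][] field, State s) {
        int x2 = s.o() == 0 ? s.x() + 1 : s.x();
        int y2 = s.o() == 0 ? s.y() : s.y() + 1;
        return inBounds(s.x(), s.y()) && inBounds(x2, y2)
                && !field[s.y()][s.x()] && !field[y2][x2];
    }

    // A capsule locks where it fits but cannot descend one more row,
    // i.e. it is supported by the floor or by a non-empty tile.
    static boolean isLockPosition(boolean[][] field, State s) {
        return !fits(field, new State(s.x(), s.y() + 1, s.o()));
    }

    public static List<State> lockPositions(boolean[][] field, State spawn) {
        List<State> locks = new ArrayList<>();
        Set<State> visited = new HashSet<>();
        Deque<State> queue = new ArrayDeque<>();
        visited.add(spawn);
        queue.add(spawn);
        while (!queue.isEmpty()) {
            State s = queue.remove();
            if (isLockPosition(field, s)) {
                locks.add(s);
            }
            // Neighbors: shift left/right, drop one row, rotate.
            State[] moves = {
                new State(s.x() - 1, s.y(), s.o()),
                new State(s.x() + 1, s.y(), s.o()),
                new State(s.x(), s.y() + 1, s.o()),
                new State(s.x(), s.y(), 1 - s.o()),
            };
            for (State n : moves) {
                if (fits(field, n) && visited.add(n)) {
                    queue.add(n);
                }
            }
        }
        return locks;
    }
}
```

On an empty playfield this finds every bottom-row placement: 7 horizontal positions and 8 vertical ones.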

Each of the combined consequences is fed to an evaluation function, which scores the contents of the playfield. The combo with the highest score wins, and the current capsule is moved into position accordingly. The results of the search chain affect only the current capsule; when the next capsule spawns, it is paired with its successor, and so on.
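The two-level search described above can be sketched generically. The `Simulator` interface, type parameters, and method names are assumptions introduced for this sketch, standing in for the project's actual classes; capsule colors and the topped-out case are omitted for brevity.

```java
import java.util.List;
import java.util.function.ToDoubleFunction;

// Sketch of the search chain: every lock position of the current capsule
// is combined with every lock position of the next capsule, and the
// highest-scoring pair decides where the current capsule goes.
public class ChainSearch {

    public interface Simulator<F, P> {
        List<P> lockPositions(F field);       // all valid lock positions
        F consequences(F field, P placement); // lock, clear, drop, repeat
    }

    // Returns the best placement for the current capsule, or null if none.
    public static <F, P> P best(F field, Simulator<F, P> sim,
                                ToDoubleFunction<F> evaluate) {
        P bestPlacement = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (P current : sim.lockPositions(field)) {
            F afterCurrent = sim.consequences(field, current);
            for (P next : sim.lockPositions(afterCurrent)) {
                F afterNext = sim.consequences(afterCurrent, next);
                double score = evaluate.applyAsDouble(afterNext);
                if (score > bestScore) {
                    bestScore = score;
                    bestPlacement = current; // only the current capsule moves
                }
            }
        }
        return bestPlacement;
    }
}
```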

The evaluation function scores a playfield using a series of metrics:

  1. The number of viruses, the fewer the better.
  2. Consecutive colors. It scans all rows and all columns, measuring runs of uniformly colored tiles, with empty tiles ignored. Longer runs are preferred.
  3. Virus color adjacency. It scans the row and the column containing each virus, comparing the tiles against the virus color. Matching colors are preferred over empty tiles, and both are highly preferred over non-matching colors. In addition, nearby matching tiles are preferred over distant ones.
  4. The number of non-empty tiles, the fewer the better.
  5. The sum of the column heights, the smaller the better. Columns touching the top of the playfield are highly penalized, since they partition the playfield.

These metrics are combined into a single number via a weighted sum; see the source for details. Note that the strength of the AI is a combination of the metrics used and the weighting of those metrics. Search algorithms can be used to improve both. However, for this project, the simplest metrics were selected and the weights were drawn from intuition rather than search. The reader is encouraged to experiment and to improve the evaluation function; in fact, it was hidden behind an interface for that very reason.
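A weighted sum over such metrics might look like the sketch below, covering three of the five metrics (virus count, non-empty tiles, summed column heights). The weights and cell codes (0 = empty, 1 = capsule half, 2 = virus) are placeholders invented for this example, not the project's tuned values.

```java
// Illustrative weighted-sum evaluation function. Each metric is measured
// over the playfield and multiplied by a negative weight, since all three
// quantities are "the fewer/smaller the better".
public class Evaluator {

    public static final double VIRUS_WEIGHT     = -100.0; // fewer is better
    public static final double NON_EMPTY_WEIGHT =   -1.0; // fewer is better
    public static final double HEIGHT_WEIGHT    =   -2.0; // lower is better

    public static double evaluate(int[][] field) {
        int rows = field.length, cols = field[0].length;
        int viruses = 0, nonEmpty = 0, heightSum = 0;
        for (int x = 0; x < cols; x++) {
            int top = rows; // row index of topmost occupied cell in column
            for (int y = 0; y < rows; y++) {
                int cell = field[y][x];
                if (cell != 0) {
                    nonEmpty++;
                    if (cell == 2) {
                        viruses++;
                    }
                    if (y < top) {
                        top = y;
                    }
                }
            }
            heightSum += rows - top; // column height measured from the floor
        }
        return VIRUS_WEIGHT * viruses
                + NON_EMPTY_WEIGHT * nonEmpty
                + HEIGHT_WEIGHT * heightSum;
    }
}
```

Tuning then reduces to adjusting the weight constants, which is exactly what makes the interfaced-out evaluation function easy to experiment with.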

Copyright © 2017
This project is free software; you can redistribute it and/or modify it under the terms of LGPLv2.1.