Implementing a Mahjong AI with Rule-Based Logic: Building Strong AI Without Machine Learning

Janten (janten.net), the mahjong game we built at aduce, ships with four AI opponent personalities: Balance, Iron Wall, Speed, and Big-Hand. Each has a distinct playing style. This article covers how we implemented the AI's decision logic with rule-based techniques and no machine learning.

Why Rule-Based AI Instead of Machine Learning?

We considered deep reinforcement learning and Monte Carlo Tree Search, but chose a rule-based approach for three reasons:

Real-time operation in the browser is required. Janten runs as a web application, so models requiring GPU inference or server round-trips were non-starters.
Explainability of actions. Janten displays the reason for each discard as text ("discarding a safe tile," "maximizes acceptance"), which is hard to achieve with a black-box model.
Ease of tuning. Managing parameters as plain numbers makes it intuitive to adjust behavior per personality.

We settled on a design that combines evaluation functions grounded in mahjong theory, such as shanten count and acceptance count.

Designing Four AI Personalities (Parameter-Driven Architecture)

Each AI personality is defined by an AITypeConfig object: a name plus six numeric parameters.

export interface AITypeConfig {
  name: string;
  defW: number;    // Defense weight
  atkW: number;    // Offense weight
  ponTh: number;   // Pon decision threshold
  riichiAg: number; // Riichi aggressiveness
  yakuB: number;   // Yaku-pursuit bonus
  spdB: number;    // Speed bonus
}

The four personalities use the following values (the table lists four of the six parameters; ponTh and riichiAg are omitted):

Type	defW	atkW	yakuB	spdB	Characteristics
BALANCE	1.0	1.0	1.0	0.0	Switches between offense and defense based on the situation
TEPPEKI (Iron Wall)	2.0	0.8	0.5	0.0	Defense-focused, minimal deal-in rate
SPEED	0.3	1.5	0.2	2.0	Heavy use of calls for fastest tenpai
BIGHAND	0.5	1.0	3.0	-1.0	Seeks flushes and high-point hands

The same scoring function with different weights produces entirely different playing styles. For instance, SPEED's spdB: 2.0 significantly boosts evaluation of tiles that get closer to tenpai, while BIGHAND's yakuB: 3.0 strongly preserves tiles relevant to yaku.
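As a minimal sketch, the table rows could be written as constants like this. Only the four tabulated values come from the table; the names WeightPreset, PersonalityName, and PRESETS are hypothetical, and ponTh and riichiAg are left out rather than guessed.

```typescript
// Hypothetical subset of AITypeConfig covering only the table's columns.
// ponTh and riichiAg are omitted because their values are not listed.
export interface WeightPreset {
  name: string;
  defW: number;  // Defense weight
  atkW: number;  // Offense weight
  yakuB: number; // Yaku-pursuit bonus
  spdB: number;  // Speed bonus
}

export type PersonalityName = "BALANCE" | "TEPPEKI" | "SPEED" | "BIGHAND";

// Values copied from the table above.
export const PRESETS: Record<PersonalityName, WeightPreset> = {
  BALANCE: { name: "BALANCE", defW: 1.0, atkW: 1.0, yakuB: 1.0, spdB: 0.0 },
  TEPPEKI: { name: "TEPPEKI", defW: 2.0, atkW: 0.8, yakuB: 0.5, spdB: 0.0 },
  SPEED: { name: "SPEED", defW: 0.3, atkW: 1.5, yakuB: 0.2, spdB: 2.0 },
  BIGHAND: { name: "BIGHAND", defW: 0.5, atkW: 1.0, yakuB: 3.0, spdB: -1.0 },
};
```

Keying the record by a union type rather than string means a misspelled personality name fails at compile time.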

Discard Selection Algorithm: Evaluating Shanten and Acceptance

The core discard function, aiDiscard, scores each tile in the hand as a hypothetical discard and picks the highest-scoring tile.

for (let i = 0; i < sorted.length; i++) {
  const candidate = sorted[i];
  const remaining = [...sorted];
  remaining.splice(i, 1);

  let score = 0;
  const shantenAfter = calcShanten(remaining);
  const acceptance = countAcceptance(remaining, visibleTiles);
  const shantenDelta = shantenAfter - currentShanten;

  // Heavy penalty for tiles that worsen shanten
  if (shantenDelta > 0) score -= 80 * attackWeight;
  else if (shantenDelta < 0) score += 40 * attackWeight;

  // Higher acceptance count = higher score
  score += acceptance * 3 * attackWeight;
}
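The excerpt above shows the scoring but not the selection step. A self-contained sketch of the whole loop might look like the following. The names pickDiscard and Evaluators are hypothetical, the shanten and acceptance evaluators are injected so the example runs without a real engine, and only the two scoring terms from the excerpt are included.

```typescript
type Tile = number; // hypothetical encoding: one integer per tile

// Evaluators are passed in so the sketch does not need a real shanten engine.
export interface Evaluators {
  calcShanten: (hand: Tile[]) => number;
  countAcceptance: (hand: Tile[]) => number;
}

// Scores every tile as a hypothetical discard and returns the best one.
// Ties keep the earliest tile; an empty hand yields index -1.
export function pickDiscard(
  hand: Tile[],
  currentShanten: number,
  attackWeight: number,
  ev: Evaluators,
): { index: number; score: number } {
  let best = { index: -1, score: -Infinity };
  hand.forEach((_tile, i) => {
    const remaining = hand.filter((_t, j) => j !== i);
    const shantenDelta = ev.calcShanten(remaining) - currentShanten;

    let score = 0;
    // Same constants as the excerpt: penalize worse shanten, reward better.
    if (shantenDelta > 0) score -= 80 * attackWeight;
    else if (shantenDelta < 0) score += 40 * attackWeight;
    score += ev.countAcceptance(remaining) * 3 * attackWeight;

    if (score > best.score) best = { index: i, score };
  });
  return best;
}

// Toy evaluators for illustration only: tile 99 is treated as dead weight.
const toyEvaluators: Evaluators = {
  calcShanten: (h) => (h.indexOf(99) >= 0 ? 2 : 1),
  countAcceptance: (h) => h.length,
};
const choice = pickDiscard([1, 2, 99, 3], 2, 1.0, toyEvaluators);
```

With these toy evaluators the function selects the tile 99, since discarding it is the only option that lowers the toy shanten value.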

Shanten count (the number of tiles still needed to reach tenpai, which is one less than the number needed to win) is the minimum across three patterns: the regular form, seven pairs, and thirteen orphans. Acceptance count is the number of not-yet-visible tiles that would reduce shanten by one, a metric that quantifies the hand's breadth.
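To make these two metrics concrete, here is a sketch under stated assumptions. The 34-entry count array, every function name, and the choice to treat "visible" as tiles seen outside the hand are assumptions rather than Janten's actual code. The two special patterns use the standard closed-hand formulas, while the regular-form calculator is longer and is passed in as a parameter.

```typescript
// Assumed encoding (not from the article): a 34-entry count array,
// indices 0-8 man, 9-17 pin, 18-26 sou, 27-33 honors.
type Counts = number[];
type ShantenFn = (counts: Counts) => number;

// Terminals and honors: the thirteen tile kinds used by thirteen orphans.
const ORPHANS = [0, 8, 9, 17, 18, 26, 27, 28, 29, 30, 31, 32, 33];

export function emptyCounts(): Counts {
  const c: number[] = [];
  for (let i = 0; i < 34; i++) c.push(0);
  return c;
}

// Seven pairs for a closed hand: 0 is tenpai, -1 is a complete hand.
export function sevenPairsShanten(counts: Counts): number {
  let pairs = 0;
  let kinds = 0;
  for (const c of counts) {
    if (c >= 1) kinds += 1;
    if (c >= 2) pairs += 1;
  }
  // Seven distinct kinds are required, so too few kinds costs extra draws.
  return 6 - pairs + Math.max(0, 7 - kinds);
}

// Thirteen orphans: one of each terminal/honor plus one duplicate.
export function thirteenOrphansShanten(counts: Counts): number {
  let kinds = 0;
  let hasPair = false;
  for (const i of ORPHANS) {
    const c = counts[i] ?? 0;
    if (c >= 1) kinds += 1;
    if (c >= 2) hasPair = true;
  }
  return 13 - kinds - (hasPair ? 1 : 0);
}

// Minimum across the three patterns; the regular form is supplied by the caller.
export function minShanten(counts: Counts, regularShanten: ShantenFn): number {
  return Math.min(
    regularShanten(counts),
    sevenPairsShanten(counts),
    thirteenOrphansShanten(counts),
  );
}

// Acceptance: unseen copies of every tile kind whose draw lowers shanten.
// `visible` counts tiles seen outside the hand (discards, melds, indicators).
export function countAcceptance(
  counts: Counts,
  visible: Counts,
  shanten: ShantenFn,
): number {
  const base = shanten(counts);
  let total = 0;
  for (let i = 0; i < 34; i++) {
    const inHand = counts[i] ?? 0;
    const unseen = 4 - inHand - (visible[i] ?? 0);
    if (unseen <= 0) continue;
    const next = counts.slice();
    next[i] = inHand + 1;
    if (shanten(next) < base) total += unseen;
  }
  return total;
}
```

For a hand of six pairs plus one single tile, sevenPairsShanten returns 0 and countAcceptance reports the three unseen copies of the single tile, assuming none of them is visible elsewhere.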

Defense Logic: Six-Tier Safe-Tile Classification

When an opponent declares riichi or makes an aggressive call, defense logic activates. Safety is classified in six tiers:

export type SafetyCategory =
  | "genbutsu"    // Same tile already discarded by that player
  | "ryou_suji"   // Both-side suji (both sides of suji tiles discarded)
  | "kata_suji"   // One-side suji (only one side's suji tile discarded)
  | "one_chance"  // One-chance (3 of the same tile visible, etc.)
  | "unknown"     // No information
  | "dangerous";  // Middle tile with no suji

When several players pose a threat, a tile is assigned the most dangerous category across all of them. Finer ordering within a category uses a danger score (0-100) that accounts for walls (tiles with 3+ visible), proximity to the riichi declaration tile, and suspected flushes.
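The "most dangerous category across threat players" rule can be sketched as below. It assumes the union members are listed from safest to most dangerous, which matches the order shown but is not stated outright; DANGER_ORDER and worstCategory are hypothetical names.

```typescript
export type SafetyCategory =
  | "genbutsu"
  | "ryou_suji"
  | "kata_suji"
  | "one_chance"
  | "unknown"
  | "dangerous";

// Assumption: this order runs from safest to most dangerous.
const DANGER_ORDER: SafetyCategory[] = [
  "genbutsu",
  "ryou_suji",
  "kata_suji",
  "one_chance",
  "unknown",
  "dangerous",
];

// Takes one tile's category against each threat player and returns the worst.
// With no threat players, this sketch simply reports the safest tier.
export function worstCategory(perThreat: SafetyCategory[]): SafetyCategory {
  let worst: SafetyCategory = "genbutsu";
  for (const cat of perThreat) {
    if (DANGER_ORDER.indexOf(cat) > DANGER_ORDER.indexOf(worst)) worst = cat;
  }
  return worst;
}
```

A tile that is genbutsu against one riichi player but only kata_suji against another is therefore treated as kata_suji overall.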

Dynamic Offense/Defense Balancing by Game Phase

Based on remaining wall tiles, the game is partitioned into three phases, and the offense/defense weights are dynamically adjusted.

export function getGamePhase(wallLen: number): GamePhase {
  if (wallLen >= 50) return "early";  // Early: play wide
  if (wallLen >= 20) return "mid";    // Middle: branch offense/defense by shanten
  return "late";                       // Late: push if tenpai, fold if far
}

Adjustments are applied in five steps:

Type defaults — each AI's base weights
Score situation — BALANCE only: switches offense/defense based on lead/deficit
Last-round correction — overrides all types based on rank and point gap (top player defends, last-place player attacks at full power)
Turn correction — in late game with a distant hand, strengthens defense (multiplicative)
Threat-player correction — if someone has riichi or an aggressive call, strengthens defense

This multi-stage adjustment lets BALANCE make context-adaptive decisions like "full defense when leading in the late game" or "full attack when behind in the last round." TEPPEKI, meanwhile, with consistently high defW, behaves defensively throughout except when in last place in the final round.
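The five steps could be chained roughly as follows. This is only an illustration of the ordering: every multiplier and threshold is invented, the Situation shape is hypothetical, and the last-round step here looks at rank alone, whereas the description above also uses the point gap.

```typescript
type GamePhase = "early" | "mid" | "late";

interface Weights {
  defW: number;
  atkW: number;
}

// Hypothetical snapshot of the inputs the five steps need.
interface Situation {
  typeName: string;
  base: Weights;        // the type's default weights
  phase: GamePhase;
  shanten: number;
  pointLead: number;    // positive when ahead, negative when behind
  isLastRound: boolean;
  rank: number;         // 1 = top, 4 = last
  threatCount: number;  // players in riichi or calling aggressively
}

// All numeric constants below are placeholders, not Janten's tuned values.
export function adjustWeights(s: Situation): Weights {
  // 1. Type defaults
  let { defW, atkW } = s.base;

  // 2. Score situation (BALANCE only)
  if (s.typeName === "BALANCE") {
    if (s.pointLead > 0) defW *= 1.3;
    else if (s.pointLead < 0) atkW *= 1.3;
  }

  // 3. Last-round correction overrides every type
  if (s.isLastRound) {
    if (s.rank === 1) {
      defW = 2.0;
      atkW = 0.5;
    } else if (s.rank === 4) {
      defW = 0.2;
      atkW = 2.0;
    }
  }

  // 4. Turn correction: late game with a distant hand (multiplicative)
  if (s.phase === "late" && s.shanten >= 2) defW *= 1.5;

  // 5. Threat-player correction
  if (s.threatCount > 0) defW *= 1.5;

  return { defW, atkW };
}
```

Because steps 4 and 5 multiply rather than overwrite, a last-place player in the final round still becomes somewhat more cautious when an opponent declares riichi.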

Verifying AI Strength via a Headless Simulator

To tune the AI parameters, we used a headless simulator that runs thousands of games rapidly without a UI.

export interface SimConfig {
  games: number;              // Number of games
  gameLength: "tonpu" | "hanchan";
  players: AITypeConfig[];    // AI configuration for 4 players
}

For each game, the simulator runs dealing, draws, discards, calls, riichi, and win determination programmatically and collects statistics such as rank, win rate, deal-in rate, and average score. We adjusted parameters iteratively, checking the results of thousands of hanchan in which all four types played against each other.

This approach made it possible to quantitatively pinpoint issues like "SPEED's deal-in rate is too high" or "BIGHAND's win rate is too low," converging on a balance where each type has a distinct personality without becoming abnormally weak.
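Spotting an issue such as "SPEED's deal-in rate is too high" comes down to aggregating per-hand records by AI type. The sketch below assumes a hypothetical HandRecord shape, since the simulator's real output format is not shown here, and covers three of the statistics mentioned above.

```typescript
// Hypothetical per-hand record for one seat.
interface HandRecord {
  typeName: string;   // AI type that played this seat
  won: boolean;       // this seat won the hand
  dealtIn: boolean;   // this seat discarded the winning tile
  scoreDelta: number; // points gained or lost in the hand
}

interface TypeStats {
  hands: number;
  winRate: number;
  dealInRate: number;
  avgScoreDelta: number;
}

// Groups records by AI type and converts raw totals into rates.
export function summarize(records: HandRecord[]): { [typeName: string]: TypeStats } {
  const totals: {
    [typeName: string]: { hands: number; wins: number; dealIns: number; score: number };
  } = {};
  for (const r of records) {
    const t = totals[r.typeName] ?? { hands: 0, wins: 0, dealIns: 0, score: 0 };
    t.hands += 1;
    if (r.won) t.wins += 1;
    if (r.dealtIn) t.dealIns += 1;
    t.score += r.scoreDelta;
    totals[r.typeName] = t;
  }

  const stats: { [typeName: string]: TypeStats } = {};
  for (const key of Object.keys(totals)) {
    const t = totals[key];
    if (!t) continue;
    stats[key] = {
      hands: t.hands,
      winRate: t.wins / t.hands,
      dealInRate: t.dealIns / t.hands,
      avgScoreDelta: t.score / t.hands,
    };
  }
  return stats;
}
```

Comparing dealInRate and winRate across types after each parameter change gives a numeric check on whether a personality has drifted too far toward recklessness or passivity.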

Summary

Rule-based AI distills fundamental mahjong concepts (shanten count, acceptance count, safety) into scoring functions and separates personalities through parameter weighting. With this design, you can build an AI strong enough to be enjoyable without machine learning.

Three points are especially important:

Parameter-driven design — one algorithm expresses multiple personalities
Multi-stage offense/defense adjustment — reflects game phase, score situation, and opponents' moves via weight multiplication
Simulator-driven tuning — run many games headlessly and balance based on statistics

If you'd like to actually play against the four AIs, you can try Janten for free.

If you're interested in game AI development or rule-based AI design, please feel free to reach out via Contact. aduce provides consulting on AI and game development.