The in-game hint — the button that tells you what to play when you’re stuck — has a surprisingly messy history. Today it just calls the Elite CNN. For a while it had its own trained models, one per (board size, difficulty, side) slot, optimized with CMA-ES to beat a specific opponent from a specific distribution of positions. This post is about why we built that, why we threw it out, and why we deliberately never shipped the version that could have beaten Elite for you.
The problem hints are actually trying to solve
Hints feel like a small feature, but the underlying question is subtle. A hint isn’t “what’s the objectively best move in this position.” It’s “what move gives the player the highest chance of winning this game, against this specific opponent, from here.” Those two are not the same thing. The best move against a tactically sharp Elite AI might be a quiet positional play that denies it the opening it wants. Against Medium, the best move is usually the biggest immediate score, because Medium will happily give you the whole interior if you show up to take it. A policy trained to be strong in general can sit in a local optimum that an opponent-specific one would beat.
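To make that concrete, here is a toy sketch of what an opponent-aware hint computes: the move with the highest estimated win rate against this particular opponent, not against perfect play. The move names and win probabilities below are invented for illustration; in the real system the estimate would come from playing rollouts in the engine.

```python
import random

# Toy stand-in for the real engine, with invented numbers: the same
# candidate move has a different win probability depending on who is
# sitting on the other side of the board.
WIN_PROB = {
    "medium": {"grab_corner": 0.80, "quiet_block": 0.55},
    "elite":  {"grab_corner": 0.35, "quiet_block": 0.60},
}

def estimate_win_rate(move, opponent, rollouts=2000, rng=random.Random(0)):
    # In the real system this would play `rollouts` games from the
    # current position; here we just sample the toy probability.
    wins = sum(rng.random() < WIN_PROB[opponent][move] for _ in range(rollouts))
    return wins / rollouts

def best_hint(opponent, moves=("grab_corner", "quiet_block")):
    # The hint is the argmax of estimated win rate against THIS
    # opponent, not the move a strongest-play oracle would pick.
    return max(moves, key=lambda m: estimate_win_rate(m, opponent))
```

With these numbers the same position yields different hints for different opponents: the greedy grab is right against Medium, the quiet block is right against Elite.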
That observation — that the right hint depends on the opponent — is what got us into training opponent-specific hint models in the first place.
The CMA-ES era
The original hint trainer lives at ai/src/training/train_hint.py, and it uses CMA-ES instead of PPO. PPO is the right tool when you want to learn a policy that plays a full game well. But the hint problem is shaped differently: we want to find a weight vector that beats a specific opponent from a specific distribution of starting positions. That collapses into black-box fitness maximization — pick a weight vector, play 100 games against the target opponent, count wins. No gradients required.
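That black-box objective collapses to a function from a weight vector to a win rate in [0, 1]. A minimal sketch of the shape, with a made-up toy game standing in for the real engine (the real fitness in train_hint.py plays actual games):

```python
import random

def evaluate(weights, n_games=100, seed=0):
    """Black-box fitness for one candidate weight vector: play n_games
    against the target opponent from randomized openings and return the
    win rate. No gradients anywhere, just play and count."""
    rng = random.Random(seed)

    def play_one_game(w, opening):
        # Toy stand-in for a real game: the candidate wins when its
        # weighted evaluation of the opening beats a fixed opponent
        # threshold plus noise.
        features = (1.0, opening, opening ** 2)
        score = sum(wi * fi for wi, fi in zip(w, features))
        return score + rng.gauss(0.0, 0.25) > 1.0

    wins = 0
    for _ in range(n_games):
        opening = rng.random()  # randomized opening position
        wins += play_one_game(weights, opening)
    return wins / n_games
```

The randomized opening per game is what makes the fitness reflect a distribution of positions rather than one fixed start.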
CMA-ES (covariance matrix adaptation evolution strategy) is the classical answer to exactly that problem. The configuration we shipped with:
- population size 30: Each generation evaluates 30 candidate weight vectors. Enough to get a decent covariance estimate, small enough to fit 80 generations in an overnight run.
- 80 generations: 2,400 total evaluations per (board, difficulty, side) slot. Fitness stabilizes well before then, but the extra generations tighten up the search around the mode.
- initial sigma 0.5: The starting spread over the weight space. Too small and you never leave the initial basin; too large and early generations are pure noise.
- 100+ games per candidate: Played with randomized opening moves so the fitness score reflects performance from a distribution of positions, not a single fixed start. Fewer games and the fitness signal gets drowned in variance.
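Putting those numbers together, the outer loop looks roughly like this. One caveat: real CMA-ES adapts a full covariance matrix from the population (that is the “CMA” part); the sketch below keeps a single isotropic sigma so the generation structure is visible without the matrix bookkeeping, so it is an illustration of the shape, not the shipped trainer.

```python
import random
import statistics

def es_search(fitness, dim, popsize=30, generations=80, sigma0=0.5, seed=0):
    # Generation loop shaped like the config above: population 30,
    # 80 generations, initial sigma 0.5. Simplified: isotropic sigma
    # with a fixed decay instead of full covariance adaptation.
    rng = random.Random(seed)
    mean, sigma = [0.0] * dim, sigma0
    for _ in range(generations):
        # Sample popsize candidates around the current mean.
        pop = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
               for _ in range(popsize)]
        pop.sort(key=fitness, reverse=True)   # maximize win rate
        elite = pop[: popsize // 4]           # top quartile recombines
        mean = [statistics.fmean(ws) for ws in zip(*elite)]
        sigma *= 0.97                         # slowly tighten the search
    return mean
```

Plugged into a fitness like the game-playing evaluator, this returns the weight vector the search settled on for that one (board, difficulty, side) slot.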
Each trained hint model inherited the same 14-feature architecture as the gameplay AI, so the optimizer was searching over the same ~13-dimensional weight space — just with a different objective. The result was a small dictionary of opponent-specific weight vectors, keyed by board size, difficulty, and which side the player was on.
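The registry that resulted looked roughly like this (made-up numbers, vectors shortened for display): one independently trained vector per slot, nothing shared between slots.

```python
# Illustrative shape of the old hint-weights registry. The keys are
# (board_size, difficulty, player_side); the values are weight vectors,
# invented here and truncated to three numbers for readability.
HINT_WEIGHTS = {
    (6, "medium", "first"):  [0.12, -0.40, 0.77],
    (6, "medium", "second"): [0.09, -0.35, 0.81],
    (8, "elite",  "first"):  [0.31,  0.05, 0.44],
    # ...one entry per slot, each its own overnight CMA-ES run
}

def hint_weights(board_size, difficulty, side):
    # A miss means that slot was never trained. There is no vector that
    # generalizes across slots, which is exactly the maintenance problem.
    return HINT_WEIGHTS.get((board_size, difficulty, side))
```

Every change to the feature set or scoring rule invalidated every value in that dictionary at once.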
Why we threw it out
Two reasons, roughly equal weight.
One: it doesn’t generalize. A weight vector trained to beat Medium on a 6×6 board is nothing more than that. Change the board size, change the difficulty, or let the player start second, and you need a different vector. When we finished counting, we were looking at around twenty separate slots, each with its own overnight CMA-ES run, each of which needed to be retrained from scratch any time we tweaked the feature set or the scoring rule. That’s a lot of machinery for a button.
Two: Elite got strong enough. When the gameplay Elite was a 26-weight switching linear model, there was a real gap between “best move a linear model can find” and “best hint we could train.” The opponent-specific models genuinely helped. Once we replaced gameplay Elite with the distilled AlphaZero CNN, that gap mostly closed. The CNN is not an opponent-specific policy — it’s just stronger in general, and “play the objectively strongest move” turned out to be nearly indistinguishable, as a hint, from “play the move optimized against this specific opponent” for the vast majority of positions real players actually ask for help in.
So the current hint implementation is almost embarrassingly simple: call the Elite CNN from the player’s perspective and return its move. The code is a few lines in src/engine/ai/engine.ts, and the weight dictionary that used to hold trained hint models is now an empty object. One model, zero retraining, fewer moving parts.
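The shipped version is TypeScript in src/engine/ai/engine.ts; this Python sketch mirrors the logic, and the `best_move` method and `perspective` argument are hypothetical names, not the real API.

```python
class StubEliteCNN:
    # Stand-in for the real distilled AlphaZero CNN, so the sketch runs.
    def best_move(self, position, perspective):
        return "d4"

def hint(position, model):
    # The entire current hint: ask the Elite model for its move from the
    # player's perspective and return it. No registry lookup, no
    # per-opponent weights, no retraining when the rules change.
    return model.best_move(position, perspective="player")
```

That is the whole subsystem now: one call into the model that already exists.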
Replacing a whole trained subsystem with “just use the main model” is almost always the right call once the main model gets strong enough. The savings compound: less training, less code, less to explain, less to break when the rules change.
A hint that beats Elite is a hint we don’t want
There’s another reason we stopped chasing opponent-specific hint models, and it’s more a design principle than an engineering one. Elite is meant to be a challenge. That’s the whole point of the top difficulty. If we shipped a hint model that was specifically trained to exploit Elite’s weaknesses, a player could mash the hint button every turn and grind out a win they never really earned. The game becomes a hint-dispenser wearing a chessboard.
We absolutely could build that. A CMA-ES hint vector trained against Elite from the human side, with enough games per fitness eval, would find the seams. We chose not to. Hints should help you notice a strong move you missed — a tactic that’s on the board but invisible to you — not hand you a script that beats an opponent you can’t otherwise beat. Victories against Elite should feel like you dragged them out of the game with your own hands. Anything else cheapens the tier.
So the rule we ended up with is: the hint system is never stronger than the gameplay AI it’s helping you fight. On Easy, Medium, and Hard the CNN is overkill, which is fine — those tiers are designed to be beatable and the hint just accelerates you past a mistake. On Elite, the hint is exactly as strong as your opponent, which means a hint tells you what Elite would play in your seat. Useful, but not a shortcut.
The lesson from all of this
A lot of early AI engineering is load-bearing only until the core model gets better. We spent real effort building an opponent-aware hint pipeline, and then the right answer became “delete it and call Elite.” That’s not a failure — the earlier work tells us when opponent-specific hints matter, which is what we’ll rely on if we ever come back to it. For now, the simplest thing that works is the best thing that works.