Formally Specifying Dungeons and Dragons rulebooks

I’ll do anything to avoid learning tabletop game battle rules. Apparently “anything” includes writing a formal specification of D&D (Dungeons and Dragons) 5e (5th edition, 2024) combat.

D&D is a narrative tabletop game — one of many, albeit the most popular. Players act as fictional characters within a shared story, guided by mechanical rules.

The rules create tactically interesting battles, but I find them hard to learn and remember — I value the storytelling part more. So I made a state machine to run the battle rules for me.

What I built

A Quint formal spec of the 2024 SRD¹ combat rules — conditions, action economy, grappling, death saves, spell slots — modeled as a single-actor state machine. An XState v5 implementation mirrors it exactly.

Model-based conformance testing replays random traces against both and compares state field-by-field. A thin React UI sits on top.

SRD rules text

12K community Q&A

Quint spec

XState machine

React UI

The UI can’t render a state the machine can’t produce, and the machine can’t produce a state the spec forbids.

The MBT traces are seed-deterministic: same seed, same trace, reproducible. I used the same approach for property-based testing of temporal graph storage: deterministic seeds let you reproduce any failure and run millions of scenarios — with the formal spec as oracle² instead of hand-written properties.

What is Quint

Quint is an executable specification language based on the temporal logic of actions (TLA) — the same foundation as TLA+, but with syntax familiar to engineers.

You write pure functions that describe state transitions. Quint model-checks them: exhaustively explores every reachable state and verifies your invariants hold. Built by Informal Systems, the team behind Cosmos blockchain verification.

Crowdsourced spec testing

1,000 machine-checked assertions distilled so far from a corpus of 12,700 Reddit and StackExchange Q&A arguments between dedicated D&D players. An LLM translates them into Quint. Surviving assertions validate the spec against what the community already knows. Still plenty of the corpus left to process.

One example. Grapple a creature, knock it prone. Its speed drops to 0. The Prone condition says: “if your Speed is 0, you can’t right yourself.” Stuck on the ground until the grapple breaks.

An LLM turns a Reddit discussion into a formal assertion, then the spec confirms it:

// qa_generated.qnt
run qa_prone_and_grappled_cannot_stand = {
  val target = freshCreature(30)
  val grappled = pGrapple(Medium, Medium, target, true, true).targetState
  val grappledAndProne = pApplyCondition(grappled, CProne)
  val t = pStartTurn(TEST_CONFIG, grappledAndProne, Unarmored, 0, false, false)
  val tAfterStandAttempt = pStandFromProne(t)
  assert(
    grappledAndProne.prone and          // creature is prone
    grappledAndProne.grappled and        // creature is grappled
    t.effectiveSpeed == 0 and            // grappled ⇒ speed 0
    tAfterStandAttempt == t              // stand attempt is a no-op
  )
}

The same pipeline can be fed game session recordings — another source of real-world rulings to check against the spec.

MBT as a coding agent loop

Run Quint traces in a perpetual loop. When a trace fails, write its seed to a file. A coding agent watches the file — when it’s non-empty, it reproduces the failure, checks with the rules and project coding guidelines, and commits a fix. Rinse, repeat until the file stays empty. Voilà, self-healing system.

Demo: dnd-quint.dearlordylord.com Repo: github.com/dearlordylord/5e-quint MBT bridge: @firfi/quint-connect

I still have no idea what you’re talking about. Also, how do you run “combat” through it?

More on it later: I’ll release a larger and probably a more intermediate or beginner-friendly post.

As to combat, for now, a proof of concept of an educational interactive table of combat statuses already exists for the lighter Savage Worlds RPG — that project of mine was the predecessor to this one.

This table helps me and my group to understand their characters’ options when they are in the pickle.

D&D’s combat is more complex, but the state machine concept is already proven.

D&D is a trademark of Wizards of the Coast. This is unofficial fan content permitted under the Fan Content Policy.

The System Reference Document 5.2.1 — Wizards of the Coast publishes the core D&D rules under CC BY 4.0, so anyone can build on them. ↩
A test oracle is the source of truth that tells you what the correct result should be. In unit tests, it’s the hardcoded expected value. In property-based testing, it’s the invariant. In this project, each layer serves as oracle for the next: community Q&A validates the spec, the spec validates the XState implementation. ↩