A lending world model, built one piece at a time. Each piece below is one visual + one idea.
In one line — Don't make an AI decide every loan by hand. Make it write the rulebook,
check the rulebook, then let cheap rules do the work. New to this? Start right below.
In a hurry? Skip to the game ↓.
START HERE · PLAIN WORDS
Three words you'll need. No finance degree required.
A loan: A bank gives you money for a house; you pay it back monthly, for years. The bank's job: guess if you'll actually pay it back.
An "agent": Here it just means an automated decision-maker, a piece of software that says approve / deny on each loan. It can be simple rules, or an AI.
The cast: borrower wants the loan → broker finds them & passes it on → lender says yes/no → investor buys the loan. Risk flows down this chain.
→ The whole question: who (or what) should make the yes/no call — and can you trust it?
MEET THE PLAYERS — every one is an agent
Four players in the chain. Each is its own agent — an automated decision-maker with its own goal. None of them is a person; all of them are software making calls.
Broker agent
wins: deals close · loses: pushes fraud → clawed back
Lender agent
decides: approve / deny / refer
Wants: fund good loans, reject bad ones.
Bind: too strict → loses business; too loose → eats defaults.
wins: grows safely · loses: broke / breaks rules
Bank / Investor agent
decides: how much credit? repurchase?
Wants: steady returns, no nasty surprises.
Power: a loan built on a lie gets sent back.
wins: loans pay off · loses: a wave sours at once
→ Four agent types. And each can run on any of four brains ↓
Watch them play. Notice: a loan flows down the chain, then one defaults, and everyone reacts.
AND EACH AGENT HAS A BRAIN
Same agent, four ways to make it decide. This is the real experiment: 4 player types × 4 brains.
① Plain rules
A fixed rulebook. Free, instant, fully readable. Never breaks the rules — but never improvises.
② AI picks, rules guard
The AI chooses each move, but only from legal options. Flexible and can't cheat. Costs a little per decision.
③ AI writes the rules ★
The AI writes the rulebook once; we check it; then plain rules run it free. Best of both — the punchline of this post.
④ AI decides freely
No rails. Most flexible, most expensive, and it sometimes breaks the rules outright.
→ The game runs all four player types, and you can swap any of them onto any of these four brains. Now let's watch.
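Brain ② is the easiest one to picture in code. The sketch below is a minimal illustration, not the game's actual implementation: the `legal_actions` and `guarded_choice` names, the single DTI cap, and the fallback to "refer" are all assumptions made for the example. The point it shows is that the rules build the menu and the AI only picks from it.

```python
from typing import Callable, Sequence

def legal_actions(dti: float, max_dti: float = 0.45) -> list[str]:
    """Rules build the menu: 'approve' is only offered under the DTI cap.

    The 0.45 cap is a hypothetical number chosen for illustration.
    """
    allowed = ["deny", "refer"]
    if dti <= max_dti:
        allowed.append("approve")
    return allowed

def guarded_choice(dti: float, chooser: Callable[[Sequence[str]], str]) -> str:
    """Let the chooser (a stand-in for the AI) pick, but only from legal moves."""
    allowed = legal_actions(dti)
    pick = chooser(allowed)
    # An off-menu answer is never executed; fall back to a safe action instead.
    return pick if pick in allowed else "refer"

def eager(options: Sequence[str]) -> str:
    """A stand-in 'AI' that always wants to approve, whatever the menu says."""
    return "approve"

print(guarded_choice(0.30, eager))  # approve is on the menu, so it goes through
print(guarded_choice(0.60, eager))  # approve is off the menu, so the guard refers
```

Brain ④ is the same loop with the `pick in allowed` check removed, which is exactly why it can break the rules.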
STEP 1 · SEE THE DECISION
Build an applicant yourself: drag the two dials and watch all four "brains" decide, live.
You decide who the applicant is. Notice: the rules give the same call every time; the free LLM sometimes disagrees, on the exact same person.
→ Same applicant, four brains, four different answers. Now: which one could you defend to a regulator?
Below, the actual recorded reasoning for two of them on one loan: the rules land in 5 clean steps; the unconstrained LLM wanders through 40+ steps to the same place.
Same loan, two reasonings.
→ If you can't read how it decided, you can't trust or regulate it.
What the steps mean (plain version)
income (average two years of pay) → can they afford it? (does the payment eat too much
of that income — lenders call this DTI) → do they have savings as a cushion? → decision.
These are the real checks a human underwriter runs; the rules just do them for free.
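The chain of checks above (income → affordability → cushion → decision) can be sketched as a few lines of plain rules. This is an illustration under stated assumptions: the `Applicant` fields, the 0.43 DTI limit, and the two-month reserve floor are hypothetical and do not come from the post or from any lender's guide. Each step is recorded so the reasoning stays readable.

```python
from dataclasses import dataclass

# Hypothetical limits for illustration only.
MAX_DTI = 0.43
MIN_RESERVE_MONTHS = 2.0

@dataclass
class Applicant:
    pay_last_year: float    # annual pay, most recent year
    pay_prior_year: float   # annual pay, the year before
    monthly_debt: float     # proposed payment plus other monthly debts (assumed > 0)
    savings: float          # liquid savings

def underwrite(a: Applicant) -> tuple[str, list[str]]:
    """Return (decision, readable steps) for one applicant."""
    steps = []
    # Step 1: income is the two-year average, expressed per month.
    monthly_income = (a.pay_last_year + a.pay_prior_year) / 2 / 12
    steps.append(f"income: {monthly_income:.0f} per month (two-year average)")
    if monthly_income <= 0:
        steps.append("decision: deny (no income)")
        return "deny", steps
    # Step 2: DTI is the share of income eaten by monthly debt payments.
    dti = a.monthly_debt / monthly_income
    steps.append(f"DTI: {dti:.2f} (limit {MAX_DTI})")
    if dti > MAX_DTI:
        steps.append("decision: deny")
        return "deny", steps
    # Step 3: the cushion is how many months of payments the savings cover.
    reserve_months = a.savings / a.monthly_debt
    steps.append(f"reserves: {reserve_months:.1f} months (minimum {MIN_RESERVE_MONTHS})")
    # Step 4: a thin cushion is sent to a human instead of being approved.
    decision = "approve" if reserve_months >= MIN_RESERVE_MONTHS else "refer"
    steps.append(f"decision: {decision}")
    return decision, steps

decision, steps = underwrite(Applicant(66_000, 60_000, 1_800, 9_000))
for line in steps:
    print(line)
```

Every run on the same applicant prints the same steps, which is what makes this brain easy to defend.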
One loan, fine. But what happens at forty thousand? ↓
STEP 2 · COST IT AT SCALE
The rule planner is "free": 0 tokens. So it's the cheapest, right? Drag the loan count and watch.
Cost at 40,000 loans. Notice: the per-loan LLM columns explode with volume; the generated policy stays flat.
→ "Free" just moves the cost. A model is only honest once it charges for that.
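The shape of that chart is simple arithmetic. The sketch below assumes made-up numbers (2,000 tokens per decision, 50,000 tokens to generate a rulebook once, $0.01 per 1,000 tokens); none of these figures come from the post, and real prices vary. What it shows is only the structure: one cost scales with loan count, the other does not.

```python
def per_loan_llm_cost(loans: int, tokens_per_decision: int, usd_per_1k_tokens: float) -> float:
    """An LLM that decides every loan pays for tokens on every loan."""
    return loans * tokens_per_decision / 1000 * usd_per_1k_tokens

def generated_policy_cost(loans: int, generation_tokens: int, usd_per_1k_tokens: float) -> float:
    """A generated rulebook is paid for once; running it costs no tokens.

    `loans` is accepted only to make the point that it does not enter the formula.
    """
    return generation_tokens / 1000 * usd_per_1k_tokens

# Hypothetical inputs, chosen only to show the shape of the curves.
PRICE = 0.01            # USD per 1,000 tokens (assumption)
PER_DECISION = 2_000    # tokens per LLM decision (assumption)
GENERATION = 50_000     # tokens to write the rulebook once (assumption)

for loans in (1, 1_000, 40_000):
    every_loan = per_loan_llm_cost(loans, PER_DECISION, PRICE)
    once = generated_policy_cost(loans, GENERATION, PRICE)
    print(f"{loans:>6} loans: per-loan LLM ${every_loan:,.2f} vs generated policy ${once:,.2f}")
```

At a single loan the generated policy is the more expensive option; the crossover comes from volume, not from the rules being free.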
So far it's a quiz. Real loans don't resolve at the closing table. ↓
STEP 3 · GIVE IT A CYCLE
Nobody sets a "boom" or a "bust" — the agents just act, and the weather (the colored bands) emerges. All four agent types, over six years:
All four agents, one world. Notice: borrower demand and lender health rise together in the boom; they're chained, so they fall together too.
Your turn: drag the lending dial. Nobody dialed a "boom." Loosen credit and the bubble inflates itself; the looser you lend, the harder the crash.
→ A decision isn't real until it has a delayed, coupled consequence — and every agent feels every other's.
Zoom all the way in — what does one loan's whole life look like? ↓
STEP 4 · FOLLOW ONE LOAN
A broker pushes a shaky file through. Who's holding the bag when it blows up? Read to the last row.
The life of a defective loan. Notice: it performs for 18 months, then the loss flows back up the chain.
→ Once a defect gets put back and the commission clawed back, incentives bite.
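Here is one way to tally who ends up holding the bag. It is a deliberately simplified sketch: it assumes the investor is made whole when the loan is put back, the lender absorbs the shortfall between balance and recovery, and the full commission is recovered from the broker. The function name and all dollar amounts are hypothetical.

```python
def settle_defective_loan(balance: float, recovery: float, commission: float) -> dict[str, float]:
    """Net loss per party after a put-back and a commission clawback.

    Simplifying assumptions (not from the post): the repurchase makes the
    investor whole, and the whole commission is clawed back from the broker.
    """
    shortfall = balance - recovery           # what the collateral failed to cover
    return {
        "investor": 0.0,                     # loan was sent back up the chain
        "lender": shortfall - commission,    # eats the shortfall, offset by the clawback
        "broker": commission,                # gives back what the deal paid
    }

# Hypothetical loan: 200,000 still owed, 140,000 recovered, 4,000 commission.
losses = settle_defective_loan(balance=200_000, recovery=140_000, commission=4_000)
for party, loss in losses.items():
    print(f"{party:>8}: loses {loss:,.0f}")
```

Without the put-back the investor's row would carry the whole shortfall, and the broker would have no reason to care what was in the file.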
Now the real question: can an LLM write rules this good? ↓
STEP 5 · GENERATE THE POLICY
Hand the LLM the policy and say "write the rulebook." Hit 🎲 to make it write a fresh one a few times.
Free-write vs tune-and-validate. Notice: every time it free-writes raw rules it scores near-random; constrain it to tune parameters we check, and it nails 100%, every time.
→ The win isn't "more LLM." It's generate → validate → run free.
STEP 6 · THE REAL POLICIES
Real mortgages come in types (government-backed, low-down-payment, rural, rental…), each with its own rulebook. The AI writes all of them; we check them.
The rulebooks the AI generated, one per loan type. Notice: each row is a different real-world program's limits; one engine runs them all.
→ The "approve" rule literally can't fire unless the file meets the limits — so the rules can't approve something the guide forbids.
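One way to get that guarantee is to make the limits a precondition of the action, in the style of a goal-oriented action planner. The sketch below is an assumption about how such a guard could look, not the post's engine: the `Action` class, the fact names, and the two programs with their limits are hypothetical placeholders rather than real program rules.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset[str]  # facts that must all hold before it can fire

def facts_for(loan_file: dict[str, float], limits: dict[str, float]) -> set[str]:
    """Turn a file plus one program's limits into the facts the planner sees."""
    facts = set()
    if loan_file["dti"] <= limits["max_dti"]:
        facts.add("dti_ok")
    if loan_file["ltv"] <= limits["max_ltv"]:
        facts.add("ltv_ok")
    return facts

def applicable(actions: list[Action], facts: set[str]) -> list[str]:
    """An action is available only if every one of its preconditions is a fact."""
    return [a.name for a in actions if a.preconditions <= facts]

ACTIONS = [
    Action("approve", frozenset({"dti_ok", "ltv_ok"})),
    Action("refer", frozenset()),  # always available
]

# Hypothetical programs and limits; one engine, different numbers per row.
PROGRAMS = {
    "program_a": {"max_dti": 0.45, "max_ltv": 0.97},
    "program_b": {"max_dti": 0.36, "max_ltv": 0.80},
}

loan_file = {"dti": 0.40, "ltv": 0.90}
for name, limits in PROGRAMS.items():
    print(name, applicable(ACTIONS, facts_for(loan_file, limits)))
```

Because "approve" is filtered out before any choice is made, there is no code path where it fires on a file that misses the limits.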
See one agency as a real GOAP rulebook + its plans
The FNMA GOAP rulebook — from the Fannie Mae Selling Guide, $0 to run
This is a real run: 30 years of loans, every approve/deny/refer made by an actual LLM and recorded. Scrub the timeline; tap AGENTS to open any of the four — borrower, broker, lender, bank — and read its reasoning.
The Macro Arena — real-LLM underwriting, all four agents, replayed