Willi Krappen

Plainy

Recursive planning pipeline of eight typed Anthropic agents: turns a single sentence into a TypeScript-contract-backed TaskBrief that Claude Code translates directly into working code.

Key engineering call

Capture / replay instead of a demo mode. Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path, no drift risk between demo and live. Cost: 'you decide' answers have to be precomputed for replays to be deterministic. Pays off because every live demo automatically becomes a forever-playable demo.
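
The "cursor over the fixture" idea can be sketched roughly as follows. This is a minimal illustration, not plainy's actual code: the names `CapturedEvent`, `Recorder` and `ReplayCursor` and the event fields are assumptions made for the example.

```typescript
// Hypothetical shape of one recorded agent event; field names are assumptions.
interface CapturedEvent {
  agent: string;
  payload: string; // serialized agent output as it was seen during the live run
}

// A fixture is simply the ordered list of events from one live run.
type Fixture = readonly CapturedEvent[];

// Records events during a live run so the run can be replayed later.
class Recorder {
  private readonly events: CapturedEvent[] = [];

  record(event: CapturedEvent): void {
    this.events.push(event);
  }

  toFixture(): Fixture {
    return [...this.events];
  }
}

// Replays a fixture by advancing a cursor; no network call is involved.
class ReplayCursor {
  private readonly fixture: Fixture;
  private position = 0;

  constructor(fixture: Fixture) {
    this.fixture = fixture;
  }

  // Returns the next recorded event, or undefined once the fixture is exhausted.
  next(): CapturedEvent | undefined {
    if (this.position >= this.fixture.length) return undefined;
    const event = this.fixture[this.position];
    this.position += 1;
    return event;
  }

  get done(): boolean {
    return this.position >= this.fixture.length;
  }
}
```

Because the replay consumes the same event shape the live path produces, the UI needs no separate demo branch; only the event source differs.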

Eight specialised Anthropic agents (intake, architect, skeptic, contract designer, stack, summariser, architecture coach, polish) — each its own tool call with a Zod-validated schema; the architect streams its tool-use JSON incrementally, a bracket scanner extracts complete child objects and emits them as SSE events. A single sentence is refined step by step into a tree of work units, TypeScript contracts (no `any`, no `unknown` — hard-forbidden in the prompt and re-validated through Zod) and a visible decision log. The end product — a markdown TaskBrief — is pasted into a Claude Code session and yields working code. Capture / replay instead of a demo mode: every live run is recordable, replays run deterministically with no Anthropic call.

The full loop — plan to working app

The thesis in one sentence: plainy breaks vague product ideas into artefacts a coding AI can build into running code.

| Idea | plainy replay | Resulting app |
| --- | --- | --- |
| 'An app that teaches me Skat.' | Open replay | skat.prototyp.ms |
| 'A habit tracker for myself.' | Open replay | habits.prototyp.ms |

Demo in 60 seconds

  1. Type a one-sentence idea.

  2. Answer 3–4 intake questions, or let “you decide” fill them in.

  3. The tree blooms: 2–5 work units per node, drilled automatically to depth 2.

  4. In parallel: the skeptic surfaces one risk, the contract designer writes a TS contract per leaf.

  5. Stack card: 2–3 options with pros/cons; every bundled decision lands visibly in the log.

  6. Export modal with a markdown TaskBrief: one click, paste into Claude Code, build starts.

Architecture — 8 specialised agents

Each agent is its own Anthropic tool call with a Zod-validated schema. Several run in parallel; the architect streams.

| Agent | Model | Job |
| --- | --- | --- |
| Einstieg / Intake | Sonnet 4.6 | Turns a one-sentence wish into 3–4 sharpening pills / multi-selects. |
| Architekt / Architect | Sonnet 4.6 · Stream | Splits a node into 2–5 children; flags what should be drilled deeper. |
| Skeptiker / Skeptic | Sonnet 4.6 | Finds one real v1 risk and frames it as a decision. Silence is the default. |
| Vertragsdesigner / Contract designer | Opus 4.7 | One TypeScript contract per leaf: domain-specific types, no `any`, no `unknown`. |
| Stack | Sonnet 4.6 | 2–3 plain-language stack options with pros/cons; bundles silent decisions. |
| Plan-Zusammenfasser / Plan summariser | Haiku 4.5 | 2–3 sentences about the finished project plus a small non-goals list. |
| Architektur-Coach / Architecture coach | Sonnet 4.6 | Identifies 1–5 cross-cutting core modules, before any leaf gets implemented. |
| Polish | Haiku 4.5 | Smooths free-text answers into 1–2 clean sentences without altering intent. |

Validation & robustness

Streaming architect

The architect streams its tool-use JSON; a bracket-scanner extracts complete child objects and emits them as SSE events — nodes appear in the tree incrementally.
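
A bracket scanner of this kind might look like the sketch below. It assumes the streamed tool input has the shape `{"children":[{...},{...}]}`, so child objects open at nesting depth 2; the class name `BracketScanner`, the helper `toSseEvent` and the event name are hypothetical.

```typescript
// Incrementally scans streamed JSON text and returns each child object
// as soon as its closing brace has arrived.
class BracketScanner {
  private readonly childDepth: number;
  private buffer = "";
  private index = 0; // next unscanned character in the buffer
  private depth = 0; // current nesting depth, counting both {} and []
  private inString = false;
  private escaped = false;
  private start = -1; // buffer offset where the current child object began

  // Assumption: children sit at depth 2, i.e. {"children":[ <child>, ... ]}.
  constructor(childDepth = 2) {
    this.childDepth = childDepth;
  }

  push(chunk: string): string[] {
    this.buffer += chunk;
    const complete: string[] = [];
    for (; this.index < this.buffer.length; this.index += 1) {
      const ch = this.buffer[this.index];
      if (this.inString) {
        // Brackets inside string literals must not affect the depth.
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (ch === '"') {
        this.inString = true;
      } else if (ch === "{" || ch === "[") {
        if (ch === "{" && this.depth === this.childDepth) this.start = this.index;
        this.depth += 1;
      } else if (ch === "}" || ch === "]") {
        this.depth -= 1;
        if (ch === "}" && this.depth === this.childDepth && this.start >= 0) {
          complete.push(this.buffer.slice(this.start, this.index + 1));
          this.start = -1;
        }
      }
    }
    return complete;
  }
}

// Formats one completed child as a Server-Sent Events frame.
// Re-serializing guarantees the data field contains no raw newlines.
function toSseEvent(eventName: string, childJson: string): string {
  const singleLine = JSON.stringify(JSON.parse(childJson));
  return `event: ${eventName}\ndata: ${singleLine}\n\n`;
}
```

The scanner keeps its state between `push` calls, so a chunk boundary may fall anywhere, including inside a string or an escape sequence. This sketch never trims its buffer, which is acceptable for a single tool call but would need attention for long streams.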

Zod + retry escalation

Every response runs through a Zod schema. On validation failure, a generic retry helper escalates to Opus and includes the error as a correction hint.
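
The retry helper could be sketched like this. To stay self-contained, the example takes a generic `validate` function instead of importing Zod; in practice that function would presumably wrap a schema check. `callWithEscalation`, `AgentCall`, `Tier` and `Validation` are illustrative names, and a single escalation step is an assumption.

```typescript
// Outcome of validating a raw model response.
type Validation<T> = { ok: true; value: T } | { ok: false; error: string };

type Tier = "haiku" | "sonnet" | "opus";

// Hypothetical agent call: receives the tier to use and an optional correction hint.
type AgentCall = (tier: Tier, correctionHint?: string) => Promise<string>;

async function callWithEscalation<T>(
  call: AgentCall,
  validate: (raw: string) => Validation<T>,
  initialTier: Tier,
): Promise<T> {
  const first = validate(await call(initialTier));
  if (first.ok) return first.value;

  // Validation failed: retry once on Opus and pass the error along as a hint.
  const second = validate(await call("opus", first.error));
  if (second.ok) return second.value;

  throw new Error(`Validation failed after escalation: ${second.error}`);
}
```

Passing the validation error back gives the stronger model a concrete target for its correction rather than a blind second attempt.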

Re-entry guard

Prevents the race where skeptic-requested decomposition and auto-recursion would land on the same node simultaneously.
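
One simple way to build such a guard is a set of node IDs that are currently being decomposed. The sketch below assumes string node IDs and skips the second caller; whether plainy skips, queues or merges the second request is not stated, and `ReentryGuard` is a hypothetical name.

```typescript
class ReentryGuard {
  private readonly inFlight = new Set<string>();

  // Runs `work` for the node unless a decomposition of it is already running.
  // Resolves to true if the work ran, false if the call was skipped.
  async run(nodeId: string, work: () => Promise<void>): Promise<boolean> {
    if (this.inFlight.has(nodeId)) return false;
    this.inFlight.add(nodeId);
    try {
      await work();
      return true;
    } finally {
      // Always release the node, even if the work throws.
      this.inFlight.delete(nodeId);
    }
  }
}
```

The check and the `add` happen synchronously before the first `await`, so two callers in the same event loop cannot both pass the guard.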

Smart model mix

Haiku for high-frequency micro-text, Sonnet for structural work, Opus where precision matters (contracts). Dispatch lives centrally in server/models.ts, swappable per (agent, operation).
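
A central dispatch table along these lines might look as follows. The tier assignments mirror the agent table above; the identifiers (`Agent`, `Operation`, `modelFor`) and the use of tier labels instead of concrete model IDs are assumptions for illustration, not the contents of server/models.ts.

```typescript
type Agent =
  | "intake"
  | "architect"
  | "skeptic"
  | "contractDesigner"
  | "stack"
  | "summariser"
  | "architectureCoach"
  | "polish";

type Operation = "default" | "retry";
type ModelTier = "haiku" | "sonnet" | "opus";

// Default tier per agent, taken from the agent table.
const DEFAULT_TIER: Record<Agent, ModelTier> = {
  intake: "sonnet",
  architect: "sonnet",
  skeptic: "sonnet",
  contractDesigner: "opus",
  stack: "sonnet",
  summariser: "haiku",
  architectureCoach: "sonnet",
  polish: "haiku",
};

// Single lookup point, so the model for any (agent, operation) pair
// can be changed in one place.
function modelFor(agent: Agent, operation: Operation): ModelTier {
  // Validation retries escalate to Opus regardless of the agent.
  if (operation === "retry") return "opus";
  return DEFAULT_TIER[agent];
}
```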

Design notes, briefly

Two layers, one source of truth

PM language up front (title, summary, decisions); engineer view is an optional projection of the same nodes (contracts, acceptance criteria). No second code path — just different slots.
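
As a rough illustration of "different slots over the same nodes", both views can be derived from one node type. The field names below are assumptions based on the slots named above, not plainy's actual data model.

```typescript
// One node type is the single source of truth for both audiences.
interface PlanNode {
  title: string;
  summary: string;
  decisions: string[];
  contract: string | null; // null until the contract designer has run
  acceptanceCriteria: string[];
}

type PmSlots = Pick<PlanNode, "title" | "summary" | "decisions">;
type EngineerSlots = Pick<PlanNode, "contract" | "acceptanceCriteria">;

// PM view: always shown.
function pmSlots(node: PlanNode): PmSlots {
  return { title: node.title, summary: node.summary, decisions: node.decisions };
}

// Engineer view: an optional projection of the same node.
function engineerSlots(node: PlanNode): EngineerSlots {
  return { contract: node.contract, acceptanceCriteria: node.acceptanceCriteria };
}
```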

“You decide” as a trust mechanism

Every question has a precomputed AI answer. If used, it lands visibly in the decision log with rationale — no silent decisions, not even from the stack card.
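
In data terms, this could be as small as the sketch below: every question carries its precomputed answer, and resolving it always produces a log entry that records who decided. All type and field names here are hypothetical.

```typescript
interface IntakeQuestion {
  id: string;
  prompt: string;
  precomputedAnswer: string; // the AI answer prepared in advance
  rationale: string; // why the AI would pick that answer
}

interface DecisionLogEntry {
  questionId: string;
  answer: string;
  decidedBy: "user" | "ai";
  rationale: string | null; // present for AI decisions
}

// Pass null as userAnswer when the user picked "you decide".
function resolveQuestion(question: IntakeQuestion, userAnswer: string | null): DecisionLogEntry {
  if (userAnswer !== null) {
    return { questionId: question.id, answer: userAnswer, decidedBy: "user", rationale: null };
  }
  return {
    questionId: question.id,
    answer: question.precomputedAnswer,
    decidedBy: "ai",
    rationale: question.rationale,
  };
}
```

Since the AI answer exists before the question is shown, a replay can use it without any model call.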

Capture / replay instead of a demo mode

Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path. Cards resolve with pause + highlight + pause.

Contracts without `any`

The contract designer prompt hard-forbids `any`, `unknown` and empty interfaces; Zod validates after. For Skat that means SkatCard, Reizen, Spitzen rather than Card, Bidding, Bonus.
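
A post-generation check for the forbidden constructs could be sketched as below. This is a plain text scan written for illustration; how plainy wires the check into its Zod schema is not described, and the `SkatCard` fields are an invented example of a domain-specific type.

```typescript
// Illustrative contract in the intended style; the fields are assumptions.
const exampleContract = `
interface SkatCard {
  suit: "clubs" | "spades" | "hearts" | "diamonds";
  rank: "7" | "8" | "9" | "10" | "jack" | "queen" | "king" | "ace";
}
`;

// Returns a list of rule violations; an empty list means the contract passes.
// Note: a word-level scan also flags these words inside comments or strings.
function findContractViolations(source: string): string[] {
  const violations: string[] = [];
  if (/\bany\b/.test(source)) violations.push("uses any");
  if (/\bunknown\b/.test(source)) violations.push("uses unknown");
  if (/\binterface\s+\w+\s*\{\s*\}/.test(source)) violations.push("empty interface");
  return violations;
}
```

A non-empty result can be fed back to the model as the correction hint on retry.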

Deployment

Live on a DigitalOcean VPS with PM2 (no cold start). nginx reverse-proxies /api/* to 127.0.0.1:4747, serves dist/ statically, with proxy_buffering off on /api/agent/stream — otherwise the SSE stream arrives buffered and the streaming animation is dead. TLS via certbot/Let's Encrypt with auto-renew. Live mode is gated by an X-Plainy-Auth header against an env var; the replay path is untouched because it never hits the network.
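
The header gate for live mode might be sketched like this. Only the header name comes from the text; the function name, the lower-cased header lookup (as Node's HTTP server provides) and the constant-time comparison are assumptions.

```typescript
// Returns true only if the request presents the expected shared secret.
// `headers` is assumed to use lower-cased names; `expectedSecret` would
// come from an environment variable.
function isLiveRequestAllowed(
  headers: Record<string, string | undefined>,
  expectedSecret: string | undefined,
): boolean {
  // Without a configured secret, live mode stays closed.
  if (expectedSecret === undefined || expectedSecret.length === 0) return false;

  const presented = headers["x-plainy-auth"];
  if (presented === undefined || presented.length !== expectedSecret.length) return false;

  // Compare every character so timing does not reveal the first mismatch.
  let diff = 0;
  for (let i = 0; i < presented.length; i += 1) {
    diff |= presented.charCodeAt(i) ^ expectedSecret.charCodeAt(i);
  }
  return diff === 0;
}
```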