Plainy

Recursive planning pipeline of eight typed Anthropic agents: turns a single sentence into a TypeScript-contract-backed TaskBrief that Claude Code translates directly into working code.

Replays: learn Skat · habit tracker
Resulting apps: skat.prototyp.ms · habits.prototyp.ms

Key engineering call

Capture / replay instead of a demo mode. Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path, no drift risk between demo and live. Cost: 'you decide' answers have to be precomputed for replays to be deterministic. Pays off because every live demo automatically becomes a forever-playable demo.

Eight specialised Anthropic agents (intake, architect, skeptic, contract designer, stack, summariser, architecture coach, polish) each run as their own tool call with a Zod-validated schema. The architect streams its tool-use JSON incrementally; a bracket scanner extracts complete child objects and emits them as SSE events. A single sentence is refined step by step into a tree of work units, TypeScript contracts (no `any`, no `unknown`: both are hard-forbidden in the prompt and re-validated through Zod) and a visible decision log. The end product, a markdown TaskBrief, is pasted into a Claude Code session and yields working code. Replays run deterministically with no Anthropic call.

The full loop — plan to working app

The thesis in one sentence: plainy breaks vague product ideas into artefacts a coding AI can build into running code.

Idea	plainy replay	Resulting app
'An app that teaches me Skat.'	Open replay	skat.prototyp.ms
'A habit tracker for myself.'	Open replay	habits.prototyp.ms

Demo in 60 seconds

1. Type a one-sentence idea.
2. Answer 3–4 intake questions, or let “you decide” fill them in.
3. The tree blooms: 2–5 work units per node, drilled automatically to depth 2.
4. In parallel: the skeptic surfaces one risk, the contract designer writes a TS contract per leaf.
5. Stack card: 2–3 options with pros/cons; every bundled decision lands visibly in the log.
6. Export modal with a markdown TaskBrief: one click, paste into Claude Code, build starts.

Architecture — 8 specialised agents

Each agent is its own Anthropic tool call with a Zod-validated schema. Several run in parallel; the architect streams.

Agent	Model	Job
Intake	Sonnet 4.6	Turns a one-sentence wish into 3–4 sharpening pills / multi-selects.
Architect	Sonnet 4.6 · Stream	Splits a node into 2–5 children; flags what should be drilled deeper.
Skeptic	Sonnet 4.6	Finds one real v1 risk and frames it as a decision. Silence is the default.
Contract designer	Opus 4.7	One TypeScript contract per leaf: domain-specific types, no `any`, no `unknown`.
Stack	Sonnet 4.6	2–3 plain-language stack options with pros/cons; bundles silent decisions.
Plan summariser	Haiku 4.5	2–3 sentences about the finished project plus a small non-goals list.
Architecture coach	Sonnet 4.6	Identifies 1–5 cross-cutting core modules before any leaf gets implemented.
Polish	Haiku 4.5	Smooths free-text answers into 1–2 clean sentences without altering intent.

Validation & robustness

Streaming architect

The architect streams its tool-use JSON; a bracket scanner extracts complete child objects and emits them as SSE events, so nodes appear in the tree incrementally.
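The scanning step can be sketched roughly as follows. This is a minimal illustration, not plainy's actual scanner: it assumes the streamed payload has the shape `{"children":[{...},{...}]}`, so child objects sit at brace depth 2, and the names `ChildObjectScanner` and `toSseEvent` as well as the `node` event name are hypothetical.

```typescript
// Hypothetical scanner. Assumes the streamed tool input looks like
// {"children":[{...},{...}]}, so each child object opens at brace depth 2.
class ChildObjectScanner {
  private buffer = "";
  private pos = 0; // next unread index in buffer
  private depth = 0; // current {} nesting depth
  private start = -1; // index where the current child object began
  private inString = false;
  private escaped = false;

  constructor(private readonly childDepth = 2) {}

  // Feed one streamed chunk; returns the raw JSON of every child completed by it.
  push(chunk: string): string[] {
    this.buffer += chunk;
    const complete: string[] = [];
    for (; this.pos < this.buffer.length; this.pos++) {
      const ch = this.buffer[this.pos];
      if (this.inString) {
        // Braces inside string literals must not change the depth.
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (ch === '"') {
        this.inString = true;
      } else if (ch === "{") {
        this.depth++;
        if (this.depth === this.childDepth) this.start = this.pos;
      } else if (ch === "}") {
        if (this.depth === this.childDepth && this.start >= 0) {
          complete.push(this.buffer.slice(this.start, this.pos + 1));
          this.start = -1;
        }
        this.depth--;
      }
    }
    return complete;
  }
}

// Formats one server-sent event; multi-line data is split into several data: lines.
function toSseEvent(event: string, data: string): string {
  const dataLines = data.split("\n").map((line) => `data: ${line}`).join("\n");
  return `event: ${event}\n${dataLines}\n\n`;
}
```

Each string returned by `push` is a complete JSON object, so it can be parsed and schema-checked on its own before it is written to the response as an event. The sketch keeps the whole buffer in memory, which is acceptable for a single tool call but would need trimming for long streams.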

Zod + retry escalation

Every response runs through a Zod schema. On validation failure, a generic retry helper escalates to Opus and includes the error as a correction hint.
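One way such a retry helper could look is sketched below. To stay self-contained it does not import Zod or the Anthropic SDK: `validate` stands in for a schema check and returns a result loosely modelled on what Zod's `safeParse` produces (with the error reduced to a string), and `callAgent` stands in for the model call. All names here are hypothetical, and a single retry is an assumption.

```typescript
// Result shape loosely modelled on Zod's safeParse, reduced to the fields used here.
type Validation<T> =
  | { success: true; data: T }
  | { success: false; error: string };

type Tier = "haiku" | "sonnet" | "opus";

interface Attempt {
  tier: Tier;
  correctionHint?: string; // validation error from the previous attempt, if any
}

// Hypothetical helper: the agent call and the validator are injected,
// so nothing here implies a particular SDK signature.
async function callWithEscalation<T>(
  callAgent: (attempt: Attempt) => Promise<string>,
  validate: (raw: string) => Validation<T>,
  firstTier: Tier,
): Promise<T> {
  const first = validate(await callAgent({ tier: firstTier }));
  if (first.success) return first.data;

  // Assumed policy: one retry on the strongest tier, with the error as a hint.
  const second = validate(
    await callAgent({ tier: "opus", correctionHint: first.error }),
  );
  if (second.success) return second.data;

  throw new Error(`validation failed after escalation: ${second.error}`);
}
```

Keeping the helper generic over `T` means every agent can share it; only the validator and the starting tier differ per call.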

Re-entry guard

Prevents the race where skeptic-requested decomposition and auto-recursion would land on the same node simultaneously.
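A guard of this kind can be as small as a set of in-flight node ids. The sketch below is an assumption about the mechanism, not the project's code; `ReentryGuard` and the convention of returning `null` for a skipped call are hypothetical.

```typescript
// Hypothetical guard keyed by node id.
class ReentryGuard {
  private readonly inFlight = new Set<string>();

  // Runs task unless the node is already being decomposed.
  // Returns null when the call was skipped.
  async run<T>(nodeId: string, task: () => Promise<T>): Promise<T | null> {
    if (this.inFlight.has(nodeId)) return null;
    this.inFlight.add(nodeId);
    try {
      return await task();
    } finally {
      // Always release the node, even if the task throws.
      this.inFlight.delete(nodeId);
    }
  }
}
```

Because the check and the `add` happen synchronously before the first `await`, two callers in the same event loop cannot both pass the check for the same node.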

Smart model mix

Haiku for high-frequency micro-text, Sonnet for structural work, Opus where precision matters (contracts). Dispatch lives centrally in server/models.ts, swappable per (agent, operation).
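A central dispatch table keyed by (agent, operation) might look like the sketch below. The contents of `server/models.ts` are not shown in this document, so the agent keys, the `initial` / `retry` operations, and the tier labels are illustrative assumptions; the tier per agent follows the table above, and real model identifiers are deliberately left out.

```typescript
type Tier = "haiku" | "sonnet" | "opus";
type Operation = "initial" | "retry";
type Agent =
  | "intake"
  | "architect"
  | "skeptic"
  | "contractDesigner"
  | "stack"
  | "summariser"
  | "architectureCoach"
  | "polish";

// Assumed default: the retry operation escalates to the strongest tier.
const entry = (initial: Tier): Record<Operation, Tier> => ({
  initial,
  retry: "opus",
});

// One row per agent; swapping a model means editing exactly one cell.
const MODEL_TABLE: Record<Agent, Record<Operation, Tier>> = {
  intake: entry("sonnet"),
  architect: entry("sonnet"),
  skeptic: entry("sonnet"),
  contractDesigner: entry("opus"),
  stack: entry("sonnet"),
  summariser: entry("haiku"),
  architectureCoach: entry("sonnet"),
  polish: entry("haiku"),
};

function modelFor(agent: Agent, operation: Operation = "initial"): Tier {
  return MODEL_TABLE[agent][operation];
}
```

Typing the table as a full `Record` makes the compiler reject a new agent or operation that has no model assigned.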

Design notes, briefly

Two layers, one source of truth

PM language up front (title, summary, decisions); engineer view is an optional projection of the same nodes (contracts, acceptance criteria). No second code path — just different slots.

“You decide” as a trust mechanism

Every question has a precomputed AI answer. If used, it lands visibly in the decision log with rationale — no silent decisions, not even from the stack card.
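As a rough illustration of how a delegated answer could end up in the log, the sketch below resolves a question either from the user's input or from its precomputed answer. The field names (`precomputed`, `decidedBy`, `rationale`) are hypothetical and only mirror the behaviour described above.

```typescript
interface Question {
  id: string;
  prompt: string;
  // Precomputed AI answer, available before the user is asked.
  precomputed: { answer: string; rationale: string };
}

interface DecisionLogEntry {
  questionId: string;
  answer: string;
  decidedBy: "user" | "ai";
  rationale?: string; // present whenever the AI decided
}

// userAnswer === null models the user choosing "you decide".
function resolveQuestion(q: Question, userAnswer: string | null): DecisionLogEntry {
  if (userAnswer !== null) {
    return { questionId: q.id, answer: userAnswer, decidedBy: "user" };
  }
  return {
    questionId: q.id,
    answer: q.precomputed.answer,
    decidedBy: "ai",
    rationale: q.precomputed.rationale,
  };
}
```

Since both branches return a log entry, no answer can be applied to the plan without also being recorded.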

Capture / replay instead of a demo mode

Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path. Cards resolve with pause + highlight + pause.
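The idea of a single code path with interchangeable sources can be sketched as two functions that share one call signature. This is a simplified assumption about the design: the fixture format, the `AgentCall` type, and the strictly sequential replay order are all hypothetical, and pacing (pause + highlight + pause) is left out.

```typescript
interface RecordedEvent {
  agent: string;
  payload: string;
}

// Live and replay both satisfy this signature, so callers cannot tell them apart.
type AgentCall = (agent: string, input: string) => Promise<string>;

// Wraps a live call so that every response is also appended to the fixture.
function recording(live: AgentCall, fixture: RecordedEvent[]): AgentCall {
  return async (agent, input) => {
    const payload = await live(agent, input);
    fixture.push({ agent, payload });
    return payload;
  };
}

// Replays a fixture in recorded order; no network call happens here.
function replaying(fixture: readonly RecordedEvent[]): AgentCall {
  let cursor = 0;
  return async (agent) => {
    const event = fixture[cursor];
    if (event === undefined || event.agent !== agent) {
      throw new Error(`fixture mismatch at ${cursor}: expected a call to ${agent}`);
    }
    cursor++;
    return event.payload;
  };
}
```

The mismatch check is what turns drift into a loud failure: if the pipeline ever asks for a different agent than the recording contains, the replay stops instead of showing stale data.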

Contracts without `any`

The contract designer prompt hard-forbids `any`, `unknown` and empty interfaces; Zod validates after. For Skat it’s `SkatCard`, `Reizen`, `Spitzen`, not `Card`, `Bidding`, `Bonus`.
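How the post-hoc check is implemented is not spelled out here. One plausible approach is a small text scan over the generated contract source, which could then back a custom schema check such as a Zod `refine`. The function below is a hypothetical sketch of that scan; it is regex-based, so it can produce false positives on comments or string literals.

```typescript
// Hypothetical check over generated contract source text.
function findContractViolations(source: string): string[] {
  const violations: string[] = [];
  // `any` / `unknown` in a type position: after : < | & , = ( or the `as` keyword.
  const typeUse = /(?:[:<|&,=(]|\bas)\s*(any|unknown)\b/g;
  // Interfaces whose body contains nothing but whitespace.
  const emptyInterface = /\binterface\s+(\w+)[^{]*\{\s*\}/g;

  let match: RegExpExecArray | null;
  while ((match = typeUse.exec(source)) !== null) {
    violations.push(`forbidden type: ${match[1]}`);
  }
  while ((match = emptyInterface.exec(source)) !== null) {
    violations.push(`empty interface: ${match[1]}`);
  }
  return violations;
}
```

An empty result means the contract passes; a non-empty one can be fed back to the model as a correction hint.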

Deployment

Live on a DigitalOcean VPS with PM2 (no cold start). nginx reverse-proxies /api/* to 127.0.0.1:4747, serves dist/ statically, with proxy_buffering off on /api/agent/stream — otherwise the SSE stream arrives buffered and the streaming animation is dead. TLS via certbot/Let's Encrypt with auto-renew. Live mode is gated by an X-Plainy-Auth header against an env var; the replay path is untouched because it never hits the network.
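The header gate could be reduced to a pure function like the one sketched below. Only the header name `X-Plainy-Auth` comes from the description above; the function names, the fail-closed behaviour when no token is configured, and the constant-time comparison are assumptions.

```typescript
// Compares two strings without exiting early on the first differing character.
function constantTimeEquals(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// Hypothetical gate. expectedToken would come from an environment variable.
function isLiveRequestAllowed(
  headers: Record<string, string | undefined>,
  expectedToken: string | undefined,
): boolean {
  // Assumption: with no token configured, live mode stays closed.
  if (!expectedToken) return false;
  // HTTP header names are case-insensitive, so match on the lowercased key.
  const key = Object.keys(headers).find(
    (name) => name.toLowerCase() === "x-plainy-auth",
  );
  const provided = key === undefined ? undefined : headers[key];
  return provided !== undefined && constantTimeEquals(provided, expectedToken);
}
```

Keeping the check free of framework types makes it easy to call from whichever server middleware handles the `/api/*` routes.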