Plainy
Recursive planning pipeline of eight typed Anthropic agents: turns a single sentence into a TypeScript-contract-backed TaskBrief that Claude Code translates directly into working code.
Key engineering call
Capture / replay instead of a demo mode. Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path, no drift risk between demo and live. Cost: 'you decide' answers have to be precomputed for replays to be deterministic. Pays off because every live demo automatically becomes a forever-playable demo.
Eight specialised Anthropic agents (intake, architect, skeptic, contract designer, stack, summariser, architecture coach, polish), each its own tool call with a Zod-validated schema. The architect streams its tool-use JSON incrementally; a bracket scanner extracts complete child objects and emits them as SSE events. A single sentence is refined step by step into a tree of work units, TypeScript contracts (no `any`, no `unknown`: both hard-forbidden in the prompt and re-validated through Zod) and a visible decision log. The end product, a markdown TaskBrief, is pasted into a Claude Code session and yields working code. Capture / replay instead of a demo mode: every live run is recordable, and replays run deterministically with no Anthropic call.
The full loop — plan to working app
The thesis in one sentence: plainy breaks vague product ideas into artefacts a coding AI can build into running code.
| Idea | plainy replay | Resulting app |
|---|---|---|
| 'An app that teaches me Skat.' | Open replay | skat.prototyp.ms |
| 'A habit tracker for myself.' | Open replay | habits.prototyp.ms |
Demo in 60 seconds
1. Type a one-sentence idea.
2. Answer 3–4 intake questions, or let “you decide” fill them in.
3. The tree blooms: 2–5 work units per node, drilled automatically to depth 2.
4. In parallel: the skeptic surfaces one risk, the contract designer writes a TS contract per leaf.
5. Stack card: 2–3 options with pros/cons; every bundled decision lands visibly in the log.
6. Export modal with a markdown TaskBrief: one click, paste into Claude Code, build starts.
Architecture — 8 specialised agents
Each agent is its own Anthropic tool call with a Zod-validated schema. Several run in parallel; the architect streams.
| Agent | Model | Job |
|---|---|---|
| Einstieg / Intake | Sonnet 4.6 | Turns a one-sentence wish into 3–4 sharpening pills / multi-selects. |
| Architekt / Architect | Sonnet 4.6 · Stream | Splits a node into 2–5 children; flags what should be drilled deeper. |
| Skeptiker / Skeptic | Sonnet 4.6 | Finds one real v1 risk and frames it as a decision. Silence is the default. |
| Vertragsdesigner / Contract designer | Opus 4.7 | One TypeScript contract per leaf: domain-specific types, no `any`, no `unknown`. |
| Stack | Sonnet 4.6 | 2–3 plain-language stack options with pros/cons; bundles silent decisions. |
| Plan-Zusammenfasser / Plan summariser | Haiku 4.5 | 2–3 sentences about the finished project plus a small non-goals list. |
| Architektur-Coach / Architecture coach | Sonnet 4.6 | Identifies 1–5 cross-cutting core modules before any leaf gets implemented. |
| Polish | Haiku 4.5 | Smooths free-text answers into 1–2 clean sentences without altering intent. |
Validation & robustness
Streaming architect
The architect streams its tool-use JSON; a bracket-scanner extracts complete child objects and emits them as SSE events — nodes appear in the tree incrementally.
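A minimal sketch of how such a bracket scanner could work, assuming the streamed tool input has the shape `{"children":[{...},{...}]}` so that child objects open at nesting depth 2. The class name `BracketScanner`, the helper `toSseEvent` and the event name passed to it are hypothetical and not taken from plainy's code.

```typescript
// Hypothetical sketch: tracks nesting depth across chunks and returns the
// text of every object that opens and closes at `emitDepth`.
class BracketScanner {
  private depth = 0;
  private inString = false;
  private escaped = false;
  private current: string | null = null; // text of the object being captured
  private readonly emitDepth: number;

  constructor(emitDepth: number) {
    this.emitDepth = emitDepth;
  }

  // Feed one streamed chunk; returns the objects completed by this chunk.
  push(chunk: string): string[] {
    const complete: string[] = [];
    for (let i = 0; i < chunk.length; i++) {
      const ch = chunk.charAt(i);
      if (this.inString) {
        // Inside a JSON string, brackets are plain text; only track escapes.
        if (this.current !== null) this.current += ch;
        if (this.escaped) this.escaped = false;
        else if (ch === "\\") this.escaped = true;
        else if (ch === '"') this.inString = false;
        continue;
      }
      if (ch === '"') {
        this.inString = true;
      } else if (ch === "{" || ch === "[") {
        if (ch === "{" && this.depth === this.emitDepth && this.current === null) {
          this.current = ""; // a child object starts here
        }
        this.depth += 1;
      } else if (ch === "}" || ch === "]") {
        this.depth -= 1;
      }
      if (this.current !== null) {
        this.current += ch;
        if (ch === "}" && this.depth === this.emitDepth) {
          complete.push(this.current); // the child object is complete
          this.current = null;
        }
      }
    }
    return complete;
  }
}

// Formats one complete object as a server-sent event. Re-serialising keeps
// the payload on a single `data:` line even if the model emitted newlines.
function toSseEvent(eventName: string, objectText: string): string {
  const oneLine = JSON.stringify(JSON.parse(objectText));
  return `event: ${eventName}\ndata: ${oneLine}\n\n`;
}
```

Because the scanner keeps its state between `push` calls, a chunk boundary can fall anywhere, including in the middle of a string or an escape sequence.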
Zod + retry escalation
Every response runs through a Zod schema. On validation failure, a generic retry helper escalates to Opus and includes the error as a correction hint.
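The retry escalation could be sketched roughly as follows. To stay dependency-free, this sketch uses a small `ValidationResult` type that is only structurally similar to the result of Zod's `safeParse`; the names `AgentCall` and `callWithEscalation`, and the choice of exactly one retry, are assumptions rather than plainy's actual helper.

```typescript
// Stand-in for a schema validation result (assumption: in the real
// pipeline this would come from a Zod schema).
type ValidationResult<T> =
  | { success: true; data: T }
  | { success: false; error: string };

type Model = "haiku" | "sonnet" | "opus";

interface AgentCall<T> {
  // Hypothetical: performs one model call and returns the raw tool input.
  run(model: Model, correctionHint?: string): Promise<unknown>;
  // Validates the raw tool input against the agent's schema.
  validate(raw: unknown): ValidationResult<T>;
}

// Tries the agent's usual model first. On validation failure, retries once
// on Opus and passes the validation error along as a correction hint.
async function callWithEscalation<T>(call: AgentCall<T>, baseModel: Model): Promise<T> {
  const first = call.validate(await call.run(baseModel));
  if (first.success) return first.data;

  const second = call.validate(await call.run("opus", first.error));
  if (second.success) return second.data;

  throw new Error(`validation failed after escalation: ${second.error}`);
}
```

Keeping the helper generic over `T` means every agent can share it; only the `run` and `validate` pair differs per agent.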
Re-entry guard
Prevents the race where skeptic-requested decomposition and auto-recursion would land on the same node simultaneously.
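One common way to build such a guard is an in-flight map keyed by node id, sketched below. Whether plainy lets the second caller join the running decomposition or simply drops it is not stated, so the joining behaviour here is an assumption, as is the name `ReentryGuard`.

```typescript
// Hypothetical sketch: at most one decomposition per node id at a time.
class ReentryGuard {
  private readonly inFlight = new Map<string, Promise<void>>();

  // Starts `work` for the node unless one is already running; a second
  // caller for the same node receives the promise of the running work.
  run(nodeId: string, work: () => Promise<void>): Promise<void> {
    const existing = this.inFlight.get(nodeId);
    if (existing !== undefined) return existing;

    const clear = (): void => {
      this.inFlight.delete(nodeId);
    };
    const started = work().then(
      () => clear(),
      (err) => {
        clear(); // release the node even when the work fails
        throw err;
      }
    );
    this.inFlight.set(nodeId, started);
    return started;
  }
}
```

The entry is removed on both success and failure, so a failed decomposition does not block later attempts on the same node.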
Smart model mix
Haiku for high-frequency micro-text, Sonnet for structural work, Opus where precision matters (contracts). Dispatch lives centrally in `server/models.ts`, swappable per (agent, operation).
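A central dispatch table of this kind might look like the sketch below. The defaults mirror the agent table above; the identifiers, the operation name `retry` and the single override are hypothetical, and the real `server/models.ts` may be organised differently. Model tiers stand in for concrete model ids.

```typescript
type Agent =
  | "intake"
  | "architect"
  | "skeptic"
  | "contractDesigner"
  | "stack"
  | "summariser"
  | "architectureCoach"
  | "polish";

type ModelTier = "haiku" | "sonnet" | "opus";

// Default tier per agent, following the table in the architecture section.
const defaultTier: Record<Agent, ModelTier> = {
  intake: "sonnet",
  architect: "sonnet",
  skeptic: "sonnet",
  contractDesigner: "opus",
  stack: "sonnet",
  summariser: "haiku",
  architectureCoach: "sonnet",
  polish: "haiku",
};

// Optional overrides keyed by "agent:operation" (operation names are made up).
const tierOverrides = new Map<string, ModelTier>([["skeptic:retry", "opus"]]);

// Single lookup point: an override for (agent, operation) wins, else the default.
function modelFor(agent: Agent, operation: string = "default"): ModelTier {
  return tierOverrides.get(`${agent}:${operation}`) ?? defaultTier[agent];
}
```

With one lookup function, swapping a model for a single operation is a one-line change to the override map.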
Design notes, briefly
Two layers, one source of truth
PM language up front (title, summary, decisions); engineer view is an optional projection of the same nodes (contracts, acceptance criteria). No second code path — just different slots.
“You decide” as a trust mechanism
Every question has a precomputed AI answer. If used, it lands visibly in the decision log with rationale — no silent decisions, not even from the stack card.
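As a rough illustration, the data flow could be modelled like this. All type and field names are hypothetical, and logging user answers alongside AI answers is an assumption; the source only states that AI-made decisions land in the log with their rationale.

```typescript
interface IntakeQuestion {
  id: string;
  prompt: string;
  // Precomputed alongside the question, so "you decide" needs no extra model call.
  suggested: { answer: string; rationale: string };
}

interface DecisionLogEntry {
  questionId: string;
  answer: string;
  decidedBy: "user" | "ai";
  rationale: string | null;
}

// Resolves a question. Passing `null` as the user answer means "you decide":
// the precomputed answer is used and recorded together with its rationale.
function resolveQuestion(
  question: IntakeQuestion,
  userAnswer: string | null,
  log: DecisionLogEntry[]
): string {
  const entry: DecisionLogEntry =
    userAnswer === null
      ? {
          questionId: question.id,
          answer: question.suggested.answer,
          decidedBy: "ai",
          rationale: question.suggested.rationale,
        }
      : { questionId: question.id, answer: userAnswer, decidedBy: "user", rationale: null };
  log.push(entry);
  return entry.answer;
}
```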
Capture / replay instead of a demo mode
Every live run is recordable; replays are just a cursor over the fixture with auto-resolve. No parallel demo code path. Cards resolve with pause + highlight + pause.
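The idea of a single code path can be sketched as one event handler fed from two sources: the live run (wrapped so that it records) or a cursor over the recorded fixture. The event shapes and the names `recording` and `ReplayCursor` are assumptions for illustration; pacing and highlighting are left out.

```typescript
// Hypothetical event shapes; a question carries its precomputed answer so
// a replay can auto-resolve it without user input or a model call.
type PlanEvent =
  | { type: "node"; id: string; title: string }
  | { type: "question"; id: string; suggestedAnswer: string };

type Handler = (event: PlanEvent) => void;

// Live mode: wrap the UI handler so every event is also appended to the fixture.
function recording(handler: Handler, fixture: PlanEvent[]): Handler {
  return (event) => {
    fixture.push(event);
    handler(event);
  };
}

// Replay mode: a cursor that feeds recorded events to the same handler.
class ReplayCursor {
  private position = 0;
  private readonly fixture: readonly PlanEvent[];

  constructor(fixture: readonly PlanEvent[]) {
    this.fixture = fixture;
  }

  // Delivers the next recorded event; returns false once the fixture is exhausted.
  step(handler: Handler): boolean {
    const event = this.fixture[this.position];
    if (event === undefined) return false;
    this.position += 1;
    handler(event);
    return true;
  }
}
```

Since the handler cannot tell whether an event came from the network or from the fixture, there is nothing demo-specific that could drift from the live behaviour.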
Contracts without `any`
The contract designer prompt hard-forbids `any`, `unknown` and empty interfaces; Zod validates afterwards. For Skat that means `SkatCard`, `Reizen`, `Spitzen`, not `Card`, `Bidding`, `Bonus`.
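How the post-hoc check is implemented is not spelled out. One crude possibility is a text-level scan of the generated contract source, which could then be attached to a Zod string schema as a custom refinement. The function below is such a sketch; its name and the regular expressions are assumptions, and it will also flag the words `any` or `unknown` inside comments.

```typescript
interface ContractViolation {
  rule: "any" | "unknown" | "empty-interface";
  match: string;
}

// Scans contract source text for the three forbidden constructs.
// Deliberately simple: it works on raw text, not on a parsed syntax tree.
function findContractViolations(source: string): ContractViolation[] {
  const violations: ContractViolation[] = [];
  let m: RegExpExecArray | null;

  const looseType = /\b(any|unknown)\b/g;
  while ((m = looseType.exec(source)) !== null) {
    violations.push({ rule: m[1] === "any" ? "any" : "unknown", match: m[0] });
  }

  const emptyInterface = /\binterface\s+\w+\s*\{\s*\}/g;
  while ((m = emptyInterface.exec(source)) !== null) {
    violations.push({ rule: "empty-interface", match: m[0] });
  }
  return violations;
}
```

A domain-specific contract such as `interface SkatCard { suit: "clubs" | "spades" | "hearts" | "diamonds"; }` passes, while `interface Card {}` or `type Bidding = any;` is reported.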
Deployment
Live on a DigitalOcean VPS with PM2 (no cold start). nginx reverse-proxies `/api/*` to `127.0.0.1:4747` and serves `dist/` statically, with `proxy_buffering off` on `/api/agent/stream`; otherwise the SSE stream arrives buffered and the streaming animation is dead. TLS via certbot/Let's Encrypt with auto-renew. Live mode is gated by an `X-Plainy-Auth` header checked against an env var; the replay path is untouched because it never hits the network.
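The header gate for live mode could be as small as the following sketch. The function name is hypothetical, and the expected token is passed in as a parameter because the env var's name is not given; Node lower-cases incoming header names, hence the lower-case lookup.

```typescript
// Returns true only when the request carries the expected X-Plainy-Auth value.
// Fails closed: with no token configured, live mode stays disabled.
function isLiveAllowed(
  headers: Record<string, string | undefined>,
  expectedToken: string | undefined
): boolean {
  if (expectedToken === undefined || expectedToken === "") return false;
  return headers["x-plainy-auth"] === expectedToken;
}
```

For a production gate, a constant-time comparison such as Node's `crypto.timingSafeEqual` would be preferable to `===`.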