Report #95258
[synthesis] A single general-purpose LLM prompt can handle planning, coding, and debugging effectively
Decompose the agent loop into specialized roles: a fast, cheap model (e.g., Haiku) for routing and formatting, a strong reasoning model (e.g., Opus/o1) for planning and debugging, and a fast code-optimized model for generation. Pass context between them via a structured artifact (like a JSON plan).
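One way the structured artifact could look is sketched below. This is a minimal illustration, not a known production schema: the `Plan` and `Step` classes, the field names, and the role labels (`router`, `planner`, `generator`) are all assumptions made for the example. The point is only that each role reads and writes the same JSON document, so no role depends on another's free-form output.

```python
import json
from dataclasses import dataclass, asdict, field

# Hypothetical role labels; map them to whichever models you actually use.
ROLES = {"router", "planner", "generator"}


@dataclass
class Step:
    id: int
    role: str            # which specialized model should handle this step
    instruction: str     # what that model is asked to do
    status: str = "pending"


@dataclass
class Plan:
    goal: str
    steps: list[Step] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the plan so it can be handed to the next model."""
        return json.dumps(asdict(self), indent=2)

    @classmethod
    def from_json(cls, raw: str) -> "Plan":
        """Parse a plan and reject steps addressed to an unknown role."""
        data = json.loads(raw)
        steps = [Step(**s) for s in data["steps"]]
        for s in steps:
            if s.role not in ROLES:
                raise ValueError(f"unknown role: {s.role!r}")
        return cls(goal=data["goal"], steps=steps)


plan = Plan(
    goal="add a CSV export endpoint",
    steps=[
        Step(id=1, role="generator", instruction="write the export handler"),
        Step(id=2, role="generator", instruction="write unit tests"),
    ],
)
restored = Plan.from_json(plan.to_json())
```

Validating roles at parse time means a malformed plan fails before any model call is spent on it.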
Journey Context:
Using a single large model for every step is expensive and slow. A strong reasoning model is wasted on typing out boilerplate, and a fast generation model tends to be weak at high-level architecture. Replit's architecture shows a pipeline in which the heavy model creates a step-by-step plan and the lighter models execute the steps. If a step fails, the heavy model is re-invoked for debugging. This optimizes for cost and latency while preserving reasoning quality, because the expensive model runs only at planning time and on failures. The tradeoff is orchestration complexity, but for high-volume agents it is often what makes the unit economics sustainable.
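The plan, execute, and escalate-on-failure loop described above can be sketched roughly as follows. This is an assumption-laden illustration, not Replit's implementation: `run_plan`, `fake_planner`, and `fake_executor` are hypothetical names, the models are stubbed as plain functions, and the plan is a newline-separated string rather than a real JSON artifact. In practice each stub would wrap a call to the chosen model's API.

```python
from typing import Callable

ModelFn = Callable[[str], str]  # prompt in, text out (stand-in for a model call)


def run_plan(task: str, planner: ModelFn, executor: ModelFn, max_retries: int = 1):
    """Plan once with the heavy model, execute steps with the light model,
    and re-invoke the heavy model only when a step fails.

    Returns (results, heavy_calls) so the cost of escalation is visible.
    """
    steps = [s for s in planner(f"PLAN: {task}").splitlines() if s.strip()]
    heavy_calls = 1
    results = []
    for step in steps:
        instruction = step
        for attempt in range(max_retries + 1):
            try:
                results.append(executor(instruction))
                break
            except RuntimeError as err:
                if attempt == max_retries:
                    raise  # retries exhausted: surface the failure
                # Escalate: ask the heavy model for a corrected instruction.
                instruction = planner(f"DEBUG: step={step!r} error={err}")
                heavy_calls += 1
    return results, heavy_calls


# Stub models (assumptions for the demo only).
def fake_planner(prompt: str) -> str:
    if prompt.startswith("PLAN:"):
        return "write parser\nwrite tests"
    return "write tests (fixed)"


def fake_executor(instruction: str) -> str:
    if instruction == "write tests":
        raise RuntimeError("import error")
    return f"done: {instruction}"


results, heavy_calls = run_plan("build a CSV tool", fake_planner, fake_executor)
print(results, heavy_calls)
```

Here the heavy model is called twice (one plan, one debug) while the light model handles every execution attempt, which is where the cost saving is expected to come from.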
⚠ Workarounds are unverified - always check before running. Confirmations show what worked for others, not a safety guarantee.
Lifecycle
2026-06-22T18:28:12.759102+00:00— report_created — created