wenc 3 hours ago

I use Kiro IDE (≠ Kiro CLI) primarily as a spec generator.

In my experience, it's excellent for creating and iterating on specs. Tools like Cursor are optimized for human-driven vibing -- they have great autocomplete, etc. Kiro, by contrast, is optimized around the spec, which ironically has been the most effective approach I've found for driving agents.

I'd argue that Cursor, Antigravity, and similar tools are optimized for human steering, which explains their popularity, while Kiro is optimized for agent harnesses. That's also why it's underused: it's quite opinionated, but very effective. Vibe-coding culture isn't sold on spec-driven development (people think it's waterfall and summarily dismiss it; even Yegge has this bias), so Kiro tends to be underrated.

Kiro writes specs using structured requirement conventions like EARS (Easy Approach to Requirements Syntax) and INCOSE's requirements-writing rules. It performs automated reasoning to check for consistency, then generates a design document and task list from the spec, similar to what Beads does. I usually spend a significant amount of time pressure-testing the spec before implementing (often hours to days), and it pays off. Writing a good, consistent spec is essentially "writing as a tool of thought" applied to software.
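For anyone who hasn't seen EARS: each requirement follows one of a handful of keyword templates ("When <trigger>, the <system> shall <response>", and so on). Here's a rough sketch of what checking a requirement against those templates could look like. To be clear, this is my own toy illustration, not how Kiro does it; the names `EARS_PATTERNS` and `classify` are made up, and the regexes only cover the five basic patterns, not combined ones.

```python
import re
from typing import Optional

# The five basic EARS templates, keyed by their leading keyword.
# Hypothetical illustration only; not Kiro's internal representation.
EARS_PATTERNS = [
    ("unwanted-behaviour", re.compile(r"^if\b.+\bthen\b.+\bshall\b", re.I)),
    ("event-driven", re.compile(r"^when\b.+\bshall\b", re.I)),
    ("state-driven", re.compile(r"^while\b.+\bshall\b", re.I)),
    ("optional-feature", re.compile(r"^where\b.+\bshall\b", re.I)),
    ("ubiquitous", re.compile(r"^the\b.+\bshall\b", re.I)),
]


def classify(requirement: str) -> Optional[str]:
    """Return the name of the first EARS pattern the text fits, or None."""
    text = requirement.strip()
    for name, pattern in EARS_PATTERNS:
        if pattern.search(text):
            return name
    return None


if __name__ == "__main__":
    samples = [
        "When the user submits the form, the system shall validate all fields.",
        "If the token is expired, then the service shall return an error.",
        "The API shall log every request.",
        "Make it fast.",  # not in EARS form, so classify() returns None
    ]
    for s in samples:
        print(classify(s), "<-", s)
```

A requirement that comes back as `None` is the kind of thing worth rewriting before it goes anywhere near an agent.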

Once the spec is tight, implementation tends to follow it closely. Kiro also generates property-based tests (PBTs) using Hypothesis, a Python library inspired by Haskell's QuickCheck. These tests sweep the input domain and, when combined with traditional scenario-based unit tests, tend to produce code that adheres closely to the spec. I also add a one-line instruction, "do red/green TDD" (I learned this from Simon Willison), and that line alone improved the quality of all my tests.
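To make the PBT idea concrete, here's a minimal sketch of a property-based test sitting next to a scenario-based one. The function `dedupe_sorted` and its "spec" are hypothetical examples I made up, not something Kiro generated; the Hypothesis calls (`given`, `strategies.lists`, `strategies.integers`) are the real library API.

```python
from hypothesis import given, strategies as st


def dedupe_sorted(items: list[int]) -> list[int]:
    """Hypothetical function under test: unique items in ascending order."""
    return sorted(set(items))


# Property: the output is strictly increasing (sorted, no duplicates).
@given(st.lists(st.integers()))
def test_output_is_sorted_and_unique(items):
    result = dedupe_sorted(items)
    assert all(a < b for a, b in zip(result, result[1:]))


# Property: no element is lost and none is invented.
@given(st.lists(st.integers()))
def test_preserves_membership(items):
    assert set(dedupe_sorted(items)) == set(items)


# Property: applying the function twice changes nothing.
@given(st.lists(st.integers()))
def test_idempotent(items):
    once = dedupe_sorted(items)
    assert dedupe_sorted(once) == once


# Traditional scenario-based test for one known case.
def test_scenario_example():
    assert dedupe_sorted([3, 1, 3, 2]) == [1, 2, 3]
```

Run it with pytest. Hypothesis generates many input lists per property and shrinks any failure to a small counterexample, which is what "sweeping the input domain" means here. The scenario test pins one known case; the properties cover everything you didn't think to write down.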

Kiro can technically implement the task list itself, but this is where agents come in. With the spec in hand, I use multiple headless CLI agents in tmux (e.g., Kiro CLI, Claude Code) for implementation. The results have been very good. With a solid Kiro spec and task list, agents usually implement everything end-to-end without stopping, and I haven't found a need for Ralph loops. (Agents sometimes stop midway on Claude plans, but I've never had that happen with Kiro. I'm not sure why; maybe it's the checklist, which includes PBTs as gates.)

Kiro didn't have the strongest start, but the Kiro IDE is one of the best spec generators I've used, and it integrates extremely well with agent-driven workflows.