evanklem2004 4 hours ago
Built this as an opinionated Claude Code development flow, based on evidence-based practices and on what has been working for me while developing professional code.

EvanFlow is a single TDD-driven loop. Say "let's evanflow this" and it walks brainstorm → plan → execute → tdd → iterate → STOP. Real checkpoints at design and plan approval. Never auto-commits, never auto-stages, never proposes integration: every git op is your call.

The three things that actually changed how I work:

1. Vertical-slice TDD. One failing test → minimal impl → next test. Watch each test fail before writing the impl that passes it. (Sounds obvious. Almost no agent does it by default. ~62% of LLM-generated test assertions are wrong per HumanEval research, so discipline around the tests matters more than discipline around the impl.)

2. Embedded grilling at decision points. Before locking a plan: what breaks if a user does X? What's the rollback? What's explicitly out of scope? Catches design flaws while they're still cheap.

3. Iterate-until-clean (hard cap of 5 rounds). Re-read the diff, checking for dead code, naming, the deletion test, assertion correctness, and a Five Failure Modes pass (hallucinated actions, scope creep, cascading errors, context loss, tool misuse). For UI: screenshot via headless Chromium.

For bigger plans with 3+ independent units sharing types, it forks into a parallel coder/overseer orchestration. Integration tests at touchpoints ARE the cohesion contract.

Three install paths: Claude Code plugin marketplace, npx skills add, manual copy. MIT licensed.
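For readers who want the iterate-until-clean step in concrete terms, here is a rough Python sketch of a review loop with a hard cap of 5 rounds. It is not taken from the EvanFlow repo. The names `iterate_until_clean`, `toy_review`, and `toy_fix` are hypothetical, and the two toy checks only stand in for the real review passes (dead code, naming, assertion correctness, and so on).

```python
from typing import Callable, List, Tuple

MAX_ROUNDS = 5  # hard cap mentioned in the post


def iterate_until_clean(
    review: Callable[[str], List[str]],
    fix: Callable[[str, List[str]], str],
    diff: str,
    max_rounds: int = MAX_ROUNDS,
) -> Tuple[str, int, List[str]]:
    """Review and fix `diff` until a review comes back clean or the cap is hit.

    Returns (final_diff, rounds_used, remaining_findings).
    """
    for round_no in range(1, max_rounds + 1):
        findings = review(diff)
        if not findings:
            return diff, round_no, []  # clean: stop early
        diff = fix(diff, findings)
    # Cap reached: report whatever is still open instead of looping forever.
    return diff, max_rounds, review(diff)


def toy_review(diff: str) -> List[str]:
    """Hypothetical reviewer: flags leftover TODOs and one known dead helper."""
    return [marker for marker in ("TODO", "unused_helper") if marker in diff]


def toy_fix(diff: str, findings: List[str]) -> str:
    """Hypothetical fixer: drops every line that mentions the first finding."""
    target = findings[0]
    return "\n".join(line for line in diff.splitlines() if target not in line)


example = "x = 1  # TODO rename\ndef unused_helper(): pass\nreturn x"
print(iterate_until_clean(toy_review, toy_fix, example))
# prints ('return x', 3, [])
```

The cap matters because a reviewer that keeps finding something new would otherwise never terminate. When the cap is hit, the sketch returns the remaining findings so a human can decide what to do with them.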
dpark 31 minutes ago
I’ve thought of going down the TDD route for LLMs as a way of putting constraints on their behavior. I would think that “vertical slice” TDD would encourage the LLM to start tailoring the tests to the implementation rather than establishing the invariants up front, though. I was considering “horizontal” TDD to force the agent to write the constraints first and only then code to them.
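To make the two orderings concrete, here is a small Python sketch that only prints the order of steps each style implies. It assumes "horizontal" means writing every test before any implementation, which is one reading of the comment above. The function names and step labels are made up for illustration.

```python
from typing import List


def vertical_slice(behaviors: List[str]) -> List[str]:
    """One failing test, then its minimal implementation, then the next test."""
    steps: List[str] = []
    for behavior in behaviors:
        steps.append(f"write failing test: {behavior}")
        steps.append(f"implement: {behavior}")
    return steps


def horizontal(behaviors: List[str]) -> List[str]:
    """All tests (the invariants) first, then all implementations."""
    tests = [f"write failing test: {b}" for b in behaviors]
    impls = [f"implement: {b}" for b in behaviors]
    return tests + impls


behaviors = ["rejects empty input", "parses one item"]
print(vertical_slice(behaviors))
print(horizontal(behaviors))
```

In the vertical ordering, every test after the first is written with some implementation already visible, which is where the tailoring concern comes from. In the horizontal ordering, all tests exist before any implementation does.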
girvo 26 minutes ago
Please don’t post AI-generated comments :( Just write it yourself. I promise it’s worth it.
lukewrites 22 minutes ago
Curious: in the repo you mention

> Several rules come from 2025-2026 industry research on agentic coding failure modes

What are some of the papers you read?