akshaysg 4 hours ago

I think Haystack offers:

1. A centralized review mechanism for a team or org that operates on coding agent conversations in addition to diffs (and the codebase). It evaluates several variables (e.g. how sensitive the changes are, how much the author did to derisk them, and what the author's coding agent glossed over) and helps enforce your team's guidelines more so than just an individual's prompt

2. Adversarial review that operates in addition to other AI review agents (e.g. BugBot or Greptile) and filters comments down to only the things the author cares about. This helps cut down on the "AI reviewer battleground" in pull requests

3. A review interface that allows human reviewers to quickly understand what the author did to verify their changes and focus on the author's design decisions

We actually jury-rigged all of this together before building Haystack, but found that it doesn't scale to the team level (since every individual has their own ideas/opinions about what constitutes a human review).

We also found that reviewing purely through Claude Code/Codex was slow and difficult because things like author traces are not pre-processed, so you have to get your agent to specifically explore and understand them.