| ▲ | simonreiff 3 hours ago | |
Hey, an article right up my alley! AI infrastructure/tools engineer here (hic-ai.com); my flagship product, HIC Mouse, is a precision-editing system for coding agents designed to work across a wide array of models and harnesses. Mouse provides 11 tools exposed via MCP for read, find, and edit operations. It uses a coordinate-based schema (as well as exact and multiple string replacement); an agent-controlled Dialog Box with inspect/refine/save/cancel actions that forces staging and review of multi-operation or large edits before changes are written to disk; and extensive agent guidance mechanisms or guardrails to help the agent realize if it's about to do something potentially destructive or overly verbose. I definitely think models may be trained to use particular popular harnesses or to expect certain fields in the editing-tool or other tool schemas. Rather than trying to conform to (or force) one particular format, my approach is to design tools flexible enough to handle a wide array of possible inputs and tool calls, to auto-normalize results whenever reasonable to do so, and to help the agent recover whenever its tool calls truly can't be salvaged and have to return errors. It really does make a very dramatic difference (I wouldn't have bothered to launch if I thought it wasn't a meaningful advance), but anyway, just wanted to share my perspective given that I live and breathe this problem all day, every day.
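To make the "accept many input shapes, normalize when safe, explain when not" idea concrete, here is a minimal Python sketch. It is not Mouse's actual code or schema: the function `normalize_edit_call`, the `Result` type, and every field alias below are hypothetical names chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical alias table. These spellings are illustrative guesses at what
# different models might send; they are NOT taken from any real tool schema.
ALIASES = {
    "path": ("path", "file_path", "filePath", "filename"),
    "old": ("old", "old_string", "old_str", "oldText"),
    "new": ("new", "new_string", "new_str", "newText"),
}


@dataclass
class Result:
    ok: bool
    args: dict = field(default_factory=dict)
    error: str = ""


def normalize_edit_call(raw: dict) -> Result:
    """Map loosely named arguments onto canonical keys, or explain the failure."""
    args = {}
    for canonical, names in ALIASES.items():
        present = [n for n in names if n in raw]
        # Two aliases with different values cannot be salvaged safely.
        if any(raw[n] != raw[present[0]] for n in present[1:]):
            return Result(
                False,
                error=f"Conflicting values for '{canonical}' in {present}. "
                      f"Send only '{canonical}'.",
            )
        if present:
            args[canonical] = raw[present[0]]

    missing = [c for c in ALIASES if c not in args]
    if missing:
        # The error names what was expected and what arrived, so the agent
        # can repair the call instead of guessing and retrying.
        return Result(
            False,
            error=f"Missing required field(s) {missing}. "
                  f"Expected keys: {sorted(ALIASES)}; got: {sorted(raw)}.",
        )
    return Result(True, args=args)
```

A call such as `normalize_edit_call({"file_path": "a.py", "old_str": "x", "newText": "y"})` is accepted and mapped to the canonical `path`/`old`/`new` keys, while a call missing a field comes back with an error that lists the expected keys.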
| ▲ | lubujackson 3 hours ago | parent [-] | |
Very cool tool. As the "moar tokens" era starts to wind down, I think people are going to realize just how crappy these harnesses really are, especially Claude Code. I have gone back and forth between Claude and Cursor, and it is clear Claude just throws the kitchen sink at problems to get an edge. I write MCP tools and I see these exact problems: when the inputs and outputs aren't clearly defined, the LLM just guesses and retries.