SwellJoe a day ago

Anything "for agents" needs to provide some kind of evidence it's better than what the agents already have baked into the model training data. It can't just be "easier" on some dimension, because the model has already learned the hard parts of the old thing and models can't make new memories to learn new things, so there is always a context cost for the new thing.

Models know git because there's a monstrous amount of git in their training data. Models never heard of a new thing "for agents", so you have to teach them to use it via skills and docs. Models can, of course, follow documentation, so there's nothing stopping them from using the new thing...but, the new thing "for agents" starts the race well behind the known thing that was built for humans a decade or two ago and has huge amounts of training data baked into every model.

I'm not saying nobody should make new things (an accusation I've gotten when saying something similar about a previous "for agents" thing), of course people should make new things. I'm saying that when I see "for agents", I think, "prove it".

Agents don't have trouble with git, so there's gotta be some kind of pain point about using git with agents that I'm unaware of that this solves somehow (but isn't expressed on the page) or this isn't actually for agents, it's just a project someone wanted to do (and that's also fine!). But, if the latter, "for agents" is merely marketing and I'm not interested.
atombender a day ago

I'm not sure I understand this argument. I create new tools all the time as part of my development work, and I have skills stored that tell agents how to use them. They use them flawlessly. When I say "benchmark the query engine using the foobar dataset and compare it to run 431", the agents go and run my special benchmark tool and use the different subcommands to compare results and so on.

I'm sure a new VCS would be a little less smooth sailing, but not by much.
skissane a day ago

> Models know git because there's a monstrous amount of git in their training data. Models never heard of a new thing "for agents", so you have to teach them to use it via skills and docs.

Another option: when the model invokes the standard tool, rewrite the invocation to the newfangled tool. There are a bunch of ways of doing it:

(a) Invocation of the standard tool returns an error saying to use the newfangled tool instead

(b) Invocation of the standard tool returns a message saying it has been dynamically rewritten to invoke the newfangled tool, followed by the newfangled tool's output

(c) Invocation of the standard tool in context is dynamically rewritten to an invocation of the newfangled tool, prior to execution

In case (c), the model ends up thinking it somehow knew about this new thing all along, even though it actually didn’t
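The three variants skissane lists can be sketched as a small dispatch layer sitting between the model and the shell. This is only an illustration under assumptions: `newvcs` is a made-up tool name, `handle`, `Result`, and `fake_run` are hypothetical, and the sketch assumes the new tool accepts the same arguments as the old one, which a real shim could not take for granted.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical mapping: "newvcs" is a made-up name, not a real tool.
REWRITES = {"git": "newvcs"}


@dataclass
class Result:
    recorded: List[str]  # command as it appears in the model's context
    executed: List[str]  # command actually run ([] if nothing ran)
    output: str          # text handed back to the model


def handle(argv: List[str], mode: str, run: Callable[[List[str]], str]) -> Result:
    """Apply strategy (a), (b), or (c) to one non-empty tool invocation."""
    if argv[0] not in REWRITES:
        return Result(argv, argv, run(argv))  # untouched tool: pass through
    new_argv = [REWRITES[argv[0]]] + argv[1:]
    old_cmd, new_cmd = " ".join(argv), " ".join(new_argv)
    if mode == "a":  # refuse, and tell the model what to call instead
        return Result(argv, [], f"error: use `{new_cmd}` instead of `{old_cmd}`")
    if mode == "b":  # run the new tool, but say that a rewrite happened
        note = f"note: `{old_cmd}` was rewritten to `{new_cmd}`\n"
        return Result(argv, new_argv, note + run(new_argv))
    if mode == "c":  # rewrite the transcript too; the model never sees the old call
        return Result(new_argv, new_argv, run(new_argv))
    raise ValueError(f"unknown mode: {mode!r}")


def fake_run(argv: List[str]) -> str:
    """Stand-in executor so the sketch runs without any real tool installed."""
    return "ran " + " ".join(argv)
```

The difference between (b) and (c) is entirely in `recorded`: in (b) the model's context still shows the old command plus a notice, while in (c) the context itself is edited, which is why the model "ends up thinking it knew" the new tool.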
PashaGo 11 hours ago

Totally agree. I used to work with a team that built a project for creating ontologies of Git repositories. The goal was to help LLMs onboard faster and navigate the repo better. In the end, it became heavy overengineering: people no longer understood not only the repo itself, but also the extra layer describing it. Meanwhile, coding assistants are already quite good at reading codebases directly.
gb2d_hn 19 hours ago

Git has worktrees, which provide a means of creating branch-linked physical working directories. I built in UI assistance for creating worktrees associated with the agent session in https://www.agentkanban.io (an agent-integrated kanban board for use with Copilot / Claude and VS Code).

I agree, I would rather try and make use of a tool that the agent is already familiar with, unless it's missing features that the agent needs to achieve its goal (and git is not missing any).
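A per-session worktree of the kind described here can be set up with plain git. The helper below only builds the command; the directory layout, the branch naming scheme, and `worktree_command` itself are assumptions for illustration, not how agentkanban.io does it.

```python
from pathlib import Path
from typing import List


def worktree_command(repo: Path, session_id: str, base: str = "HEAD") -> List[str]:
    """Build a `git worktree add` command for one agent session (naming is hypothetical)."""
    branch = f"agent/{session_id}"
    # Keep worktrees beside the repo so they never nest inside the main checkout.
    path = repo.parent / f"{repo.name}-worktrees" / session_id
    # -C runs git in `repo`; -b creates the new branch and checks it out in `path`.
    return ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base]
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`, and `git worktree remove <path>` cleans up once the session ends.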
bottlero_cket 10 hours ago

> It dramatically improves the speed and context your agents need when working on serious projects: 50% fewer VCS-related tokens and 90% faster per operation.

Sounds like a good optimization to me. VCS is a waste of tokens for sure. I’m intrigued to hear more.
codesnik 20 hours ago

Yep. Claude keeps "habitually" trying to use `rg -rn` instead of `rg -n`, because Anthropic instructed it to use `rg` instead of `grep` but it still reaches for grep's arguments: `grep -rn`. My instructions and "memory" are not helping. "Oh, I did it again, and you've instructed me not to". Older tools are better for current "agents".
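The failure codesnik describes has a concrete cause: `rg` already searches recursively, and its `-r` flag means `--replace`, so `rg -rn pattern` treats `n` as replacement text instead of turning on line numbers. One way to absorb the habit outside the model, in the spirit of skissane's option (c), is to normalize the arguments before execution. The sketch below assumes the agent never intends `--replace`; `fix_rg_args` is a hypothetical helper, not part of ripgrep or any agent framework.

```python
from typing import List

# Valueless short flags that mean the same thing in grep and rg.
SHARED_FLAGS = set("nilwcv")
# grep's recursion flags; rg recurses by default and uses -r for --replace.
RECURSE_FLAGS = set("rR")


def fix_rg_args(argv: List[str]) -> List[str]:
    """Drop grep-style -r/-R from short-flag clusters in an rg invocation."""
    if not argv or argv[0] != "rg":
        return list(argv)
    fixed = ["rg"]
    rest = argv[1:]
    for i, token in enumerate(rest):
        if token == "--":  # everything after -- is a pattern or path
            fixed.extend(rest[i:])
            break
        letters = set(token[1:])
        is_cluster = (
            token.startswith("-")
            and not token.startswith("--")
            and letters <= SHARED_FLAGS | RECURSE_FLAGS
            and bool(letters & RECURSE_FLAGS)
        )
        if not is_cluster:
            fixed.append(token)
            continue
        kept = "".join(ch for ch in token[1:] if ch not in RECURSE_FLAGS)
        if kept:  # "-rn" becomes "-n"; a bare "-r" disappears entirely
            fixed.append("-" + kept)
    return fixed
```

For example, `fix_rg_args(["rg", "-rn", "foo", "src"])` yields `["rg", "-n", "foo", "src"]`. The cost of this design is that a deliberate `rg -r REPLACEMENT` would be mangled, which is why the assumption above matters.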
hansarnold 18 hours ago

I couldn't agree more.
mrmrs a day ago

Totally correct on the burden of proof here. Agents DO know git extremely well. There’s a huge amount of git in model training data, and anything new starts behind because you have to teach the model what it is, what commands to run, and where the sharp edges are.

For us “for agents” does not mean “new syntax that we hope agents can read docs for.” The thing we’re trying to optimize is not whether an agent can remember the command. It’s the runtime shape of agent-driven development.

When an agent drives a VCS through a captured terminal, things that are tolerable for humans become direct costs: clone/setup time, worktree setup, full status output, huge diffs, branch cleanup, interactive prompts, shared-checkout mutation, repeated preflight checks. Those costs show up as wall time, bytes over the wire, transcript tokens, and recovery steps.

So the Oak bet is narrower than “agents can’t use git.” They can. The bet is that if you assume branch-per-agent workflows, lots of parallel sandboxes, large repos, and non-interactive command execution, the VCS interface should have different defaults if you want to optimize for shipping speed and efficiency of token usage. If you're already going fast enough and not running out of tokens, then using Oak seems pretty silly.

People do not need to ditch git to try Oak out. One workflow we care about is letting agents work in Oak where the agent-specific costs matter, then exporting back to git for the human review, CI, release, or compliance workflows.

Totally agree this should be provable and benchmarked. The homepage has Oak vs Git numbers because we do not want “for agents” to just be vibes. We’re measuring transcript bytes, estimated tokens, tool calls, wall time, large diff/status behavior, and contention in agent-style workflows.

We’re also working on the benchmarks repo in the open: https://oak.space/oak/benchmarks

The exciting part to me is that we can already improve on tokens and timing despite starting with the model-prior deficit you’re describing. If we can win on measured agent workflows while git still has the advantage of being deeply baked into the models, I’m incredibly bullish on where Oak can get to as the tool and the ecosystem mature.

Longer term, if Oak proves useful and sticks around, future frontier models will likely have more Oak examples in training data, which lowers the upfront learning tax for an extra boost.
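Several of the metrics mrmrs lists (transcript bytes, estimated tokens, wall time) are easy to collect for any command, which makes claims like these checkable by outsiders. A minimal sketch follows; it is not Oak's benchmark harness, `measure` and `Cost` are hypothetical names, and the four-bytes-per-token figure is a rough rule of thumb, not a real tokenizer.

```python
import subprocess
import sys
import time
from dataclasses import dataclass
from typing import List

BYTES_PER_TOKEN = 4  # crude assumption; a real benchmark would use the model's tokenizer


@dataclass
class Cost:
    wall_seconds: float
    transcript_bytes: int
    estimated_tokens: int


def measure(argv: List[str]) -> Cost:
    """Run one command non-interactively and record what it costs an agent."""
    start = time.perf_counter()
    # stdin is closed so an interactive prompt fails fast instead of hanging.
    proc = subprocess.run(argv, capture_output=True, stdin=subprocess.DEVNULL)
    elapsed = time.perf_counter() - start
    # Both streams end up in the agent's transcript, so count both.
    size = len(proc.stdout) + len(proc.stderr)
    return Cost(elapsed, size, size // BYTES_PER_TOKEN)
```

Calling `measure(["git", "status"])` inside a repository, and the equivalent command for another tool, gives a like-for-like comparison of a single operation; tool-call counts and contention would need a harness that drives a whole agent session.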