faangguyindia 5 days ago

>Gemini CLI often makes incorrect edits and gets confused

Gemini CLI still uses an archaic whole-file format for edits, so it's not a good representative of the current state of coding agents.

lifthrasiir 4 days ago | parent | next [-]

I'm not sure what you mean by "whole file format", but if it refers to the write_file tool that overwrites the whole file, there is also the replace tool, which is apparently inspired by a blog post [1] by Anthropic. It seems that Claude Code also supports a roughly identical tool (inferred from error messages), so editing tools can't be the reason why Claude Code is good.

[1] https://www.anthropic.com/engineering/swe-bench-sonnet

faangguyindia 4 days ago | parent [-]

Many agents can send diffs. Whole file reading and writing burns tokens and pollutes context.
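To put a rough number on that, here is a minimal sketch comparing the payload of a whole-file rewrite with a unified diff for the same one-line change. The helper name `edit_payload_sizes` and the 200-line sample file are made up for illustration, and character counts are only a crude stand-in for tokens; no particular agent is claimed to work this way.

```python
import difflib


def edit_payload_sizes(original, updated):
    """Return (whole_file_chars, unified_diff_chars) for the same edit.

    Hypothetical helper: character counts are only a rough proxy for tokens.
    """
    diff = "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            updated.splitlines(keepends=True),
            fromfile="a/file",
            tofile="b/file",
        )
    )
    return len(updated), len(diff)


# A 200-line sample file with a single changed line.
original = "".join(f"line {i}\n" for i in range(1, 201))
updated = original.replace("line 100\n", "line one hundred\n")

whole, diff = edit_payload_sizes(original, updated)
print(f"whole file: {whole} chars, unified diff: {diff} chars")
```

The gap grows with file size, since the diff only carries the changed line plus a few lines of context.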

lifthrasiir 4 days ago | parent [-]

The replace tool is a form of diff (although a rudimentary one), and the read_file tool can be called with line ranges. I do wish for more robust patching, but it is not "whole" file reading/writing. Maybe you meant subagent file handling? Then I can agree.
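For anyone following along, here is a minimal sketch of what a replace-style edit and a ranged read could look like. The names `replace_in_file` and `read_lines` are hypothetical, and the "exactly one match" rule is an assumption for illustration; this is not the actual implementation in Gemini CLI or Claude Code.

```python
from pathlib import Path


def read_lines(path, start, end):
    """Return lines start..end (1-indexed, inclusive) instead of the whole file."""
    lines = Path(path).read_text().splitlines()
    return "\n".join(lines[start - 1:end])


def replace_in_file(path, old, new):
    """Swap exactly one occurrence of `old` for `new`.

    Assumed behavior: refuse the edit when `old` is missing or ambiguous,
    so the caller has to supply enough surrounding context to be unique.
    """
    text = Path(path).read_text()
    count = text.count(old)
    if count != 1:
        raise ValueError(f"expected exactly 1 match for old string, found {count}")
    Path(path).write_text(text.replace(old, new, 1))
```

The model only has to send `old` and `new`, not the whole file, which is why this counts as a (rudimentary) diff. The weak spot is that the match has to be exact, so any whitespace drift in `old` makes the edit fail.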

(Also, I think Gemini is significantly better when it comes to context rot; in my experience it took 100K to 300K tokens for symptoms to appear. So burning tokens is less of a problem with Gemini.)

cryptoz 5 days ago | parent | prev [-]

Oh that's wild, I did suspect that but didn't know it outright. Mind-blowing Google would release that kind of thing, I had wondered why it sucked so much haha. Okay so what is a good representation of the current state of coding agents? Which one should I try that does a better job at code modifications?

NitpickLawyer 5 days ago | parent | next [-]

Claude Code is the strongest atm, but roocode or cline (VS Code extensions) can also work well. Roo with gpt5-mini (so cheap, pretty fast) does diff-based edits w/ good coordination over a task, and finishes most tasks that I tried. It even calls them "surgical diffs" :D

mrugge 5 days ago | parent | prev [-]

claude code (with max subscription), cursor-agent (with usage based pricing)