BrokenCogs 5 days ago

How good is Gemini CLI compared to Claude Code and OpenAI Codex?

Frannky 5 days ago | parent | next [-]

I started with Claude Code, realized it was costing too much per message, then switched to Gemini CLI, then Qwen. Claude Code is probably better, but I don't need it since I can solve my problems without it.

distances 5 days ago | parent | next [-]

I've found the regular Claude Pro subscription quite enough for coding tasks when you have a bunch of other things to do anyway, like code reviews, and won't spend the whole day running it.

luxuryballs 5 days ago | parent | prev | next [-]

Yeah, I was using OpenRouter for Claude Code and burned through $30 in credits on things that would have cost maybe $1.50 if I had just used the OpenRouter chat. I decided it was better for now to do the extra “secretary work” of manual entry, context management of the chat, and the pain of attaching files. It was pretty disappointing, because at first I had assumed the price would not be much different at all.

koreth1 5 days ago | parent [-]

This is an interesting way to look at it because you can kind of quantify the tradeoff in terms of the value of your time. A simple analysis would be something like, if you value your time at $60/hour, then spending an additional $30 in credits becomes a good choice if it saves you more than a half-hour of work.
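
A minimal sketch of that break-even arithmetic, using the example numbers above ($60/hour and $30 in credits are just illustrative):

    hourly_rate = 60.0       # value of your time, $/hour (example figure)
    extra_cost = 30.0        # additional credits spent, $
    break_even_hours = extra_cost / hourly_rate
    print(break_even_hours)  # 0.5 -> worth it if it saves more than half an hour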

cmrdporcupine 5 days ago | parent | prev [-]

Try what I've done: use the Claude Code tool but point your ANTHROPIC_BASE_URL at DeepSeek's API. It's like 1/10th the cost, and about 2/3rds the intelligence.

Sometimes I can't really tell.
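
A minimal sketch of that setup, assuming DeepSeek's Anthropic-compatible endpoint is at https://api.deepseek.com/anthropic and that Claude Code honors the ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN environment variables (check both projects' docs for the current names):

    # Launch Claude Code against DeepSeek's Anthropic-compatible endpoint.
    # The endpoint URL and env var names are assumptions; verify against the docs.
    import os
    import subprocess

    env = os.environ.copy()
    env["ANTHROPIC_BASE_URL"] = "https://api.deepseek.com/anthropic"  # assumed endpoint
    env["ANTHROPIC_AUTH_TOKEN"] = "sk-..."  # your DeepSeek API key

    subprocess.run(["claude"], env=env)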

ewoodrich 5 days ago | parent | next [-]

Also, there's a fixed-price Claude Code plan for GLM 4.6 from z.ai; I pay for the cheapest tier ($6/mo) as an alternate/fallback to Claude Code and Codex. I've been surprised by how similar in capabilities all three of them are, definitely not seeing a big CLI agent moat...

GLM is maybe slightly weaker on average, but on the other hand it has also solved problems where both CC and Codex got stuck in endless failure loops, so for the price it's nice to have in my back pocket. I also occasionally see tool-use failures that it always works around, which I'm guessing are due to slight differences from Claude.

https://z.ai/subscribe

behnamoh 5 days ago | parent | prev | next [-]

I use this to proxy ANTHROPIC_BASE_URL to other models: https://github.com/ujisati/claude-code-provider-proxy

Unfortunately it doesn't support local models, but they're too slow for coding anyway.

cmrdporcupine 5 days ago | parent [-]

I've used that too, but in DeepSeek's case they provide an Anthropic-compatible API endpoint, so you don't have to.

jen729w 5 days ago | parent | prev [-]

Or a 3rd party service like https://synthetic.new, of which I am an unaffiliated user.

cmrdporcupine 5 days ago | parent | next [-]

So, DeepSeek 3.1 from their own platform:

Input: $0.28 / 1M tokens (cache miss)
Output: $0.42 / 1M tokens

Via synthetic (which otherwise looks cool):

Input: $0.56 / 1M tokens
Output: $1.68 / 1M tokens

So roughly 2-3x better value through https://platform.deepseek.com

(Granted, Synthetic gives you way more models to choose from, including ones that don't parrot CPC/PLA propaganda or censor.)
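
A quick sanity check of that ratio, using only the per-million-token prices quoted above (the blended figure depends on your input/output mix):

    # Price ratios, $/1M tokens, from the figures quoted above.
    deepseek = {"input": 0.28, "output": 0.42}   # platform.deepseek.com (cache miss)
    synthetic = {"input": 0.56, "output": 1.68}  # via synthetic.new
    for direction in deepseek:
        print(direction, synthetic[direction] / deepseek[direction])
    # input 2.0, output 4.0 -> roughly 2-3x blended for input-heavy coding use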

ojosilva 5 days ago | parent | prev [-]

I'm using CC + GLM 4.6 from Synthetic and the results are top notch for $60/mo; speed is fast and the servers are closer to home than z.ai's.

nl 5 days ago | parent | prev | next [-]

Not great.

It's ok for documentation or small tasks, but consistently fails at tasks that both Claude and Codex succeed at.

wdfx 5 days ago | parent | prev [-]

Gemini and its tooling are absolute shit. The LLM itself is barely usable and needs so much supervision you might as well do the work yourself. Then couple that with an awful CLI and VS Code interface and you'll find it's just a complete waste of time.

Compared to the Anthropic offering, it's night and day. Claude gets on with the job and makes me way more productive.

Frannky 5 days ago | parent | next [-]

It's probably a mix of what you're working on and how you're using the tool. If you can't get it done for free or cheaply, it makes sense to pay. I first design the architecture in my mind, then use Grok 4 Fast (free) for single-shot generation of the main files. This forces me to think first and to read the generated code to double-check. Then the CLI is mostly for editing, clerical work, testing, etc. That said, I do try to avoid coding altogether if the CLI + MCP servers + MD files can solve the problem.

SamInTheShell 5 days ago | parent | prev [-]

> Gemini and its tooling are absolute shit.

Which model were you using? In my experience Gemini 2.5 Pro is just as good as Claude Sonnet 4 and 4.5. It's literally what I use as a fallback to wrap something up if I hit the 5-hour limit on Claude and want to push past some incomplete work.

I'm just going to throw this out there: I get good results from a truly trash model like gpt-oss-20b (quantized at 4 bits). The reason I can literally use this model is that I know my shit and have spent time learning how much instruction each model I use needs.

Would be curious what you're actually having issues with if you're willing to share.

sega_sai 5 days ago | parent | next [-]

I share the same opinion on Gemini CLI. Other than for the simplest tasks it is just not usable: it gets stuck in loops, ignores instructions, and fails to edit files. Plus it has plenty of bugs in the CLI that you occasionally hit. I wish I could use it rather than pay an extra subscription for Claude Code, but it is just in a different league (at least as recently as a couple of weeks ago).

SamInTheShell 5 days ago | parent [-]

Which model are you using though? When I run out of Gemini 2.5 Pro and it falls back to the Flash model, the Flash model is absolute trash for sure. I have to prompt it like I do local models. Gemini 2.5 Pro has shown me good results though. Nothing like "ignores instructions" has really occurred for me with the Pro model.

sega_sai 5 days ago | parent [-]

I get that even with 2.5 Pro.

SamInTheShell 5 days ago | parent [-]

That's weird. I can prompt 2.5 Pro and Claude Sonnet 4.5 about the same for most TypeScript problems and they end up doing about the same. I get different results with Terraform, though; I think Gemini 2.5 Pro does better on some Google Cloud stuff, but only on the specifics.

It's just strange to me that my experience seems to be the polar opposite of yours.

sega_sai 5 days ago | parent [-]

I don't know. The last problem I tried was a complex one -- migration of some scientific code from CPU to GPU. Gemini was useless there, but Claude proposed realistic solutions and was able to explore and test those.

nl 5 days ago | parent | prev [-]

I think you must be using it quite differently to me.

I can one-shot new webapps in Claude and Codex and can't in Gemini Pro.

SamInTheShell 5 days ago | parent [-]

The type of stuff I tend to do is much more complex than a simple website. I really can't rely on AI as heavily for stuff that I really enjoy tinkering with. There's just not enough data for them to train on to truly solve distributed system problems.