Remix.run Logo
the_duke 4 hours ago

Gemini 3 is pretty good, even Flash is very smart for certain things, and fast!

BUT it is not good at all at tool calling and agentic workflows, especially compared to the recent two mini-generations of models (Codex 5.2/5.3, the last two versions of Anthropic models), and also fell behind a bit in reasoning.

I hope they manage to improve things on that front, because then Flash would be great for many tasks.

chermi 4 hours ago | parent | next [-]

You can really notice the tool use problems. They gotta get on that. The agent trend seems real, and powerful. They can't afford to fall behind on it.

HardCodedBias 27 minutes ago | parent | next [-]

"They can't afford to fall behind on it."

They are very, very seriously far behind as of 3.0.

We'll see if 3.1 addresses the issue at all.

verdverm 4 hours ago | parent | prev [-]

I don't really have tool usage issues that I don't put under that doesn't follow system prompt instructions consistently

there are these times where it puts a prefix on all function calls, which is weird and I think hallucination, so maybe that one

3.1 hopefully fixes that

verdverm 4 hours ago | parent | prev | next [-]

These improvements are one of the things specifically called out on the submitted page

anthonypasq 4 hours ago | parent | prev | next [-]

yeah, it seems to me like Gemini is a little behind on the current RL patterns and also they dont seem interested in really creating a dedicated coding model. I think they have so much product surface (search, AI mode, gmail, youtube, chrome etc), they are prioritizing making the model very general. but who knows im just talking out of my ass.

spwa4 4 hours ago | parent | prev [-]

In other words: they just need to motivate their employees while giving in to finance's demands to fire a few thousand every month or so ...

And don't forget, it's not just direct motivation. You can make yourself indispensable by sabotaging or at least not contributing to your colleagues' efforts. Not helping anyone, by the way, is exactly what your managers want you to do. They will decide what happens, thank you very much, and doing anything outside of your org ... well there's a name for that, isn't there? Betrayal, or perhaps death penalty.