davidmckayv 8 hours ago

This really captures something I've been experiencing with Gemini lately. The models are genuinely capable when they work properly, but there's this persistent truncation issue that makes them unreliable in practice.

I've been running into it consistently: responses that just stop mid-sentence, not because of token limits or content filters, but because of what appears to be a bug in how the model signals completion. It's been documented on their GitHub and dev forums for months as a P2 issue.

The frustrating part is that when you compare a complete Gemini response to Claude or GPT-4, the quality is often quite good. But reliability matters more than peak performance. I'd rather work with a model that consistently delivers complete (if slightly less brilliant) responses than one that gives me half-thoughts I have to constantly prompt to continue.

It's a shame because Google clearly has the underlying tech. But until they fix these basic conversation flow issues, Gemini will keep feeling broken compared to the competition, regardless of how it performs on benchmarks.

https://github.com/googleapis/js-genai/issues/707

https://discuss.ai.google.dev/t/gemini-2-5-pro-incomplete-re...

nico an hour ago | parent | next [-]

Another issue: Gemini can’t do tool calling and (forced) json output at the same time

If you want to use application/json as the specified output in the request, you can’t use tools

So if you need both, you either hope it gives you correct JSON when using tools (which it often doesn't), or you have to do two requests: one for the tool calling, another for formatting

At least, even if annoying, this issue is pretty straightforward to get around
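The two-request workaround can be sketched like this. This is an illustration, not real client code: `call_gemini` is a stub standing in for whatever SDK call you use, and the canned return values are made up; the only real constraint it models is that `tools` and `response_mime_type="application/json"` can't go in the same request:

```python
import json

def call_gemini(prompt, tools=None, response_mime_type=None):
    # Stub standing in for a real Gemini API call. The real constraint:
    # a request may set tools OR response_mime_type="application/json",
    # never both.
    assert not (tools and response_mime_type == "application/json")
    if tools:
        # Tool-calling responses often come back as almost-JSON text.
        return "result: {'temp': 21}"   # not valid JSON
    # A forced-JSON request reliably returns valid JSON.
    return '{"temp": 21}'

def tool_result_as_json(prompt, tools):
    raw = call_gemini(prompt, tools=tools)   # request 1: tool calling
    try:
        return json.loads(raw)               # hope it's already valid JSON
    except json.JSONDecodeError:
        fixed = call_gemini(                 # request 2: formatting only,
            "Reformat the following as JSON:\n" + raw,  # no tools attached
            response_mime_type="application/json",
        )
        return json.loads(fixed)

print(tool_result_as_json("weather in Paris?", tools=["get_weather"]))
# → {'temp': 21}
```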

golfer 7 hours ago | parent | prev | next [-]

Unfortunately Gemini isn't the only culprit here. I've had major problems with ChatGPT reliability myself.

SilverElfin 5 hours ago | parent | next [-]

I think what I am seeing from ChatGPT is highly varying performance. I think this must be something they are doing to manage limitations of compute or costs. With Gemini, I think what I see is slightly different - more like a lower “peak capability” than ChatGPT’s “peak capability”.

Fade_Dance 32 minutes ago | parent [-]

I'm fairly sure there's some sort of dynamic load balancing at work. I read an anecdote from someone who had a test where they asked it to draw a little image (something like an ASCII cat, but probably not exactly that since it seems a bit basic), and if the result came back poor they didn't bother using it until a different time of day.

Of course it could all be placebo, but when you intuitively think about it, somewhere on the road to the hundreds of billions in datacenter capex, one would think that there will be periods where compute and demand are out of sync. It's also perfectly understandable why now would be a time to be seeing that.

mguerville 6 hours ago | parent | prev [-]

I only hit that problem in voice mode, it'll just stop halfway and restart. It's a jarring reminder of its lack of "real" intelligence

patrickmcnamara 5 hours ago | parent | next [-]

I've heard a lot that voice mode uses a faster (and worse) model than regular ChatGPT. So I think this makes sense. But I haven't seen this in any official documentation.

Narciss 5 hours ago | parent | prev [-]

This is more because of VAD - voice activity detection

drgoogle 23 minutes ago | parent | prev | next [-]

> I've been running into it consistently, responses that just stop mid-sentence

I’ve seen that behavior when LLMs of any make or model aren’t given enough time or allowed enough tokens.
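One way to rule out that cause is to check the finish reason the API reports on each candidate. A minimal sketch: the `finishReason` values `STOP` and `MAX_TOKENS` match the Gemini REST API, but the response dicts below are stubbed examples, not real API output:

```python
# Distinguish a genuine completion from a token-limit cutoff by
# inspecting the finishReason field the Gemini REST API returns
# on each candidate.

def was_truncated_by_limit(candidate: dict) -> bool:
    """True if the model stopped because it hit maxOutputTokens."""
    return candidate.get("finishReason") == "MAX_TOKENS"

# A normal completion reports STOP:
complete = {"finishReason": "STOP",
            "content": {"parts": [{"text": "Done."}]}}
# A capped response reports MAX_TOKENS:
capped = {"finishReason": "MAX_TOKENS",
          "content": {"parts": [{"text": "Unfinished sen"}]}}

print(was_truncated_by_limit(complete))  # → False
print(was_truncated_by_limit(capped))    # → True
```

If the finish reason is STOP and the text still ends mid-sentence, that points back at the completion-signaling bug rather than a token budget.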

driese 5 hours ago | parent | prev | next [-]

Small things like this or the fact that AI studio still has issues with simple scrolling confuse me. How does such a brilliant tool still lack such basic things?

normie3000 5 hours ago | parent | next [-]

I see Gemini web frequently break its own syntax highlighting.

brap 4 hours ago | parent | prev [-]

The scrolling in AI Studio is an absolute nightmare and somehow they managed to make it worse.

It’s so annoying that you have this super capable model but you interact with it using an app that is complete ass

dorianmariecom 8 hours ago | parent | prev | next [-]

chatgpt also has lots of reliability issues

diego_sandoval 7 hours ago | parent [-]

If anyone from OpenAI is reading this, I have two complaints:

1. Using the "Projects" thing (Folder organization) makes my browser tab (on Firefox) become unusably slow after a while. I'm basically forced to use the default chats organization, even though I would like to organize my chats in folders.

2. After editing a message that you already sent, you get to select between the different branches of the chat (1/2, and so on), which is cool. But when ChatGPT fails to generate a response in this "branched conversation" context, it will continue failing forever. When your conversation is a single thread and a ChatGPT message fails with an error, retrying usually works and the chat continues normally.

porridgeraisin 6 hours ago | parent | next [-]

And 3)

On mobile (Android), opening the keyboard scrolls the chat to the bottom! I sometimes want to type while referring to something from the middle of the LLM's last answer.

Sabinus 4 hours ago | parent [-]

Projects should have their own memory system. Perhaps something more interactive than the existing Memories but projects need their own data (definitions, facts, draft documents) that is iterated on and referred to per project. Attached documents aren't it, the AI needs to be able to update the data over multiple chats.

zarmin 7 hours ago | parent | prev [-]

It would also be nice if ChatGPT could move chats between projects. My sidebar is a nightmare.

throwaway240403 6 hours ago | parent [-]

You can drag and drop chats between projects

zarmin 3 hours ago | parent [-]

i know. i want the assistant to do it. shouldn't it be able to do work on its own platform?

simlevesque 7 hours ago | parent | prev | next [-]

The latest comment on that issue is someone saying there's a fix available for you to try.

m101 6 hours ago | parent | prev | next [-]

I wonder if this is because a memory cap was reached at that output token. Perhaps they route conversations to different hardware depending on how long they expect it to be.

tanvach 5 hours ago | parent | prev | next [-]

Yes, agreed, it was totally broken when I tested the API two months ago. Lots of failed connections and very slow response times. Hoping the update fixes these issues.

mattmanser 7 hours ago | parent | prev | next [-]

That used to happen a lot in ChatGPT too.

reissbaker 5 hours ago | parent | prev [-]

FWIW, I think GLM-4.5 or Kimi K2 0905 fit the bill pretty well in terms of being complete and consistent.

(Disclosure: I'm the founder of Synthetic.new, a company that runs open-source LLMs for monthly subscriptions.)

noname120 5 hours ago | parent [-]

That’s not a “disclosure”, that’s an ad.