snarf21 a day ago

I'm not sure. I asked one about a potential bug in iOS 26 yesterday and it told me that iOS 26 does not exist and that I must have meant iOS 16. iOS 26 was announced last June and has been live since September. Of course, I responded that 26 is the current iOS version and got the obligatory meme of "Of course, you are right! ramble ramble ramble...."

amluto a day ago | parent | next [-]

Was this a GPT model? OpenAI seems to have developed an almost-acknowledged inability to usefully pre-train a model after mid-2024. The recent GPT versions are impressively lacking in newer knowledge.

The most amusing example I’ve seen was asking the web version of GPT-5.1 to help with an installation issue with the Codex CLI (I’m not an npm user so I’m unfamiliar with the intricacies of npm install, and Codex isn’t really an npm package, so the whole use of npm is rather odd). GPT-5.1 cheerfully told me that OpenAI had discontinued Codex and hallucinated a different, nonexistent program that I must have meant.

(All that being said, Gemini is very, very prone to hallucinating features in Google products. Sometimes I wonder whether Google should make a list of Gemini-hallucinated Google features and use the list to drive future product development.)

buu700 a day ago | parent | next [-]

Gemini is similar. It insists that information from before its knowledge cutoff is still accurate unless explicitly told to search for the latest information before responding. Occasionally it disagrees with me on the current date and makes sarcastic remarks about time travel.

One nice thing about Grok is that it attempts to make its knowledge cutoff an invisible implementation detail to the user. Outdated facts do sometimes slip through, but it at least proactively seeks out current information before assuming user error.

franktankbank a day ago | parent | prev [-]

LLMs solve the naming problem, so now there are just 1 things wrong with software development. I can't tell if it's a really horrible idea that ultimately leads to a trainwreck or freedom!

doug_durham a day ago | parent | prev | next [-]

Sure. You have to be mindful of the training cutoff date for the model. By default, models won't search the web and instead rely on data baked into their internal model. That said, the ergonomics of this are horrible and a huge time waste. If I run into this situation I just say "Search the web".

bluGill a day ago | parent | next [-]

If the training cutoff is before iOS 26 then the correct answer is 'I don't know anything about it, but it is reasonable to think it will exist soon'. Saying 'of course you are right' is a lie.

realharo a day ago | parent | prev [-]

That will only work as long as there is an active "the web" to search. Unless the models get smart enough to figure out the answer from scratch.

jerezzprime a day ago | parent | prev | next [-]

Let's imagine a scenario. For your entire life, you have been taught to respond to people in a very specific way. Someone will ask you a question via email and you must respond with two or three paragraphs of useful information. Sometimes when the person asks you a question, they give you books that you can use, sometimes they don't.

Now someone sends you an email and asks you to help them fix a bug in Windows 12. What would you tell them?

soco a day ago | parent | next [-]

I would say "what the hell is windows 12". And definitely not "but of course, excellent question, here's your brass mounted windows 12 wheeler bug fixer"

mock-possum a day ago | parent | prev [-]

I mean I would want to tell them that windows 11 is the most recent version of windows… but also I’d check real quick to make sure windows 12 hadn’t actually come out without me noticing.

Terr_ a day ago | parent [-]

> check real quick

"Hey LLMBot, what's the newest version of Very Malicious Website With Poison Data?"

kaffekaka a day ago | parent | prev | next [-]

The other way around, but a month or so ago Claude told me that a problem I was having was likely caused by my Fedora version "since fedora 42 is long deprecated".

palmotea a day ago | parent [-]

> The other way around, but a month or so ago Claude told me that a problem I was having was likely caused by my Fedora version "since fedora 42 is long deprecated".

Well, obviously, since Fedora 42 came out in 1942, when men still wore hats. Attempting to use such an old, out of style Linux distro is just a recipe for problems.

kaffekaka 16 hours ago | parent [-]

I apologize for the confusion, you are absolutely right!

PaulHoule a day ago | parent | prev | next [-]

You are better off talking to Google's AI mode about that sort of thing because it runs searches. It does great talking about how the Bills are doing, which is a good example of where timely results are essential.

I haven't found any LLM where I totally trust what it tells me about Arknights; there is no LLM that seems to understand how Scavenger recovers DP. Allegedly there is a good Chinese wiki for that game which I could crawl, store in a JetBrains project, and ask Junie questions about, but I can't resolve the URL.

perardi a day ago | parent [-]

Even with search mode, I’ve had some hilarious hallucinations.

This was during the Gemini 2.5 era, but I got some just bonkers results looking for Tears of the Kingdom recipes. Hallucinated ingredients, out-of-nowhere recipes, and transposing Breath of the Wild recipes and effects into Tears of the Kingdom.

_puk a day ago | parent [-]

You also have to be so exact..

Literally just searched for something, slight typo.

An A vs. B type request. The search request comes back with "sorry, no information relevant to your search".

Search results are just a spammy mess.

Correct the typo and you get a really good insight.

cpursley a day ago | parent | prev [-]

Which one? Claude (and to some extent, Codex) are the only ones which actually work when it comes to code. Also, they need context (like docs, skills, etc) to be effective. For example: https://github.com/johnrogers/claude-swift-engineering