mywittyname 2 hours ago

> As much as I hate to admit it, step one in most of my projects now is to ask AI about it. Maybe it’ll tell me something I don’t know.

Or, more likely, it will tell you something it doesn't know.

Reminds me of yesterday, when I was arguing with ChatGPT that the 5070 Ti was an actual video card. It kept trying to correct me by saying I must have meant a 4070 Ti, since no such 5070 Ti card exists.

collabs an hour ago

Or, it will acknowledge that it made a mistake and continue to make the same mistake again.

I asked Claude to generate an HTML page about PowerShell 7. It gave me a page saying 7.4 was the latest LTS release. I corrected it with links showing 7.6 was released in March and asked it to regenerate with the latest information.

It generated basically the same page with the same claim that 7.4 was the latest release.

ericmay an hour ago

> Or, it will acknowledge that it made a mistake and continue to make the same mistake again.

People do this too though. At least the AI generally tries to follow instructions that you give it even when you are lacking clarity in the details.

I feel like it's similar to the self-driving car problem. The car could have 99.9999% reliability, drive much better and safer than a human, yet folks will still freak out about a single mistake that's made even though you have actual humans today driving the wrong way down the highway, crashing into buildings, drunk driving, stealing cars, and all sorts of other just absolutely stupid things.

We need to move away from this idea that because it's an AI system it should give you perfect responses. It's not a deterministic system and it can be wrong, though it should get better over time. Your Google search results are wrong all the time too. The NYT writes things that are factually incorrect. Why do we have such a high standard for these models when we don't apply it elsewhere?

applfanboysbgon 35 minutes ago

> Your Google search results are wrong all the time too. The NYT writes things that are factually incorrect.

This is also very bad and people complain about these things all the fucking time.

> Why do we have such a high standard for these models

Because Altman and Amodei are defrauding investors out of hundreds of billions of dollars on the promise that they will replace the entire workforce. Of course people are going to point out the emperor has no clothes when half of our society is engaged in mass hysteria worshipping these fucking things as the next industrial revolution, diverting massive amounts of resources to them, and ruining HN with 10 articles on the front page per day about how software engineering is dead.

dvlsg 19 minutes ago

> ruining HN with 10 articles on the front page per day about how software engineering is dead.

Even this article, which is theoretically about playing games on a MacBook and not about AI, has devolved into AI discussions. It's honestly kind of tiring.

I suppose the article invites it by putting an AI blurb up top, and I suppose I'm also not helping by adding my own comment, but _still_.

ericmay 29 minutes ago

> This is also very bad and people complain about these things all the fucking time.

So at worst these AI tools are as bad as the existing system. Worth complaining about? Absolutely. Worth holding to much higher standards? Nah, I don't think so. Not at this stage at least. And folks are just disappointing themselves by setting up straw man expectations.

These tools are non-deterministic systems (like humans) which sometimes don't do exactly what you want (like humans) but are also extremely fast, much cheaper (for now), and have domain knowledge that is much broader than any single human's. Like anything else, there are pros and cons.

applfanboysbgon 26 minutes ago

They aren't "straw man expectations" when the entire US economy is now oriented around those expectations.

bryceacc an hour ago

>I corrected it with links

It should be reasonable to expect that you can give a source and fix an error in the AI output.

I would even go as far as to say if a human directly told the AI "no, use 7.6 as the latest version", the AI should absolutely follow direct instructions no matter what it thinks is true. What if this human was working on a slide about the upcoming release of 7.6 that has no public documentation?

corry an hour ago

LLMs are (broadly speaking) poorly positioned to give you a strong verdict on the plausibility of a frontier topic. That said, ChatGPT was exactly right in its response to OP!

"Very deep", "border-line impractical", "in a research-sense" is the perfect summary of this article itself! :)

perarneng an hour ago

This is why I use Grok expert mode. It aggressively goes out searching the web for info. It's so much better than relying on year-old data.

_blk an hour ago

Yes, I really like that about Grok. It had a few good qualities but it was too verbose so now it's mostly Claude.

JumpCrisscross an hour ago

Solid compromise is Kagi's research assistant. Aggressively cites, unlike Claude. Concise, unlike Grok.

funimpoded an hour ago

Watching the entire economy of a superpower and ~all of online culture go absolutely ga-ga over Furbys has been one of the weirdest things I've ever witnessed.

Apocryphon 33 minutes ago

Eh, in this use case it's more like a goofy search engine.

amluto an hour ago

At least ChatGPT is now aware that Codex exists. I have a chat, still in my history, from a few months ago, in which I asked for help wrangling npm to get @openai/codex working, and ChatGPT said:

> Important: Codex CLI no longer exists

> OpenAI discontinued the Codex model + CLI a while back. There is no official binary named codex in any current OpenAI npm packages. OpenAI’s current CLI tool is:

    npm install -g openai

> which installs the openai command, not codex.

The world knowledge of these models is not necessarily up to date :)

edit: I replayed the same prompt into current ChatGPT and it is less clueless now. Maybe OpenAI noticed that it was utterly dumb that GPT-5.whatever didn't believe that Codex existed and fine-tuned it.

sigmoid10 an hour ago

>The world knowledge of these models is not necessarily up to date :)

It's amazing how this still needs to be said. Codex was released in April 2025. The initial GPT-5 and 5.1 still had a knowledge cutoff in late 2024. Like, what did you expect? Always beware the knowledge cutoff for LLMs (although recent releases have gotten much better at searching the web for updates before answering questions on current software topics).

simonh an hour ago

Its training data only goes up to late 2024 or early 2025, so that might be why, though it does have access to the internet.

mywittyname an hour ago

Yeah, the solution was to link it to the NVIDIA page for the card, then it was like, 'oh, okay.' But at that point, I lost faith in its ability to provide me with the information I was looking for. If its information is so out of date that it doesn't know about the 5000 series, how could I be confident that it knew the details I was asking about (game engine related research)?

asats an hour ago

Are you using the instant model?

weird-eye-issue an hour ago

Depending on your ChatGPT settings...