IshKebab 5 days ago

Exactly. It feels like with LLMs, as soon as we achieved the at-the-time astounding breakthrough "LLMs can generate coherent stories" with GPT-2, people have constantly been like "yeah? Well it can't do <this thing that is really hard even for competent humans>."

That breakthrough was only 6 years ago!

https://openai.com/index/better-language-models/

> We’ve trained a large-scale unsupervised language model which generates coherent paragraphs of text...

That was big news. I guess this is because it's quite hard for most people to appreciate the enormous difficulty gulf between "generate a coherent paragraph" and "create a novel funny joke".

brookst 5 days ago | parent [-]

Same thing we saw with game playing:

- It can play chess -> but not at a serious level

- It can beat most people -> but not grandmasters

- It can beat grandmasters -> but it can’t play go

…etc, etc

In a way I guess it’s good that there is always some reason the current version isn’t “really” impressive, as it drives innovation.

But as someone more interested in a holistic understanding of the world than proving any particular point, it is frustrating to see the goalposts moved without even acknowledging how much work and progress were involved in meeting the goalposts at their previous location.

nothrabannosir 5 days ago | parent [-]

> it is frustrating to see the goalposts moved without even acknowledging how much work and progress were involved in meeting the goalposts at their previous location.

Half the HN front page for the past years has been nothing but acknowledging the progress of LLMs in sundry ways. I wish we actually stopped for a second. It’s all people seem to want to talk about anymore.

brookst 5 days ago | parent [-]

I should have been more clear. Let me rephrase as: among those who dismiss the latest innovations as nothing special because there is still further to go, it would be nice to see some acknowledgment when goalposts are moved.

nothrabannosir 4 days ago | parent [-]

Maybe the people raving about LLM progress are the same people holding them to those high standards?

I don’t see what’s inconsistent about it. “Due to this latest amazing algorithm, the robots keep scoring goals. What do we do? Let’s move them back a bit!” Seems like a normal way of thinking to me…

I see people fawn over technical progress every day. What are they supposed to do, stop updating their expectations and never expect any more progress?

It could of course be that there are people who “never give it up for the robots”. Or maybe they do, and they did, and they have so fully embraced the brave new world that they’re talking about what’s next.

I mean, when I sit in a train I don’t spend half the ride saying “oh my god this is incredible, big thanks to whoever invented the wheel. So smooth!”

Even though maybe I should :)

brookst 4 days ago | parent [-]

> I mean, when I sit in a train I don’t spend half the ride saying “oh my god this is incredible, big thanks to whoever invented the wheel. So smooth!”

Two thoughts:

- In that context, neither do you expect people to be invested in why the train is nothing special, it’s basically a horse cart, etc, etc

- And maybe here’s where I’m weird: I often am overcome by the miracle of thousands of tons of metal hurtling along at 50-200 mph, reliably, smoothly enough to work or eat, many thousands of times a day, for pennies per person per mile. I mean, I’ll get sucked into how the latches to release the emergency windows were designed and manufactured at scale despite almost none of them ever being used. But maybe that’s just me.

nothrabannosir 4 days ago | parent [-]

Louis CK did a bit on this: https://www.youtube.com/watch?v=PdFB7q89_3U :)

My point isn’t that other people shouldn’t be amazed, it’s that I see this recurring assumption they aren’t. How do you know the people holding LLMs to higher standards aren’t also the same people who herald the dawn of a new AI era?

Emphasis in the text you quoted: “saying”, not “thinking”.

brookst 3 days ago | parent [-]

My point was more that there is a subset of technical people who delight in the “they’re not perfect, because / therefore they are just glorified spellcheck” fallacy. Search this thread for “spell” or “parrot” to see examples.

So I don’t think it’s the same people, because the tone is not “they’re amazing but have farther to go”; there is a substantial group who at least claims to believe there’s no qualitative difference between Opus 4.1 and the spellcheck in Word ‘95.

Not trying to be argumentative here; I appreciate the conversation and you’ve helped me sharpen my point, which I appreciate.