kibwen 4 days ago

> I was astonished that half the internet is convinced that OpenAI is cheating.

If you have a problem and all of your potential solutions are unlikely, then it's fine to assume the least unlikely solution while acknowledging that it's statistically probable that you're also wrong. IOW if you have ten potential solutions to a problem and you estimate that the most likely solution has an 11% chance of being true, it's fine to assume that solution despite the fact that, by your own estimate, you have an 89% chance of being wrong.
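A minimal sketch of that arithmetic, assuming purely for illustration that the other nine hypotheses split the remaining 89% evenly (the names `H1` through `H10` are hypothetical labels, not anything from the discussion):

```python
# Ten candidate explanations; the leading one is estimated at 11%.
# Assumption for illustration: the other nine share the remaining 89% equally.
hypotheses = {"H1": 0.11}
for i in range(2, 11):
    hypotheses[f"H{i}"] = 0.89 / 9

# Pick the hypothesis that is most likely relative to the others.
best = max(hypotheses, key=hypotheses.get)
p_best = hypotheses[best]

# Its absolute likelihood is still low: this is the chance the pick is wrong.
p_wrong = 1.0 - p_best

print(best, round(p_best, 2), round(p_wrong, 2))
```

The pick is the best available one in relative terms, while in absolute terms it is still more likely wrong than right.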

The "OpenAI is secretly calling out to a chess engine" hypothesis always seemed unlikely to me (you'd think it would play much better, if so), but it seemed the easiest solution (Occam's razor) and I wouldn't have been surprised to learn it was true (it's not like OpenAI has a reputation for being trustworthy).

slibhb 4 days ago | parent | next [-]

I don't think it has anything to do with your logic here. Actually, people just like talking shit about OpenAI on HN. It gets you upvotes.

Legend2440 4 days ago | parent [-]

LLM cynicism exceeds LLM hype at this point.

bongodongobob 4 days ago | parent | prev | next [-]

That's not really how Occam's razor works. The entire company colluding and lying to the public isn't "easy". Easy is more along the lines of "for some reason it is good at chess but we're not sure why".

simonw 4 days ago | parent | next [-]

One of the reasons I thought that was unlikely was personal pride. OpenAI researchers are proud of the work that they do. Cheating by calling out to a chess engine is something they would be ashamed of.

kibwen 4 days ago | parent [-]

> OpenAI researchers are proud of the work that they do.

Well, the failed revolution from last year combined with the non-profit bait-and-switch pretty much conclusively proved that OpenAI researchers are in it for the money first and foremost, and pride has a dollar value.

fkyoureadthedoc 4 days ago | parent [-]

How much say do individual researchers even have in this move?

And how does that prove anything about their motivations "first and foremost"? They could be in it because they like the work itself, and secondary concerns like open or not don't matter to them. There are basically infinite interpretations of their motivations.

dogleash 4 days ago | parent | prev [-]

> The entire company colluding and lying to the public isn't "easy".

Why not? Stop calling it "the entire company colluding and lying" and start calling it a "messaging strategy among the people not prevented from speaking by NDA." That will pass a casual Occam's test that "lying" failed. But they both mean the same exact thing.

TeMPOraL 4 days ago | parent [-]

It won't, for the same reason: whenever you're proposing a conspiracy theory, you have to explain what stops every person involved from leaking the conspiracy, whether on purpose or by accident. This gets superlinearly harder with the number of people involved, and extra hard when there are incentives rewarding leaks (and leaking OpenAI secrets has some strong potential rewards).

Occam's test applies to the full proposal, including the explanation of things outlined above.

og_kalu 4 days ago | parent | prev | next [-]

> but it seemed the easiest solution (Occam's razor)

In my opinion, it only seems like the easiest solution on the surface, taking basically nothing into account. By the time you start looking at everything in context, it just seems bizarre.

kibwen 3 days ago | parent [-]

To reiterate, your assessment is true and we can assign it a low probability, but in the context of trying to explain why one model would be an outrageous outlier, manual intervention was the simplest solution out of all the other hypotheses, despite being admittedly bizarre. The thrust of the prior comment is precisely to caution against conflating relative and absolute likelihoods.

influx 4 days ago | parent | prev [-]

I wouldn't call delegating specialized problems to specialized engines cheating. While it should be documented, in a full AI system, I want the best answer regardless of the technology used.