> But frontier models have become really good, and running vending machines is too easy for them now.

Wasn't their previous attempt at running vending machines unprofitable? Not aware of any demonstration that it can actually run that business successfully.

▲

ivanovm 5 hours ago | parent | next [-]

You could just look it up on their website leaderboard? The newest Claude model makes over $10k in profit over a simulated year of operation, after starting with $500.

▲

jeffreyrogers 5 hours ago | parent | next [-]

They've never translated it to the real world though. So saying the problem is "too easy" when they have no public (as far as I know) demonstration that they've solved that problem is a stretch.

▲

ivanovm 5 hours ago | parent [-]

Yes, they did. You could also find this information easily. A company like Andon creates value by exposing interesting AI failure modes, so it makes perfect sense for them to move on to harder problems when the previous ones get saturated. I think you're just being overly cynical.

▲

jeffreyrogers 4 hours ago | parent [-]

Can you point me to an example then? It's not linked in the article as far as I can tell and it's not easy to find on their website if it's there. I don't count simulations because I used to work with simulations regularly and they often fail to translate to the real world.

▲

Tallain an hour ago | parent | prev | next [-]

Since when is a simulation equal to real world performance?

▲

pocksuppet 5 hours ago | parent | prev [-]

So in other words, no, an LLM has never made profit.

▲

delusional 5 hours ago | parent | prev | next [-]

> Wasn't their previous attempt at running vending machines unprofitable?

If we are talking about the one at that newspaper, it wasn't just unprofitable. The "customers" made it give away products for free. It was ordering them PlayStations.

As entertainment it was fun, but as a business or proof of intelligence or Turing test, it was an abject failure.

▲

yieldcrv 3 hours ago | parent | prev | next [-]

Anything you read that's more than 3 months old in this field is obsolete.

And one person’s attempt doesn’t mean anything.

According to LinkedIn articles, agentic workflows don't work. Mine have been running for a year for several organizations I’ve worked for. Prompting used to be much more particular, and now it's not the issue.

▲

Chaosvex 3 hours ago | parent [-]

> Anything you read that's more than 3 months old in this field is obsolete

Sigh. I'll see you in another three months when you say the same again.

▲

yieldcrv 2 hours ago | parent [-]

I set an alarm to re-evaluate all of my workflows to avoid complacency, see you in July.

Three months ago I was still building webapps. I’m definitely on the “paying to summarize info on a screen is obsolete” bandwagon now. All my products just have an AI calling or messaging customers about what the AI did: event-driven architectures triggered by something hitting an email inbox, or something in the real world, or another API. You don't need an app for your fitness tracker, just have an AI person tell you what you’re doing right and wrong once a week, send you food and medicine, and tell you why. Solve the underlying problem, the way all the old depictions of the 21st century portrayed aligned robots doing. Apps were a distraction. Very curious where I’m at with this in July.

▲

palmotea 5 hours ago | parent | prev | next [-]

> Wasn't their previous attempt at running vending machines unprofitable? Not aware of any demonstration that it can actually run that business successfully.

It doesn't look like this one will be any better. Did you look at the merchandise selection? Its only chance is pity purchases from AI bros.

▲

AndrewKemendo 5 hours ago | parent | prev [-]

[flagged]