sigmoid10 5 hours ago
Managing a McDonald's is a question of integration and modalities at this point. I don't think anyone still doubts that these models have the reasoning capability and world knowledge needed for the job. So it's less of a fundamental technical problem and more of a process engineering issue.
dap 37 minutes ago
Have you not seen: https://www.anthropic.com/research/project-vend-1 https://www.wsj.com/tech/ai/anthropic-claude-ai-vending-mach... (Two different examples of a similar idea)
andy12_ 4 hours ago
I disagree. Even frontier models still score way below the human baseline on VendingBench. As long as models can't optimally manage something as simple as a vending machine, they have no hope of managing a McDonald's.
throw-the-towel 5 hours ago
The capability they lack is being able to be sued.