ModernMech 6 days ago
We do have comparable tools to LLMs. There are plenty of human-composed tools that can do what LLMs do, like Mechanical Turk for instance. The human-composed tool that most closely resembles an LLM is the "bureaucracy". An LLM is like a little committee you send a request to, and based on capricious, opaque rules that can change at any time, the committee returns a response that may or may not service your request. You can't know ahead of time if it will, and depending on the time of day, the political environment, or the amount of work before the committee, the delay in servicing and the quality of service may or may not degrade. Quality may range from a quick, accurate response to a flat refusal to service without explanation, or outright lies to your face. There's no way to guarantee a good result, and there's no recourse or explanation for why things went wrong or changed.

LLMs feel like a customer service agent turned into a computer program, which is probably why people's first thought was to use LLMs to automate customer service agents. They are a perfect fit there, but I don't want them to be my primary interface to do work. I have enough bureaucracies to deal with as it is.
vidarh 6 days ago
> We do have comparable tools to LLMs. There are plenty of human-composed tools that can do what LLMs do, like Mechanical Turk for instance.

If you are going to treat humans as tools, then sure. In which case measuring LLMs against human ability is exactly the right thing, given that with Mechanical Turk the tasks are carried out by humans - sometimes with the help of LLMs...

It's utterly bizarre to argue over my comparing LLMs to humans when the tools you argue are comparable are humans.