skydhash 4 hours ago
If a calculator gives me 5 when I do 2+2, I throw it away. If a PC crashes when I use more than 20% of its soldered memory, I throw it away. If a mobile phone refuses to connect to a cellular tower, I get another one. What I want from my tools is reliability. Which is a spectrum, but LLMs are very much on the lower end.
tokioyoyo 3 hours ago
You can have this position, but the reality is that the industry is accepting it and moving forward. Whether you'll embrace some of it and use it to improve your workflow is up to you. But exaggerating the problem to this point is kinda funny.
crazygringo an hour ago
Honestly, LLMs are about as reliable as the rest of my tools are. Just yesterday, AirDrop wouldn't work until I restarted my Mac. Google Drive wouldn't sync properly until I restarted it. And a bug in Screen Sharing's file transfer used up 20 GB of RAM to move a 40 GB file, which pushed the system into swap until my hard drive ran out of space.

My regular software breaks constantly. All the time. It's a rare day when everything works as it should. LLMs have certainly gotten to the point where they seem about as reliable as the rest of the tools I use.

I've never seen one say 2+2=5. I'm not going to use it for complicated arithmetic, but that's not what it's for. I'm also not going to ask my calculator to write code for me.
candiddevmike an hour ago
Sorry you're being downvoted even though you're 100% correct. There are use cases where LLMs' poor reliability is still as good as or better than the alternatives (like search and summarization), but arguing over whether LLMs are reliable is silly. And if you need reliability (or even just consistency) for your use case, LLMs are not the right tool.
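To make the consistency point concrete, here's a minimal Python sketch; summarize is a hypothetical stand-in for a real model call, with random.choice simulating sampling variance between runs:

    import random

    def summarize(text: str) -> str:
        # Hypothetical stand-in for a real LLM call; random.choice
        # simulates sampling variance between runs of the same prompt.
        return random.choice([
            "Users debate whether LLMs are reliable tools.",
            "Thread argues LLMs trade reliability for flexibility.",
        ])

    post = "If a calculator gives me 5 when I do 2+2, I throw it away."
    a, b = summarize(post), summarize(post)
    if a != b:
        # Two runs, two answers: acceptable for summarization, where
        # many outputs are fine; disqualifying where you need one
        # deterministic, correct result.
        print("Non-deterministic:", a, "vs", b)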
fennecfoxy 3 hours ago
Except it's more a case of "my phone won't teleport me to Hawaii, sad faec, lemme throw it out" than anything else. Plenty of people manufacture their expectations of LLM capabilities inside their own heads for some reason. Sure, there's marketing; but for individuals susceptible to marketing without engaging some neurons and fact-checking, there's already not much hope. Imagine refusing to drive a car in the '60s because they hadn't reached 1,000 bhp yet. Ahaha.
embedding-shape 3 hours ago
> What I want from my tools is reliability. Which is a spectrum, but LLMs are very much on the lower end.

"Reliability" can mean multiple things, though. LLM invocations are as reliable as any other software invocation (granted you know how to program properly); if you're seeing crashes, you're doing something wrong. What you're really talking about, I think, is "correctness" of the actual text in the response. And if you're expecting that to be 100% accurate every time, then yeah, that's not a use case for LLMs, and I don't think anyone is arguing for jamming LLMs in there even today. Where LLMs are useful is where there is no 100% right-or-wrong answer: think summarization, categorization, tagging and so on.
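A minimal sketch of that split between invocation reliability and output correctness, assuming a hypothetical call_llm client and a made-up tag set (none of this is a real library's API):

    import time

    def call_llm(prompt: str) -> str:
        # Stand-in for a real LLM API call; swap in any actual client.
        return "discussion"

    def reliable_invoke(prompt: str, retries: int = 3, backoff: float = 2.0) -> str:
        # Invocation-level reliability: retry transient transport
        # failures. Says nothing about whether the text is correct.
        for attempt in range(retries):
            try:
                return call_llm(prompt)
            except (TimeoutError, ConnectionError):
                time.sleep(backoff ** attempt)
        raise RuntimeError("LLM endpoint unavailable after retries")

    # A "no single right answer" use case: tagging. Several tags can
    # all be acceptable, so we only validate the output's shape.
    ALLOWED_TAGS = {"bug", "feature", "question", "discussion"}

    def tag_post(text: str) -> str:
        raw = reliable_invoke(
            f"Pick one tag from {sorted(ALLOWED_TAGS)} for this post:\n{text}"
        )
        tag = raw.strip().lower()
        return tag if tag in ALLOWED_TAGS else "discussion"  # safe fallback

    print(tag_post("LLMs crash my workflow, should I file this somewhere?"))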