▲ | systemerror 3 days ago |
The big issue with LLMs is that they're usually right, maybe 90% of the time, but that last 10% is tough to fix. A 10% failure rate might sound small, but at scale it's significant, especially when it includes false positives. You end up either living with some bad results, building something to automatically catch mistakes, or having a person double-check everything if you want to bring that error rate down.
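To make the scale point concrete, here's a rough sketch of the "have a person double-check" option, assuming a hypothetical classify() that returns a label plus a confidence score (the helper name and threshold are illustrative only, not from the comment):

    # Illustrative only: at a 10% error rate, volume makes the absolute
    # number of mistakes large, so one option is routing low-confidence
    # outputs to a human reviewer instead of accepting everything.

    def triage(items, classify, threshold=0.9):
        # `classify` is a hypothetical function returning (label, confidence)
        auto_accepted, needs_review = [], []
        for item in items:
            label, confidence = classify(item)
            if confidence >= threshold:
                auto_accepted.append((item, label))
            else:
                needs_review.append((item, label))
        return auto_accepted, needs_review

    # Back-of-envelope scale check: 10% of 1,000,000 items is 100,000 errors.
    print(f"{int(1_000_000 * 0.10):,} expected errors per million items")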
▲ | f3b5 3 days ago | parent | next [-]
Depending on the use case, a 10% failure rate can be quite acceptable. This is of course for non-critical applications, e.g. top-of-funnel sales automation. In practice, for simple uses like labeling data at scale, I'm actually reaching 95-99% accuracy at my startup.
▲ | spogbiper 3 days ago | parent | prev [-]
Yes, the entire design relies on a human to check everything. Basically, it presents what it thinks should be done, and why; the human then agrees or does not. Much work is put into streamlining this, but ultimately it's still human-controlled.
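A minimal sketch of that kind of approve/reject loop (the propose_action and execute helpers here are hypothetical placeholders, not the commenter's actual system):

    # Minimal human-in-the-loop sketch: the model proposes an action and its
    # reasoning, and nothing runs until a person explicitly approves it.
    def review_loop(tasks, propose_action, execute):
        for task in tasks:
            action, rationale = propose_action(task)  # model's suggestion + why
            print(f"Task: {task}")
            print(f"Proposed: {action}")
            print(f"Reasoning: {rationale}")
            if input("Approve? [y/n] ").strip().lower() == "y":
                execute(action)  # only runs after explicit sign-off
            else:
                print("Skipped; left for manual handling.")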