| ▲ | atleastoptimal 5 days ago |
| > No, LLMs will not get better.
| What makes you so sure of this? They've been getting better like clockwork every few months for the past 5 years. |
|
| ▲ | bigstrat2003 5 days ago | parent | next [-] |
| I don't claim that they won't get better, but they certainly haven't so far. From the original release of ChatGPT to now, they still suck in the exact same ways. |
| |
| ▲ | johnisgood 5 days ago | parent [-] |
| I don't think they have gotten better either (at least in the past year), because I remember how much better ChatGPT or even Claude used to be. Perhaps they are nerfed now for commercial use; who knows. |
|
|
| ▲ | otabdeveloper4 5 days ago | parent | prev [-] |
| No they haven't. They hallucinate exactly as much as they did five years ago. |
| |
| ▲ | atleastoptimal 5 days ago | parent | next [-] |
| Absolutely untrue. Claiming GPT-3 hallucinates as much as o3 over the same token horizon on the same prompts is a silly notion, easily disproven by dozens of benchmarks. You can build a complete web app with models now, something far beyond the means of models back then. |
| ▲ | otabdeveloper4 5 days ago | parent [-] |
| > caveats and weasel words
| > "benchmarks"
| Stop drinking the Kool-Aid and making excuses for LLM limitations; learn to use the tools properly given their limits instead. |
| |
| ▲ | antihero 5 days ago | parent | prev [-] |
| They really don’t though. |
| ▲ | otabdeveloper4 5 days ago | parent [-] |
| Larger context lengths are awesome, but they don't fundamentally change the failure modes of LLMs. |
|
|