vntok 4 hours ago

2 years ago, LLMs failed at answering coherently. Last year, they failed at answering fast on optimized servers. Now, they're failing at answering fast on underpowered handheld devices... I can't wait to see what they'll be failing to do next year.

ezst 4 hours ago | parent | next [-]

Probably the one elephant-in-the-room thing that matters: failing to say they don't know / can't answer.

eru 4 hours ago | parent | next [-]

With tool use, it's actually quite doable!
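
E.g. a minimal sketch (assuming the official openai Python client; search_docs, the model name, and the prompts are placeholders, not anyone's production setup): give the model a lookup tool plus a standing instruction to admit ignorance when the tool comes back empty, and "I don't know" becomes an explicit branch rather than something you hope the model volunteers.

    import json
    from openai import OpenAI  # assumes the official openai client (>= 1.0)

    client = OpenAI()

    def search_docs(query: str) -> str:
        # Hypothetical retrieval stub; empty string means "nothing found".
        return ""

    tools = [{
        "type": "function",
        "function": {
            "name": "search_docs",
            "description": "Look up a fact. Returns an empty string if unknown.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
            },
        },
    }]

    messages = [
        {"role": "system", "content":
            "Answer only from search_docs results. "
            "If the tool returns NO_RESULTS, reply exactly: I don't know."},
        {"role": "user", "content": "What did our Q3 postmortem conclude?"},
    ]

    resp = client.chat.completions.create(
        model="gpt-4o-mini", messages=messages, tools=tools)
    msg = resp.choices[0].message

    if msg.tool_calls:  # the model chose to look it up instead of guessing
        call = msg.tool_calls[0]
        result = search_docs(**json.loads(call.function.arguments))
        messages += [msg, {"role": "tool", "tool_call_id": call.id,
                           "content": result or "NO_RESULTS"}]
        final = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages, tools=tools)
        print(final.choices[0].message.content)  # expected: "I don't know"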

post-it 4 hours ago | parent | prev [-]

Claude does it all the time, in my experience.

stavros 3 hours ago | parent [-]

Same here; it's even told me "I don't have much experience with this, you probably know better than me, want me to help with something else?"

BirAdam an hour ago | parent | prev [-]

The speed on a constrained device isn't entirely the point.

> Two years ago, LLMs failed at answering coherently. Now...

You're absolutely right. Now, LLMs are too slow to be useful on handheld devices, and the future of LLMs is brighter than ever.

LLMs can be useful, but quite often the responses are about as painful as LinkedIn posts. Will they get better? Maybe. Will they get worse? Maybe.

vntok 36 minutes ago | parent [-]

> Will they get better? Maybe. Will they get worse? Maybe.

I find it hard to understand your uncertainty. How could they not keep getting better when we've been seeing qualitative improvements every other week for months on end? These improvements are eminently public and span multiple relevant dimensions: raw inference speed (https://github.com/ggml-org/llama.cpp/releases), external-facing capabilities (https://github.com/open-webui/open-webui/releases), and performance against established benchmarks (https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks).