causal 6 hours ago

Run an incredible 400B parameters on a handheld device.

0.6 t/s, wait 30 seconds to see what these billions of calculations get us:

"That is a profound observation, and you are absolutely right ..."

intrasight 5 hours ago | parent | next [-]

Better than waiting 7.5 million years to have it tell you the answer is 42.

bartread 4 hours ago | parent | next [-]

Looked at a certain way it's incredible that a 40-odd year old comedy sci-fi series is so accurate about the expected quality of (at least some) AI output.

Which makes it even funnier.

It makes me a little sad that Douglas Adams didn't live to see it.

patapong 3 hours ago | parent | next [-]

Also check out "The Great Automatic Grammatizator" by Roald Dahl for another eerily accurate sci-fi description of LLMs, written in 1953:

https://gwern.net/doc/fiction/science-fiction/1953-dahl-theg...

zozbot234 3 hours ago | parent [-]

"Can write a prize-winning novel in fifteen minutes" - that's quite optimistic by modern standards!

staticman2 3 hours ago | parent | prev [-]

42 wasn't a low quality answer.

The joke revolves around the incongruity of "42" being precisely correct.

whyenot 4 hours ago | parent | prev | next [-]

Should have used a better platform. So long and thanks for all the fish.

5 hours ago | parent | prev | next [-]
[deleted]
AnonymousPlanet 4 hours ago | parent | prev | next [-]

Yes and then no one knows the prompt!

thinkingtoilet 5 hours ago | parent | prev | next [-]

Maybe you should have asked a better question. :P

patapong 5 hours ago | parent [-]

What do you get if you multiply six by nine?

ctxc 4 hours ago | parent | next [-]

Tea

GTP 3 hours ago | parent [-]

For two

RuslanL 4 hours ago | parent | prev | next [-]

67?

xeyownt 4 hours ago | parent | prev [-]

54?

ep103 4 hours ago | parent | prev [-]

Someone should let Douglas Adams know the calculation could have been so much faster if the machine had just lied.

lesam 4 hours ago | parent [-]

I think Adams was prescient, since in his story the all powerful computer reaches the answer '42' via incorrect arithmetic.
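(Editor's aside, a fan observation not stated in the thread: the arithmetic only "works" if you squint at the base. 6 × 9 is 54 in decimal, and 54 written in base 13 happens to read as "42". A quick sketch:)

```python
# Fan observation: 6 * 9 = 54 in decimal, and 54 rendered in
# base 13 is "42", since 4 * 13 + 2 == 54.
product = 6 * 9
digits = divmod(product, 13)  # (base-13 "tens" digit, "units" digit)
print(product, digits)        # 54 (4, 2)
```

(Adams himself denied intending this, quipping that he doesn't write jokes in base 13.)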

xg15 4 hours ago | parent [-]

The Bistromathics? That's not incorrect, it's simply too advanced for us to understand.

aktau an hour ago | parent [-]

“What do you get if you multiply six by nine?”

(One) source: https://www.reddit.com/r/Fedora/comments/1mjudsm/comment/n7d...

xg15 an hour ago | parent [-]

Ok, my Hitchhiker-fu was too weak, thanks!

WarmWash 5 hours ago | parent | prev | next [-]

I don't think we are ever going to win this. The general population loves being glazed way too much.

baal80spam 5 hours ago | parent | next [-]

> The general population loves being glazed way too much.

This is 100% correct!

WarmWash 5 hours ago | parent [-]

Thanks for the short warm blast of dopamine, no one else ever seems to grasp how smart I truly am!

timcobb 5 hours ago | parent [-]

That is an excellent observation.

otikik 4 hours ago | parent | prev | next [-]

The other day, I got:

"You are absolutely right to be confused"

That was the closest AI has been to calling me "dumb meatbag".

winwang 4 hours ago | parent | next [-]

It would be much worse if it had said "You are absolutely wrong to be confused", haha.

Terretta 4 hours ago | parent | prev [-]

"Carrot: The Musical" in the Carrot weather app, all about the AI and her developer meatbag, is on point.

tombert 5 hours ago | parent | prev | next [-]

That's an astute point, and you're right to point it out.

actusual 5 hours ago | parent [-]

You are thinking about this exactly the right way.

9dev 5 hours ago | parent | prev | next [-]

You’re absolutely right!

keybored 3 hours ago | parent | prev [-]

Poor “we”. “They” love looking at their own reflection too much.

Aurornis 5 hours ago | parent | prev | next [-]

I thought you were being sarcastic until I watched the video and saw those words slowly appear.

Emphasis on slowly.

r_lee 4 hours ago | parent | prev | next [-]

I too thought you were joking

laughed when it slowly began to type that out

vntok 4 hours ago | parent | prev | next [-]

2 years ago, LLMs failed at answering coherently. Last year, they failed at answering fast on optimized servers. Now, they're failing at answering fast on underpowered handheld devices... I can't wait to see what they'll be failing to do next year.

ezst 4 hours ago | parent | next [-]

Probably the one elephant-in-the-room thing that matters: failing to say they don't know / can't answer.

eru 4 hours ago | parent | next [-]

With tool use, it's actually quite doable!

post-it 4 hours ago | parent | prev [-]

Claude does it all the time, in my experience.

stavros 3 hours ago | parent [-]

Same here, it's even told me "I don't have much experience with this, you probably know better than me, want me to help with something else?".

BirAdam an hour ago | parent | prev [-]

The speed on a constrained device isn't entirely the point. Two years ago, LLMs failed at answering coherently. Now...

You're absolutely right. Now, LLMs are too slow to be useful on handheld devices, and the future of LLMs is brighter than ever.

LLMs can be useful, but quite often the responses are about as painful as LinkedIn posts. Will they get better? Maybe. Will they get worse? Maybe.

vntok 31 minutes ago | parent [-]

> Will they get better? Maybe. Will they get worse? Maybe.

I find it hard to understand your uncertainty; how could they not keep getting even better when we've been seeing qualitative improvements literally every second week for months on end? These improvements are eminently public and span multiple relevant dimensions: raw inference speed (https://github.com/ggml-org/llama.cpp/releases), external-facing capabilities (https://github.com/open-webui/open-webui/releases), and performance against established benchmarks (https://unsloth.ai/docs/models/qwen3.5/gguf-benchmarks).

amelius 5 hours ago | parent | prev [-]

I mean, size alone says nothing; you could do it on a Pi Zero with sufficient storage attached.

So this post is like saying that, yes, an iPhone is Turing complete. Or at least not locked down so far that you're unable to do it.

zozbot234 4 hours ago | parent [-]

You need fast storage to make it worthwhile. PCIe 5.0 x4 is a reasonable minimum, or multiple PCIe 4.0 x4 drives accessed in parallel, though the latter is challenging since the individual expert layers are usually small. Intel Optane drives are worth experimenting with for the parallel setup (they are stuck on PCIe 4.0) purely for their good random-read properties, quite aside from their wear-out resistance, which opens up use for the KV cache and even activations.
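(Editor's aside, a back-of-envelope sketch of why storage bandwidth dominates here. The numbers are illustrative assumptions, not from the thread: a sparse MoE model with roughly 20B parameters active per token, quantized to 4 bits, with the active weights streamed from the drive on every token.)

```python
# Rough token-rate estimate when expert weights stream from storage.
# Assumptions (illustrative): ~20B active params/token, 4-bit quantization.
active_params = 20e9
bytes_per_param = 0.5                              # 4-bit quantization
bytes_per_token = active_params * bytes_per_param  # ~10 GB read per token

# Approximate sequential-read throughput of typical NVMe drives.
for name, gbps in [("PCIe 4.0 x4 NVMe", 7e9), ("PCIe 5.0 x4 NVMe", 14e9)]:
    tokens_per_s = gbps / bytes_per_token
    print(f"{name}: ~{tokens_per_s:.2f} t/s")  # ~0.70 and ~1.40 t/s
```

Under these assumptions a single PCIe 4.0 x4 drive lands in the same ballpark as the 0.6 t/s figure from the top comment, which is why zozbot234's point about faster or parallel storage (and random-read performance, since experts are fetched non-sequentially) is the crux.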