lowbloodsugar 2 days ago

You run a 671B model at home?

segmondy 2 days ago | parent | next [-]

Yes, and plenty of others do too. Quantized. Join us at r/localllama

My largest models:

   318G    /llmzoo/models/Qwen3.5-397B
   377G    DeepSeekv3.2-nolight
   380G    /llmzoo/models/DeepSeek-V3.2-UD
   400G    /llmzoo/models/Qwen3.5-397B-Q8
   443G    DeepSeek-Math-v2
   443G    DeepSeek-V3-0324-Q5
   522G    /llmzoo/models/GLM5.1
   545G    /llmzoo/models/kimi2.6
   546G    /llmzoo/models/KimiK2.5
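For scale, the weight-only memory of a quantized model works out roughly as parameter count times bits per weight. A minimal sketch (ignores KV cache, activations, and per-tensor overhead, and real quants often mix bit-widths):

```python
# Rough weight-only memory estimate for a quantized model.
# Sketch only: ignores KV cache, activations, and format overhead.
def model_size_gb(params_billion: float, bits: int) -> float:
    bytes_total = params_billion * 1e9 * bits / 8
    return bytes_total / 1e9  # decimal GB

for bits in (1, 4, 8):
    print(f"671B @ {bits}-bit: ~{model_size_gb(671, bits):.0f} GB")
```

At 4-5 bits per weight, a 671B model lands in the 335-420 GB range, consistent with the file sizes above.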

danilocesar 2 days ago | parent | next [-]

Is your house's heating system based on H100s?

Liftyee 2 days ago | parent | prev | next [-]

What hardware do you use?

MezzoDelCammin a day ago | parent | next [-]

I think the answer to this is: "yes"

CoolThings a day ago | parent | prev | next [-]

a Beowulf cluster of 256 x Raspberry Pi 3.

hhh an hour ago | parent [-]

I used to maintain a 2,000-node Pi 4 cluster, before LLMs were relevant, with around 6 GB of free RAM per node. I wonder what I could have done with something like this.

tclancy a day ago | parent | prev [-]

All of it.

chid a day ago | parent | prev [-]

even quantised, those are HUGE

tclancy 2 days ago | parent | prev | next [-]

It's a big house.

UncleOxidant a day ago | parent | prev | next [-]

Maybe if there was a 1-bit quant.

barbacoa 19 hours ago | parent [-]

Apple was briefly selling the Mac Studio with 512 GB of unified RAM, meaning all of it was available as VRAM.
