lowbloodsugar 2 days ago

You run a 671B model at home?

segmondy 2 days ago | parent | next [-]

Yes, and plenty of others do too. Quantized. Join us at r/localllama

My largest models:

   318G    /llmzoo/models/Qwen3.5-397B
   377G    DeepSeekv3.2-nolight
   380G    /llmzoo/models/DeepSeek-V3.2-UD
   400G    /llmzoo/models/Qwen3.5-397B-Q8
   443G    DeepSeek-Math-v2
   443G    DeepSeek-V3-0324-Q5
   522G    /llmzoo/models/GLM5.1
   545G    /llmzoo/models/kimi2.6
   546G    /llmzoo/models/KimiK2.5
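For scale, the weight-only memory of a quantized model works out roughly as parameter count times bits per weight. A minimal sketch (ignores KV cache, activations, and per-tensor overhead, and real quants often mix bit-widths):

```python
# Rough weight-only memory estimate for a quantized model.
# Sketch only: ignores KV cache, activations, and format overhead.
def model_size_gb(params_billion: float, bits: int) -> float:
    bytes_total = params_billion * 1e9 * bits / 8
    return bytes_total / 1e9  # decimal GB

for bits in (1, 4, 8):
    print(f"671B @ {bits}-bit: ~{model_size_gb(671, bits):.0f} GB")
```

At 4-5 bits per weight, a 671B model lands in the 335-420 GB range, consistent with the file sizes above.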

danilocesar 2 days ago | parent | next [-]

Is your house's heating system based on H100s?

Liftyee 2 days ago | parent | prev | next [-]

What hardware do you use?

MezzoDelCammin a day ago | parent | next [-]

I think the answer to this is: "yes"

CoolThings a day ago | parent | prev | next [-]

a Beowulf cluster of 256 x Raspberry Pi 3.

hhh an hour ago | parent [-]

I used to maintain a 2,000-node Pi 4 cluster, before LLMs were relevant, with around 6 GB of free RAM per node. I wonder what I could have done with something like this.

tclancy a day ago | parent | prev [-]

All of it.

chid a day ago | parent | prev [-]

even quantised, those are HUGE

tclancy 2 days ago | parent | prev | next [-]

It's a big house.

UncleOxidant a day ago | parent | prev | next [-]

Maybe if there was a 1-bit quant.

barbacoa 19 hours ago | parent [-]

Apple was briefly selling the Mac Studio with 512 GB of unified RAM, meaning all of it was available as VRAM.
