segmondy 4 days ago:
Yes, and plenty of others do too. Quantized. Join us at r/localllama. My largest models:

318G  /llmzoo/models/Qwen3.5-397B
377G  DeepSeekv3.2-nolight
380G  /llmzoo/models/DeepSeek-V3.2-UD
400G  /llmzoo/models/Qwen3.5-397B-Q8
443G  DeepSeek-Math-v2
443G  DeepSeek-V3-0324-Q5
522G  /llmzoo/models/GLM5.1
545G  /llmzoo/models/kimi2.6
546G  /llmzoo/models/KimiK2.5
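As a rough sanity check on those sizes, the usual back-of-envelope estimate is parameters times bits per weight. This is a sketch, not anything from the thread: the formula ignores tokenizer files, per-tensor scales, and container overhead, so real files run a little larger.

```python
def quant_size_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough on-disk size of a quantized model, in GB:
    parameters (billions) x bits per weight / 8 bits per byte.
    Ignores metadata and quantization-scale overhead."""
    return params_billions * bits_per_weight / 8

# A 397B-parameter model at 8 bits per weight lands close to the
# 400G Qwen3.5-397B-Q8 entry in the list above.
print(f"{quant_size_gb(397, 8):.0f} GB")
```

The same arithmetic read backwards explains the 318G entry for the same base model: a lower bits-per-weight quant of the identical parameter count.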
danilocesar 3 days ago:
Is your house's heating system based on H100s?
Liftyee 4 days ago:
What hardware do you use?
Terretta 20 hours ago:
Most of those have custom quants for the Mac Studio M3 Ultra 512GB; you'll typically see it mentioned by name. Everything on that list except the last three runs at these sizes. For the last three, look for a custom quant (e.g. 9.5 bits) and/or a mention of the M3 Ultra 512GB. I'm not sure in which direction to be surprised, but a MacBook Pro M5 Max ticks over models at the same speed. With "only" 128GB, look for models of 116 GB or less (the absolute max that retains reasonable stability).
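That 116 GB ceiling on a 128 GB machine can be turned around to ask what quant level a given model could use. A hedged sketch: the 116 GB budget is Terretta's number above, the example parameter count is hypothetical, and the formula is the same generic params-times-bits estimate, which is an upper bound since the runtime also needs room for KV cache and activations.

```python
def max_bits_per_weight(params_billions: float, budget_gb: float = 116) -> float:
    """Highest bits-per-weight quant whose weights alone fit in budget_gb.
    Inverts size_gb = params_billions * bits / 8; treat as an upper bound,
    since KV cache and activations need memory too."""
    return budget_gb * 8 / params_billions

# Hypothetical 232B-parameter model within the 116 GB budget:
print(f"{max_bits_per_weight(232):.1f} bits/weight")
```

So on a 128 GB MacBook, models in the low-hundreds-of-billions parameter range only fit at fairly aggressive quant levels, which is consistent with the "116 GB or less" rule of thumb.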
MezzoDelCammin 3 days ago:
I think the answer to this is: "yes"
CoolThings 3 days ago:
A Beowulf cluster of 256 x Raspberry Pi 3.
hhh 2 days ago:
I used to maintain a 2,000-node Pi 4 cluster, before LLMs were relevant, with around 6 GB of free RAM per node. I wonder what I could have done with something like this.
tclancy 3 days ago:
All of it.
|
|
chid 3 days ago:
Even quantised, those are HUGE.