Aurornis 3 hours ago
The benchmarks are from the unquantized model they release. This will only run on server hardware, some workstation GPUs, or some 128GB unified memory systems. It’s a situation where if you have to ask, you can’t run the exact model they released. You have to wait for quantizations to smaller sizes, which come in a lot of varieties and have quality tradeoffs.
bityard an hour ago
This would likely run fine in just 96 GB of VRAM, by my estimation. Well within the ability of an enthusiastic hobbyist with a few thousand dollars of disposable income. Quantizations are already out: https://huggingface.co/unsloth/Qwen3.6-27B-GGUF
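For anyone who wants to sanity-check a VRAM estimate like this, here is a rough back-of-the-envelope sketch. It assumes the parameter count is 27 billion (read off the "27B" in the model name in the link) and that "unquantized" means 16 bits per weight; both are assumptions, not something stated in the thread. It counts weights only and ignores KV cache, activations, and runtime overhead, so real usage will be higher. The function name `weight_memory_gb` is just made up for illustration.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes)."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Assumption: 27 billion parameters, taken from the "27B" in the model name.
PARAMS_B = 27

# Assumption: unquantized weights are 16-bit; 8 and 4 bits are nominal
# quantization widths, and real quantized files carry some extra overhead.
for label, bits in [("16-bit (unquantized)", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(PARAMS_B, bits):.1f} GB for weights")
```

Under those assumptions the 16-bit weights alone come to roughly 54 GB, which leaves headroom within 96 GB for context, though how much you actually need depends on context length and the runtime you use.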