| ▲ | testaburger 9 days ago | parent | next [-] |
| Which specific model EPYCs? And if it's not too much to ask, which motherboard and power supply? I'm really interested in building something similar. |
| ▲ | fouc 8 days ago | parent | prev | next [-] |
| I've seen some mentions of pure-CPU setups being successful for large models, using old EPYC/Xeon workstations off eBay with 40+ cores. Interesting approach! |
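For anyone curious about the CPU-only route, here is a minimal sketch using llama-cpp-python; the model filename, thread count, and context size are placeholders to adjust for your own hardware, not values from the build discussed above:

    from llama_cpp import Llama  # pip install llama-cpp-python

    # Hypothetical GGUF file; any quantized model you have downloaded locally works.
    llm = Llama(
        model_path="models/some-large-moe-q8_0.gguf",
        n_gpu_layers=0,   # 0 = keep every layer on the CPU
        n_threads=48,     # roughly one thread per physical core
        n_ctx=8192,       # context window
    )

    out = llm("Write a hello-world program in C.", max_tokens=128)
    print(out["choices"][0]["text"])

The stock llama.cpp binaries behave the same way if you simply don't offload any layers to a GPU.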
|
| ▲ | wkat4242 9 days ago | parent | prev | next [-] |
| Wow nice!! That's a really good deal for that much hardware. How many tokens/s do you get for DeepSeek-R1? |
| ▲ | DrPhish 8 days ago | parent [-] |
| Thanks, it was a bit of a gamble at the time (lots of dodgy eBay parts), but it paid off. R1 starts at about 10 t/s on an empty context but falls off quickly; the majority of my tokens are generated at around 6 t/s. Some of the other big MoE models can be quite a bit faster. I'm mostly using QwenCoder 480B at Q8 these days, averaging 9 t/s. I've found I get better real-world results out of it than K2, R1, or GLM-4.5. |
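Those rates line up with a simple bandwidth-bound back-of-envelope, assuming roughly 35B active parameters per token for the 480B MoE and ~300 GB/s of effective memory bandwidth from a multi-channel EPYC setup (both rough guesses, not figures from the build above):

    weights read per token ≈ 35e9 params * 1 byte (Q8) ≈ 35 GB
    tokens/s ≈ 300 GB/s / 35 GB per token ≈ 8-9 t/s

Dense models of similar total size would read far more weights per token, which is part of why the big MoE models are the practical choice for CPU-only inference.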
|
|
| ▲ | ekianjo 9 days ago | parent | prev [-] |
| That's an r/localllama user right there