MuffinFlavored 7 hours ago
> Running DeepSeek V3 (685B) requires 8×H100 GPUs which is about $14k/month. Most developers only need 15-25 tok/s.
> deepseek-v3.2-685b, $40/mo/slot for ~20 tok/s, 465 slots total
> 465 users × 20 tok/s = 9,300 tok/s needed
> The node peaks at ~3,000 tok/s total. So at full capacity they can really only serve:
> 3,000 ÷ 20 = 150 concurrent users at 20 tok/s
> That's only 32% of the cohort being active simultaneously.
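For anyone who wants to check the arithmetic, here is a quick sketch in Python. The numbers are the ones quoted in the comment; the variable names are just illustrative, and it assumes every active user streams at exactly 20 tok/s.

```python
# Figures taken from the comment above; names are illustrative only.
slots = 465              # total slots sold
per_user_tok_s = 20      # advertised throughput per slot (tok/s)
node_peak_tok_s = 3000   # quoted peak throughput of the node (tok/s)

# Throughput needed if every slot were active at once.
demand_tok_s = slots * per_user_tok_s

# Users the node can serve at the advertised per-user rate.
max_concurrent = node_peak_tok_s // per_user_tok_s

# Share of the cohort that can be active at the same time.
active_share = max_concurrent / slots

print(f"demand if all active: {demand_tok_s} tok/s")
print(f"max concurrent users: {max_concurrent}")
print(f"share of cohort:      {active_share:.0%}")
```

So the node is oversubscribed by roughly 3x if everyone shows up at once.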
artificialprint 7 hours ago
People presumably work 8 hours a day, so I guess they are banking on not everyone being active at once.
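A rough sketch of that bet, under two assumptions that are not in the thread: users are spread evenly around the clock (e.g. across time zones), and each one generates continuously for their whole 8 hours. Real usage is almost certainly burstier in some ways and idler in others, so treat this as a back-of-envelope check only.

```python
# Slot count and capacity are from the parent comment; the even spread
# over 24 hours is an assumption made purely for illustration.
slots = 465
work_hours = 8
max_concurrent = 3000 // 20   # node peak / per-user rate = 150 users

# Expected number of users active at any moment under the even-spread assumption.
expected_active = slots * work_hours / 24

print(f"expected active users: {expected_active:.0f}")
print(f"node capacity:         {max_concurrent}")
print(f"within capacity:       {expected_active <= max_concurrent}")
```

Under those assumptions the expected load lands right around the node's limit, so the margin is thin if usage clusters in one time zone.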