kgeist 5 hours ago
We're planning to do the same thing: buy something like 8x H100 and run all our coding workloads on it. The CTO has almost agreed to find the budget, but I need to make sure there are no risks before we buy (i.e. that it's a viable, usable setup for professional AI-assisted coding). Can you share which models you run and find best-performing on this setup? That would help a lot. I already run a smaller AI server in the office, but only 32B models fit on it. I already have experience optimizing inference; I'm just interested in which models you think are great for coding on 8x H100, and I'll figure out the details of how to fit them :)
dools 15 minutes ago
Check out Verda: you can rent whatever super-powerful GPU cluster you need in 10-minute increments. Deploy any open-weight model using SGLang and away you go.
htrp 35 minutes ago
8x H100 80GB doesn't give you enough memory to run the latest 1T+ parameter models (especially at the context-window lengths needed to be competitive with the frontier models).
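To put rough numbers on that claim, here is a back-of-envelope sketch (illustrative arithmetic only, ignoring KV cache and activation overhead, which make the squeeze worse at long context):

```python
def weight_vram_gb(params_b: float, bytes_per_param: float) -> float:
    """Weights-only memory in GB for a model with params_b billion parameters.

    1e9 params * bytes_per_param / 1e9 bytes-per-GB simplifies to a product.
    """
    return params_b * bytes_per_param

# 8x H100 80GB = 640 GB of total VRAM across the cluster.
cluster_gb = 8 * 80

# A ~1T-parameter model at FP8 (1 byte/param) needs ~1000 GB for weights
# alone, already over the 640 GB budget before any KV cache for long
# contexts is accounted for.
print(weight_vram_gb(1000, 1.0))  # 1000.0
print(cluster_gb)                 # 640
```

This is why the 1T+ class models generally need aggressive quantization (e.g. ~4-bit) or more nodes to fit, and why headroom for long-context KV cache disappears first.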
Havoc 2 hours ago
DeepSeek, GLM, MiniMax, or Kimi are the most likely contenders.