▲ A GPU Calculator That Helps Calculate What GPU to Use (calculator.inference.ai)
107 points by chlobunnee 3 days ago | 33 comments
▲ zargon 3 days ago | parent | next [-]
The best VRAM calculator I have found is https://apxml.com/tools/vram-calculator. It is much more thorough than this one: for example, it understands different models' attention schemes for correct KV cache size calculation, and it supports quantization of both the model and the KV cache, as well as fine-tuning. It has its own limitations, such as only supporting specific models. In practice, though, the generic calculators are not very useful, because model architectures vary (mainly in the KV cache) and the generic estimates end up way off. (Not sure whether it would be better to discuss it separately, but I submitted it at https://news.ycombinator.com/item?id=44677409)
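The KV cache point is worth spelling out: cache size scales with the number of KV heads the architecture actually keeps, so a generic calculator that assumes full multi-head attention can overestimate by nearly an order of magnitude. A minimal sketch (the Llama-3-70B-style figures are illustrative assumptions):

```python
# Rough KV cache size: 2 (K and V) * layers * kv_heads * head_dim
# * context length * bytes per element (2 for FP16/BF16).
def kv_cache_bytes(layers, kv_heads, head_dim, context_len, dtype_bytes=2):
    return 2 * layers * kv_heads * head_dim * context_len * dtype_bytes

# Llama-3-70B-style config: 80 layers, head_dim 128, 8K context.
gqa = kv_cache_bytes(80, 8, 128, 8192)    # GQA: 8 KV heads
mha = kv_cache_bytes(80, 64, 128, 8192)   # if it used full MHA: 64 KV heads
print(gqa / 2**30, mha / 2**30)  # → 2.5 20.0 (GiB): an 8x difference
```

Same parameter count either way; only the attention scheme changes, which is exactly what a one-size-fits-all formula misses.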
▲ kouteiheika 2 days ago | parent | prev | next [-]
The training memory breakdown is wildly inaccurate:

- No one trains big models in FP32 anymore.
- Gradients can also often be in BF16, and they don't actually have to be stored at all if you're not using gradient accumulation, or if you're accumulating them directly in the optimizer's state.
- 32-bit Adam is silly; if you don't have infinite VRAM there's no reason not to use 8-bit Adam (or you can go even lower with quantized Muon).
- Activations take up memory too, but are not mentioned.

It shows that to train a 3.77B parameter model I need 62GB of VRAM. Just to give you some perspective on how overestimated this is: a few weeks back I was training (full fine-tuning, not LoRA) a 14B parameter model on 24GB of VRAM, using every trick in the book to lower VRAM usage. To be fair, not all of those tricks are available in publicly available training harnesses, but the point still stands: even with an off-the-shelf training harness you can do a lot better than what this calculator suggests.
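A back-of-envelope version of this comment's math, counting only weights + gradients + optimizer state (activations deliberately ignored); the bytes-per-parameter figures are standard assumptions for these precision choices, not anything taken from the calculator itself:

```python
# Per-parameter memory: weight bytes + gradient bytes + optimizer-state bytes.
def train_bytes(params, weight_b, grad_b, optim_b):
    return params * (weight_b + grad_b + optim_b)

P = 3.77e9  # the 3.77B-parameter example from the comment
# FP32 weights + FP32 grads + 32-bit Adam (two states): 4 + 4 + 8 bytes/param.
classic = train_bytes(P, 4, 4, 8)  # ~60 GB, right around the calculator's 62GB
# BF16 weights + BF16 grads + 8-bit Adam (two 1-byte states): 2 + 2 + 2.
lean = train_bytes(P, 2, 2, 2)     # ~23 GB, before any further tricks
print(classic / 1e9, lean / 1e9)
```

Switching precision alone cuts the estimate by more than half, which is why a calculator hard-wired to the FP32 + 32-bit Adam layout overshoots so badly.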
▲ funfunfunction 3 days ago | parent | prev | next [-]
This is a cheap marketing ploy for a GPU reseller with billboards on highway 101 into SF.
▲ mdaniel 2 days ago | parent | prev | next [-]
> 0 Model Available

Who in the world is expected to populate 11 select/text fields with their favorite model data points they just happen to have lying around, only to see an absolutely meaningless "295% Inference" outcome?

What a dumpster.
▲ LorenDB 3 days ago | parent | prev | next [-]
Where's AMD support? I have a 9070 XT and would love to see it listed on here.
▲ snvzz 3 days ago | parent | prev | next [-]
Rather than a GPU calculator, this is an NVIDIA calculator.
▲ amanzi 3 days ago | parent | prev | next [-]
I would have liked to see the RTX 5060 Ti with 16GB mentioned. I can't tell whether it's omitted because it won't work, or excluded for some other reason.
▲ nottorp 2 days ago | parent | prev | next [-]
What GPU to use for what? Witcher 4? Death Stranding?
▲ chlobunnee 3 days ago | parent | prev | next [-]
I built a calculator to help researchers and engineers pick the right GPUs for training and inference workloads! It compares GPU options by taking in simple parameters (# of transformer layers, token size, etc.) and letting users know which GPUs are compatible, plus their efficiency for training vs. inference. The idea came from talking with ML researchers frustrated by slow cluster queues or by wasting money on overkill GPUs. I'd love feedback on what you feel is missing or confusing! Some things I'm thinking about incorporating next:

> Allowing users to directly compare 2 GPUs and their specs

> Allowing users to see whether a fraction of the GPU can complete their workload

I would really appreciate your thoughts/feedback! Thanks!
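At its core, the compatibility question such a calculator answers reduces to a small amount of arithmetic. A hedged sketch of one plausible version of that check; the overhead figure and the formula itself are assumptions for illustration, not what the site actually computes:

```python
# Hypothetical "does it fit" check: model weights + KV cache + runtime
# overhead must leave the total under the card's VRAM.
# The 1.5 GB overhead default is an assumed figure, not measured.
def fits(vram_gb, params, bytes_per_param, kv_cache_gb, overhead_gb=1.5):
    needed_gb = params * bytes_per_param / 1e9 + kv_cache_gb + overhead_gb
    return needed_gb <= vram_gb

print(fits(24, 7e9, 2, 2))    # 7B at FP16 on a 24 GB card → True
print(fits(24, 70e9, 2, 5))   # 70B at FP16 on the same card → False
```

The interesting part (and the hard part, per the other comments here) is filling in `kv_cache_gb` correctly for each architecture rather than from a generic formula.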
▲ daft_pink 2 days ago | parent | prev | next [-]
It would be really nice if you could import the standard models, so we could see what kind of GPU we would need for the popular models in the news and on Hugging Face.
▲ amelius 2 days ago | parent | prev | next [-]
I selected Llama 3 70B, and then it said all the GPUs are insufficient for training :(
▲ amstan 2 days ago | parent | prev | next [-]
You're missing all the AMD stuff: I can run a quantized DeepSeek R1 671B on 4 Framework desktops, yet it's "insufficient" for 10 Nvidia GPUs.
▲ timothyduong 3 days ago | parent | prev | next [-]
Where's the 3090? Or should that fall into the 4090 (24GB VRAM) category?
▲ jjmarr 2 days ago | parent | prev | next [-]
AMD support?
▲ quotemstr 3 days ago | parent | prev | next [-]
No sharding? At all?
▲ alkonaut 2 days ago | parent | prev [-]
Save you a click: it's about AI.