2ndorderthought 3 hours ago
1T-parameter model instances (Opus, GPT, etc.) are not running on a single GPU. The catch is how the cards communicate and how the model is split across them. There's a fair bit that goes into it, but the answer is yes: the more GPUs you have, the bigger the model you can run.
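To put rough numbers on "more GPUs, bigger model", here is a back-of-envelope sketch that counts how many cards are needed just to hold the weights. Everything in it is an assumption for illustration: the helper name `min_gpus_for_weights`, the 1T parameter count, 16-bit weights, the 80 GiB and 24 GiB card sizes, and the 80% usable-memory fraction. It ignores activations, KV cache, and communication overhead, so real deployments need more headroom than this.

```python
import math


def min_gpus_for_weights(n_params, bytes_per_param, gpu_mem_gib, usable_fraction=0.8):
    """Lower bound on the GPU count needed to hold the weights alone.

    Hypothetical helper: assumes the weights are split evenly across cards
    and that only `usable_fraction` of each card's memory is available
    for weights.
    """
    weight_bytes = n_params * bytes_per_param
    usable_bytes_per_gpu = gpu_mem_gib * 1024**3 * usable_fraction
    return math.ceil(weight_bytes / usable_bytes_per_gpu)


if __name__ == "__main__":
    one_trillion = 10**12  # assumed parameter count
    # Assumed 16-bit weights (2 bytes per parameter).
    print("80 GiB cards:", min_gpus_for_weights(one_trillion, 2, 80))
    print("24 GiB cards:", min_gpus_for_weights(one_trillion, 2, 24))
```

The count only says the weights fit. Whether the result runs at a useful speed depends on how fast the cards can exchange data, which is the "how the cards communicate" part.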
ryandrake 2 hours ago
Really cool. I'm very much still learning about this stuff. Sounds like this inter-GPU communication is a feature of special hardware (not consumer GPUs).