| ▲ | alasdair_ 11 hours ago |
I just don’t believe this can run inference on a 120-billion-parameter model at actually useful speeds. Obviously any Turing machine can run a model of any size, so the “120B” claim doesn’t mean much on its own - what actually matters is speed, and I don’t believe this can be fast enough on models that my $5,000 5090-based PC is already too slow for and lacks the VRAM for.
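
For a rough sense of why speed is the binding constraint: autoregressive decoding is memory-bandwidth bound, since every generated token has to stream all active weights through memory once, so tokens/sec is roughly bandwidth divided by weight bytes. A minimal sketch with illustrative numbers - the efficiency factor and the 4-bit overhead figure are assumptions, not specs from any product page:

    # Back-of-envelope decode throughput: decoding is memory-bandwidth
    # bound, so tokens/sec ~= usable bandwidth / bytes of model weights.

    def decode_tokens_per_sec(params_b: float, bytes_per_param: float,
                              bandwidth_gbs: float,
                              efficiency: float = 0.6) -> float:
        """Rough upper bound on dense-decode tokens/sec.

        params_b        -- model size in billions of parameters
        bytes_per_param -- 2.0 for fp16, ~0.55 for 4-bit incl. overhead
        bandwidth_gbs   -- aggregate memory bandwidth in GB/s
        efficiency      -- assumed fraction of peak bandwidth achieved
        """
        model_gb = params_b * bytes_per_param
        return bandwidth_gbs * efficiency / model_gb

    # Illustrative: a 5090 has ~1.8 TB/s of GDDR7 bandwidth but only
    # 32 GB of VRAM, so 120B weights can't even be resident at fp16.
    print(decode_tokens_per_sec(120, 2.0, 1800))   # fp16: ~4.5 tok/s
    print(decode_tokens_per_sec(120, 0.55, 1800))  # 4-bit: ~16 tok/s

By this arithmetic, even ~1.8 TB/s caps out in the teens of tokens/sec at 4-bit, and anything slower only gets worse.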
|
| ▲ | mnkyprskbd 11 hours ago | parent [-] |
Look at the GPU and RAM specs; 120B seems workable.
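
As a back-of-envelope check on “workable”: the gating question is whether the weights fit in memory at all. A quick footprint sketch - the exact GPU/RAM split of the box isn’t quoted in this thread, and the per-parameter byte counts and KV-cache headroom are assumptions about common precisions:

    # Rough weight footprints for a 120B-parameter model at common
    # precisions, plus loose headroom for KV cache and activations.

    PARAMS = 120e9
    BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "4-bit (+overhead)": 0.55}

    for name, bpp in BYTES_PER_PARAM.items():
        weights_gb = PARAMS * bpp / 1e9
        # ~10% headroom at modest context lengths is an assumption
        print(f"{name:>18}: {weights_gb:5.0f} GB weights, "
              f"~{weights_gb * 1.1:5.0f} GB with headroom")

So around 66-73 GB at 4-bit: feasible if the box has that much fast memory, far out of reach of a single 32 GB consumer card.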
| ▲ | Aurornis 11 hours ago | parent [-] |
For the red v2? 120B could run, but I wouldn't want to be the person who had to use it for anything. To be fair, the 120B claim doesn't appear on the webpage. I don't know where it came from, other than the person who submitted this to HN.
| ▲ | mnkyprskbd 11 hours ago | parent [-] |
It is more than fair. Also, you're comparing your $5k device to $12k and, more importantly, $65k and >$10M devices.
| ▲ | Aurornis 10 hours ago | parent [-] |
The "to be fair" part of my comment was saying that the tinygrad website doesn't claim 120B. Also, nobody is comparing this box to a $10M NVIDIA rack-scale deployment. They're comparing it to putting all of the same parts into their Newegg basket and assembling it themselves.
|
|
|