| ▲ | CamperBob2 2 days ago |
| If you're inferencing, you're not getting much more out of this than you would out of a ~$2K Framework desktop. Well, you're getting the ability to maintain a context bigger than 8K or so, for one thing. |
|
| ▲ | refulgentis 2 days ago | parent [-] |
| Well, no, that 8K figure is off by a factor of about 64x at the very least: a 64 GB M2 Max/M4 Max tops out at about 512K of context for 20B params, and the Framework desktop I am referencing has 128 GB of unified memory. |
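For context, the ~512K figure is straightforward to sanity-check with a back-of-envelope KV-cache calculation. Below is a minimal Python sketch; the model shape (24 layers, 8 KV heads via GQA, head dim 128, fp16 cache, 4-bit weights) and the 4 GiB runtime overhead are illustrative assumptions, not figures taken from the thread.

    # Back-of-envelope: how many tokens of KV cache fit alongside the weights.
    # All model numbers here are illustrative assumptions for a 20B-class model.
    def kv_bytes_per_token(n_layers: int, n_kv_heads: int, head_dim: int,
                           bytes_per_elem: int = 2) -> int:
        """Bytes of KV cache stored per token (K and V tensors, every layer)."""
        return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem

    def max_context_tokens(mem_gib: float, weights_gib: float,
                           per_token_bytes: int, overhead_gib: float = 4.0) -> int:
        """Tokens of KV cache that fit after weights and a fixed runtime overhead."""
        free_bytes = (mem_gib - weights_gib - overhead_gib) * 1024**3
        return int(free_bytes // per_token_bytes)

    # Assumed config: 24 layers, 8 KV heads (GQA), head_dim 128, fp16 cache.
    per_tok = kv_bytes_per_token(n_layers=24, n_kv_heads=8, head_dim=128)  # ~96 KiB/token
    weights_gib = 20e9 * 0.5 / 1024**3                                     # ~9.3 GiB at 4-bit

    for mem in (64, 128):
        ctx = max_context_tokens(mem, weights_gib, per_tok)
        print(f"{mem} GiB unified memory -> roughly {ctx // 1024}K tokens of KV cache")

Under those assumptions, 64 GiB lands around 540K tokens of cache and 128 GiB around 1.2M, which is the sort of ~64x gap over 8K being described.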
| ▲ | CamperBob2 2 days ago | parent [-] |
| What's the TTFT like on a GPU-poor rig, though, once you actually take advantage of large contexts? |
| ▲ | refulgentis 2 days ago | parent [-] |
| I guess I'd say: why is the Framework perceived as GPU-poor? I don't have one, but I also don't know why its TTFT would be significantly worse than on M-series (it's a good GPU!). |
| ▲ | CamperBob2 2 days ago | parent [-] |
| Compared to 4x RTX 6000 Blackwell boards, it's GPU-poor. There has to be a reason they want to load up a tower chassis with $35K worth of GPUs, right? I'd have to assume it has strong advantages for inference as well as training, given that the GPU has more influence on TTFT with longer contexts than the CPU does. |
| ▲ | refulgentis 2 days ago | parent [-] |
| Right - I'd suggest the idea that 128 GB of GPU RAM only gives you an 8K context shows it may be worth revising priors such as "it has strong advantages for inference as well as training." As Mr. Hildebrand used to say, when you assume, you make... (Also note the article specifically frames this speccing-out as being about training :) so it's not just me suggesting it.) |
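A rough way to quantify the TTFT point: prefill is compute-bound, so time to first token scales with prompt length and model size divided by sustained GPU throughput. The Python sketch below uses made-up throughput numbers purely for illustration; they are not measurements of any machine mentioned in this thread.

    # Rough prefill-time model: TTFT ~ prompt_tokens * 2 FLOPs/param * params / sustained FLOPS.
    # Ignores attention-score FLOPs, which grow with prompt length and make very
    # long-context prefill even more compute-hungry, so these are lower bounds.
    def ttft_seconds(prompt_tokens: int, params: float,
                     peak_tflops: float, efficiency: float = 0.5) -> float:
        """Estimated seconds to first token for a compute-bound prefill."""
        flops_needed = prompt_tokens * 2 * params
        return flops_needed / (peak_tflops * 1e12 * efficiency)

    PROMPT_TOKENS = 128_000   # a genuinely long-context prompt
    PARAMS = 20e9             # 20B dense parameters, matching the example upthread

    rigs = {
        "unified-memory iGPU, ~50 TFLOPS fp16 (assumed)": 50,
        "big discrete GPU, ~500 TFLOPS fp16 (assumed)": 500,
    }
    for name, tflops in rigs.items():
        t = ttft_seconds(PROMPT_TOKENS, PARAMS, tflops)
        print(f"{name}: TTFT ~= {t:.0f} s on a {PROMPT_TOKENS // 1000}K-token prompt")

With those assumed numbers the gap is minutes versus tens of seconds on a 128K-token prompt, which is the sense in which a rig's GPU throughput, not its memory capacity alone, sets long-context TTFT.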
|
|
|
|