refulgentis 2 days ago

It's been such a mind-boggling decline in intellect, combined with really odd and intense conspiratorial behavior around crypto, that I dug into it a bit a few months ago.

My weak, uncited understanding from then is that they're poorly positioned: in our set they're still the guys who write you a big check for software, but in the VC set they're a joke, i.e. they mistook carpet-bombing investment for something that scales and went all in on way too many crypto firms. Now they've embarrassed themselves with a ton of assets that need to get marked down; they're clearly behind the other big firms, but there's no forcing function to actually do the markdowns.

So we get primal screams about politics and LLM-generated articles about how a $9K video card is the perfect blend of price and performance.

There are other comments effusively praising their unique technical expertise. I maintain a llama.cpp client on every platform you can think of, and nothing in this article makes any sense. If you're training, you wouldn't do it on only four $9K GPUs that you own. If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.
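(Rough back-of-envelope on the memory-per-dollar angle; the 96 GB per RTX PRO 6000 Blackwell is my assumption from public specs, and the prices are the approximate figures used in this thread, so treat it as a sketch rather than a benchmark.)

    # Sketch: GPU-memory-per-dollar for the two setups being compared.
    # 96 GB per RTX PRO 6000 Blackwell is an assumption from public specs;
    # prices are the rough numbers thrown around in this thread.
    setups = {
        "4x RTX PRO 6000 (~$36K)": (4 * 96, 36_000),
        "Framework Desktop (~$2K)": (128, 2_000),
    }
    for name, (mem_gb, price_usd) in setups.items():
        print(f"{name}: {mem_gb} GB, ~${price_usd / mem_gb:.0f}/GB")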

NitpickLawyer 2 days ago | parent | next [-]

> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

I was with you up till here. Come on! CPU inferencing is not it; even Macs struggle with bigger models and longer contexts (especially visible when agentic stuff gets past 32k tokens).

The RTX PRO 6000 is the first GPU from their "workstation" series that actually makes sense to own.

refulgentis 2 days ago | parent [-]

Er, CPU inferencing? :) I don't think I mentioned that!

The point of the Framework Desktop is that its memory is unified with the GPU, so, much like an M-series Mac, you can run inference on disproportionately large models.
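(A crude sketch of why that works, assuming ~0.5 bytes/param for 4-bit quantized weights plus some runtime overhead and ignoring KV cache; the numbers are my assumptions, not from the article.)

    # Which model sizes plausibly fit in 128 GB of unified memory at Q4?
    # Assumptions: ~0.5 bytes/param for 4-bit weights, ~20% runtime overhead,
    # KV cache ignored. Illustrative only.
    def fits(params_b: float, mem_gb: float, bytes_per_param: float = 0.5) -> bool:
        weights_gb = params_b * bytes_per_param  # billions of params -> GB
        return weights_gb * 1.2 <= mem_gb

    for params_b in (20, 70, 120, 235):
        print(f"{params_b}B params in 128 GB unified memory: {fits(params_b, 128)}")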

CamperBob2 2 days ago | parent | prev [-]

> If you're inferencing, you're not getting much more out of this than you would a ~$2K Framework desktop.

Well, you're getting the ability to maintain a context bigger than 8K or so, for one thing.

refulgentis 2 days ago | parent [-]

Well, no; we're off by a factor of about 64x at the very least: a 64 GB M2 Max/M4 Max tops out at about 512K context for 20B params, and the Framework desktop I'm referencing has 128 GB of unified memory.
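(Back-of-envelope KV-cache math behind that: per-token cache is roughly 2 * layers * kv_heads * head_dim * bytes, and max context is whatever memory is left after the weights, divided by that. The model dimensions below are illustrative assumptions for a ~20B-class model, not the specs of any particular one.)

    # Rough max-context estimate once the weights are resident.
    # Dimensions are assumed for a ~20B-class model; fp16 KV cache.
    def max_context_tokens(mem_gb, weights_gb, layers=24, kv_heads=8,
                           head_dim=128, bytes_per=2):
        per_token = 2 * layers * kv_heads * head_dim * bytes_per  # K + V per token
        free_bytes = (mem_gb - weights_gb) * 1e9
        return int(free_bytes // per_token)

    print(max_context_tokens(mem_gb=64, weights_gb=12))   # ~530K tokens
    print(max_context_tokens(mem_gb=128, weights_gb=12))  # ~1.2M tokens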

CamperBob2 2 days ago | parent [-]

What's the TTFT like on a GPU-poor rig, though, once you actually take advantage of large contexts?

refulgentis 2 days ago | parent [-]

I guess I'd say: why is the Framework perceived as GPU-poor? I don't have one, but I also don't know why TTFT would be significantly worse than on an M-series (it's a good GPU!)

CamperBob2 2 days ago | parent [-]

Compared to 4x RTX 6000 Blackwell boards, it's GPU poor. There has to be a reason they want to load up a tower chassis with $35K worth of GPUs, right? I'd have to assume it has strong advantages for inference as well as training, given that the GPU has more influence on TTFT with longer contexts than the CPU does.
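(FWIW, to first order TTFT is prompt length divided by prefill throughput, and prefill is compute-bound, which is where a big discrete GPU should pull ahead. The throughput numbers below are placeholders I made up to show the shape of the tradeoff, not benchmarks.)

    # Crude TTFT model: time-to-first-token ~= prompt_tokens / prefill_tok_per_s.
    # Throughputs are illustrative placeholders, not measurements.
    def ttft_seconds(prompt_tokens: int, prefill_tok_per_s: float) -> float:
        return prompt_tokens / prefill_tok_per_s

    for label, tps in (("big discrete GPU (assumed)", 8_000),
                       ("unified-memory APU (assumed)", 800)):
        print(f"{label}: {ttft_seconds(64_000, tps):.0f} s for a 64k-token prompt")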

refulgentis 2 days ago | parent [-]

Right. I'd suggest that the idea that 128 GB of GPU RAM only gets you an 8K context shows it may be worth revising priors such as "it has strong advantages for inference as well as training."

As Mr. Hildebrand used to say, when you assume, you make...

(Also note the article itself specifically frames this speccing-out as being about training :) it's not just me suggesting it.)