colechristensen 2 days ago

An NVIDIA DGX Spark is $4000; pair that with a relatively cheap second box to run GitLab in the corner and you would have a pretty good local AI inference setup. (You'd probably have to write a nontrivial amount of software to get your setup where you want it.)

The local models are right on the edge of being really useful. There's a tipping point where accuracy gets high enough that getting things done is easy, versus models continuously getting stuck. We're in the neighborhood.

Alternatively, just run GitLab locally and use one of the many APIs; those are much more stable than GitHub. Honestly, just get yourself a Claude subscription.

smcleod 2 days ago | parent | next [-]

The DGX Spark is not good for inference, though: it's very bandwidth limited, around the same as a lower-end MacBook Pro. You're much better off with Apple silicon for performance and memory size at the moment, but I'd recommend holding off until the M5 Max comes out early in the year, as the M5 has vastly superior performance to any other Apple silicon chip thanks to its matmul instruction set.
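
To make the bandwidth point concrete, here is a rough back-of-envelope sketch. It assumes a dense model at batch size 1, where generating each token means streaming all the weights from memory once, so decode speed is capped at roughly bandwidth divided by model size. The function name and the numbers in the loop are made up for illustration; they are not measured specs for any machine mentioned here.

```python
def decode_tokens_per_second(bandwidth_gb_s: float, model_size_gb: float) -> float:
    """Rough upper bound on decode speed for a bandwidth-bound dense model.

    Assumption: every generated token reads all weights once, so
    tokens/s <= memory bandwidth / size of the weights in memory.
    Real throughput is lower (KV cache reads, overhead, etc.).
    """
    if bandwidth_gb_s <= 0 or model_size_gb <= 0:
        raise ValueError("bandwidth and model size must be positive")
    return bandwidth_gb_s / model_size_gb


if __name__ == "__main__":
    # Hypothetical figures, chosen only to show the scaling.
    model_size_gb = 40.0
    for bandwidth_gb_s in (200.0, 400.0, 800.0):
        ceiling = decode_tokens_per_second(bandwidth_gb_s, model_size_gb)
        print(f"{bandwidth_gb_s:.0f} GB/s -> at most ~{ceiling:.0f} tokens/s")
```

Doubling memory bandwidth doubles this ceiling, which is why two boxes with similar memory capacity can feel very different for token generation. Prompt processing is a separate, compute-bound story.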

llbbdd 2 days ago | parent [-]

Oof, I was already considering an upgrade from the M1 but was hoping I couldn't be convinced to go for the top of the line. Is the performance jump from the M# -> M# Max chips that substantial?

smcleod a day ago | parent | next [-]

The main jump is from anything to M5. That's not simply because it's the latest, but because it has matmul instructions similar to a CUDA GPU, which fixes the slow prompt processing on all previous-generation Apple silicon chips.

baby_souffle a day ago | parent | prev [-]

> Is the performance jump from the M# -> M# Max chips that substantial

From M1? Yes, absolutely. M3 is marginal now, but M5 will probably make it definite.

llbbdd 2 days ago | parent | prev [-]

I can't say I'm not tempted looking at the Spark, I could probably save some cash on heating my house with that thing. Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.

Adding Claude to my rotation is starting to look like the option with the least amount of building the universe from scratch. I have to imagine it can be used in a similar or identical workflow to the Copilot one where it can create PRs and make adjustments in response to feedback etc.

colechristensen 2 days ago | parent [-]

>Though yeah unless there's some good software already built around a similar LLM workflow I could use it'd probably be wasted on me, or spend its time desperately trying to pay for itself with crypto mining.

A big part of my success using LLMs to build software is building the tools to use LLMs, with the LLMs themselves making that tool building easy (and possible).

llbbdd 2 days ago | parent [-]

I tried this for a little while and couldn't really get passionate about it; I have too many other backlogged projects that I was eager to tear into with LLMs and I got impatient. That was a while ago though and the ROI for building my own tools has probably gotten a lot more attractive.

colechristensen 2 days ago | parent [-]

I started building my own tool set because I was doing too many projects with LLMs and getting frustrated. There was a very real need for organization and tooling to get repetitive, meaningless tasks out of the way and to get all of my projects organized so I could see what was going on.

llbbdd 2 days ago | parent [-]

I'm convinced. :) I've got some time to kill in transit later today, maybe time to think about my setup a bit.