freediddy 2 hours ago

In the last year, I have bought an M3 Ultra Mac Studio with 512 GB, a MacBook Pro M5 Max with 128 GB and an RTX 6000 Pro. I have spent around $25k so far, not including electricity. I figured that in the worst case I can sell them in the next year and only take a haircut, as opposed to losing my entire investment.

Compared to just paying for tokens, the tokens would have been much cheaper and much, much faster. I've been running Gemma4:31b and Qwen3.5 and 3.6, getting local LLMs to solve AMC 8/10 math questions, and it's about 10-100x slower than just doing it online. When I tried it with ChatGPT late last year, it took about one night and $25 to solve about 1000 questions. Using my RTX 6000 and M3 Ultra with Gemma4:31b on both, it answered about 40 questions in 7 hours while drawing about 800 watts (600 for the RTX and 200 for the M3 Ultra), and I haven't checked how good the answers are yet.
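A rough back-of-the-envelope sketch of the numbers in this comment, taken at face value. The electricity price is an assumption that does not come from the thread, and the function name is made up for illustration; hardware cost is deliberately left out.

```python
# Figures quoted in the comment above.
LOCAL_QUESTIONS = 40
LOCAL_HOURS = 7
LOCAL_WATTS = 800            # 600 W (RTX 6000) + 200 W (M3 Ultra)
API_QUESTIONS = 1000
API_COST_USD = 25.0

# ASSUMPTION (not from the thread): price of electricity in USD per kWh.
ASSUMED_USD_PER_KWH = 0.15


def local_run_stats(questions, hours, watts, usd_per_kwh):
    """Per-question time, energy and electricity cost for one local run."""
    kwh_total = watts * hours / 1000
    return {
        "minutes_per_question": hours * 60 / questions,
        "kwh_total": kwh_total,
        "kwh_per_question": kwh_total / questions,
        "electricity_usd_per_question": kwh_total * usd_per_kwh / questions,
    }


stats = local_run_stats(LOCAL_QUESTIONS, LOCAL_HOURS, LOCAL_WATTS, ASSUMED_USD_PER_KWH)
api_usd_per_question = API_COST_USD / API_QUESTIONS

for name, value in stats.items():
    print(f"{name}: {value:.3f}")
print(f"api_usd_per_question: {api_usd_per_question:.3f}")
```

With these inputs the local run works out to 10.5 minutes and 0.14 kWh per question (5.6 kWh total). Under the assumed $0.15/kWh, electricity alone is about $0.021 per question, close to the $0.025 per question the ChatGPT run cost.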

At the very least I'm going to try to sell my M3 Ultra if I can find a reliable place to sell it without getting ripped off by scammers.

jon-wood 2 hours ago | parent | next [-]

I’m not usually one to ask this because learning to do a thing can be fun, but why exactly have you spent 25 thousand dollars on getting an LLM someone else made to answer maths exam questions?

nickthegreek 2 hours ago | parent | next [-]

The cost is obviously not as big a factor for OP as it might be for others. It's actually refreshing to hear the candid viewpoint that he expresses here.

freediddy an hour ago | parent [-]

$25k is definitely a lot, but I did the risk analysis and figured that in the worst case I would lose $1,000-2,000 after a year of playing around with it, so I look at it more like renting (I'm going to keep the MacBook Pro no matter what since I needed a new one).

cronin101 an hour ago | parent [-]

Nitpicking, but the worst case of spending $25k is unforeseen circumstances that write off the entire asset. I don’t think -$2000 is a conservative enough figure for standard depreciation either (a lot can happen in a year)

hnuser123456 2 hours ago | parent | prev | next [-]

Privacy and offline operation are valuable or non-negotiable in some cases, but the difference is pretty categorical between what can run on a single card and what can run on a DGX GB200 NVL72 cabinet. Doesn't mean it's not worth seeing how far local models can be pushed. Not every problem needs a senior engineer.

freediddy 2 hours ago | parent | prev | next [-]

It's just a project I'm working on. I'm working on projects where AIs are processing and classifying large amounts of data that would be a lot of work for humans to do.

wutwutwat an hour ago | parent [-]

I think of LLMs as being well equipped for handling dynamic data or adapting to unforeseen circumstances (random code requests, websites' ever-changing layouts, typos, non-standard formatting in docs, grokking important info, etc.), but math problems are by definition a very specific set of instructions to run, so is the overhead and "thinking" aspect of an LLM/AI even needed here? I'm genuinely curious, btw, I'm not asking sarcastically. Can't these math problems just be yanked from some test file and rapid fired directly at a gpu/compute unit?

freediddy an hour ago | parent [-]

> Can't these math problems just be yanked from some test file and rapid fired directly at a gpu/compute unit?

Yes, this is exactly what I'm doing. I isolated the actual math question and then sent it to my two servers to process, and that's what's taking 10m+ to return. I'm asking them to solve the question and return the full answer along with their steps. I care about correctness so taking time is okay, but I can't use 10m per solution.

Retric 2 hours ago | parent | prev [-]

That hardware is costing him ~$1/hour over 3 years. Presumably having it answer math questions was a tiny fraction of what he was using it for.
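For reference, the ~$1/hour figure follows from spreading the $25k purchase over every hour of three years. A minimal sketch, assuming the hardware runs around the clock and ignoring electricity; the `resale_usd` parameter is a hypothetical addition for plugging in a resale value, not something from the comment.

```python
HARDWARE_USD = 25_000        # total spend quoted upthread
YEARS = 3                    # amortization period from the comment above
HOURS_PER_YEAR = 365 * 24


def amortized_usd_per_hour(price_usd, years, resale_usd=0.0):
    """Net hardware cost spread evenly over every hour of the period."""
    return (price_usd - resale_usd) / (years * HOURS_PER_YEAR)


hourly = amortized_usd_per_hour(HARDWARE_USD, YEARS)
print(f"${hourly:.2f}/hour")
```

This comes to roughly $0.95/hour with no resale value. Any resale value at the end lowers it, and any hours the machines sit idle raise the effective cost per hour of use.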

LarsDu88 41 minutes ago | parent | prev | next [-]

Well if it makes you feel better those frontier LLMs are all technically taking a big loss, and they may all be in your shoes after a few years.

plasticsoprano an hour ago | parent | prev | next [-]

You'll probably make a profit by selling them today. I bought an M1 Max Studio with 64 GB last year off FB Marketplace for $1000 and today I'm seeing numerous 32 GB M1 Maxes for $1200-1500.

freediddy an hour ago | parent [-]

Yes the prices on eBay for the Mac Studio are all over the place, but I've seen sales for over $20k. I don't know if I believe it but there's enough to make me think if I can sell it for that price it would be worth it, but eBay has basically no seller protection so I'm not willing to take that chance.

arjie 2 hours ago | parent | prev | next [-]

All of these have appreciated in value. How much are you looking for the Ultra?

freediddy an hour ago | parent [-]

I've seen a lot of sales on eBay for over $20k, but I don't know if I believe it. Plus the lack of seller protection and the prevalence of scams on eBay make me too hesitant to actually want to risk it so I don't know what to do haha

arjie an hour ago | parent [-]

Haha, yeah, it's about $23k or so. Should be twice the price you bought it for if you got it last year. Tbh I don't know why. The RAM is large but the bandwidth and the compute aren't nearly enough. You can fit DeepSeek V3 on it quantized but inference is like 10 tok/s. Honestly, you'll be able to sell it locally for that in cash, and I would in your place.

I saw your heat comments about the RTX 6000 Pro as well. I bought a few of them recently and I'm running 2 of them in a 2U case in a colo. You need a lot of active airflow to keep them cool. Mine range from 23 C to 80 C.

bethekind 2 hours ago | parent | prev | next [-]

Which of these has been the most productive for you? Sounds like you've enjoyed the RTX 6000 the most?

freediddy an hour ago | parent [-]

The RTX 6000 is somewhat obviously my fastest card, but my biggest problem with the RTX 6000 is the immense heat. The GPU itself is almost 200F and the exhaust from the fans is over 150F. I'm worried that my hard drives are going to fail. I was told that the GDDR7 is even hotter than the GPU, which is surprising to me.

After my last run, I'm going to wait for the new case I ordered to come in and cannibalize my kid's PC that we built at the beginning of this year to form an entirely separate computer. And then figure out better ways to deal with the heat, especially with summer coming up. I'll have to play around with undervolting and running vents directly outside my house to see if that helps.
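Since this thread mixes Fahrenheit (here) and Celsius (arjie's 23 C to 80 C range), a small conversion sketch makes the figures comparable. The readings are the approximate ones quoted in the comments, not measurements.

```python
def f_to_c(deg_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (deg_f - 32) * 5 / 9


def c_to_f(deg_c):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return deg_c * 9 / 5 + 32


# Approximate readings quoted in the thread.
print(f"GPU ~200F     = {f_to_c(200):.1f} C")
print(f"Exhaust ~150F = {f_to_c(150):.1f} C")
print(f"Colo max 80 C = {c_to_f(80):.1f} F")
```

So "almost 200F" is a bit over 93 C, noticeably hotter than the 80 C peak reported for the colo setup with heavy active airflow.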

vladgur an hour ago | parent | next [-]

From my failed and expensive affair with GPU mining 5 years ago: you can get great heat dissipation by using an open case with a lot of directed fans, at the expense of a bit of dust and lots of noise.

ericd 39 minutes ago | parent | prev [-]

I take it this wasn't the half-wattage Max-Q version with the blower fan?

iooi an hour ago | parent | prev | next [-]

I'll buy your macbook if you're trying to get rid of it!

freediddy 28 minutes ago | parent [-]

I'm keeping that one for sure, I love it!

jmyeet an hour ago | parent | prev | next [-]

I looked into the M3 Ultra 512GB Mac Studio before it was discontinued and, as best as I could determine, it just wasn't worth it... yet. The GFLOPS and memory bandwidth just aren't there even though it can hold a much larger model in memory.

But the trend here is interesting. I think by 2030 you'll be able to buy, fairly cheaply, hardware that currently costs $10k+. I don't know what this does to the trillions invested in AI data centers because the next NVIDIA architecture after Blackwell will essentially halve the value of purchased cards overnight.

I'm not convinced Apple has yet pivoted the Mac Studio line towards this market, and the expected M5 Ultras in Q3 2026 will likely be an incremental improvement rather than a big leap forward, but I'd like to be proven wrong.

freediddy an hour ago | parent [-]

I agree that all these datacenter companies like CoreWeave are investing billions in technology that has a very fast depreciation curve and I don't know how they will sustain income. The same goes for datacenters in space: what happens when those chips are obsolete? Will they send astronauts to replace them or will they let them burn up and send new ones into orbit every year?

I feel that the open-weight models pale in comparison to the frontier models, and I believe that if the gap closes quickly, the open-weight vendors will stop releasing them for free.

CamperBob2 2 hours ago | parent | prev | next [-]

How do you use the RTX 6000 with the Macs? Exo? I would think that would be pretty snappy if configured properly.

freediddy 2 hours ago | parent [-]

This is on a separate Windows PC, I don't have it integrated with the Macs.

wslh an hour ago | parent | prev [-]

[flagged]