| ▲ | richwater 5 days ago |
| If you're okay with lower-quality output, a $10k Mac Studio will get you there. But you _will_ have to accept lower-quality outputs compared to today's frontier models. |
|
| ▲ | OtherShrezzing 5 days ago | parent | next [-] |
| >But you _will_ have to accept lower-quality outputs compared to today's frontier models. I'm curious how much lower quality we're talking about here. Most of the work I ever get an LLM to do is glue code or trivial features. I'd expect some fine-tuned Codestral-type model, given well-focused tasks, could achieve good performance locally. I don't really need a world-leading-expert-quality model to code up a hamburger menu in a React app & set the background-color to #A1D1C1. |
|
| ▲ | gnator 5 days ago | parent | prev | next [-] |
| Has anyone tried running with a Tenstorrent card? I wanted to see how they fare. |
|
| ▲ | flashgordon 5 days ago | parent | prev [-] |
| Yeah, I was actually thinking about a proper rig - my gut feeling is that a rig wouldn't be as expensive as a Mac and would actually have a higher ROI (at the expense of portability)? My other worry about the Mac is how non-upgradeable it is. Again, not sure how fruitful this is - in my (probably fantasy-land) view, if I can set up a rig and then keep updating components as needed, it might last me a good 5 years for, say, $20k over that period? Or is that too hopeful? $20k over 5 years is $4k per year, or roughly $333 a month - in the ballpark of two Claude Max subscriptions. Let's be honest - with today's rate limits, running more than one agent in parallel is effectively off the table. So if I can run two Claude-level models locally (assuming the DeepSeek and Qwen models get there), I'm already breaking even, without having to contribute all my codebases to training (and I assume being free of that unlocks something new too). |
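Sanity-checking that math in Python (the $200/month Claude Max price here is my assumption):

```python
# Back-of-the-envelope: amortized rig cost vs. subscriptions.
# Assumes $20k total spend over 5 years and $200/month for a
# Claude Max subscription (that subscription price is an assumption).
rig_total_usd = 20_000
years = 5
max_sub_usd_per_month = 200

rig_usd_per_month = rig_total_usd / (years * 12)
print(f"Rig: ${rig_usd_per_month:.0f}/month "
      f"≈ {rig_usd_per_month / max_sub_usd_per_month:.1f} Max subscriptions")
# -> Rig: $333/month ≈ 1.7 Max subscriptions
```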
| |
| ▲ | lossolo 5 days ago | parent [-] | | Buy 4–8 used 3090s (giving 96–192 GB of VRAM), depending on the model and weight quantization you want to run. A used 3090 costs around $800. Add more system RAM to offload layers if needed. This setup currently offers the best value for performance. https://www.reddit.com/r/LocalLLaMA/comments/1iqpzpk/8x_rtx_... You can find more rig examples on that subreddit. | | |
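For a sense of what serving on a rig like that looks like, here's a minimal vLLM sketch sharding a quantized model across 8 cards - the model ID and quant below are placeholders, pick whatever fits your VRAM:

```python
# Minimal vLLM sketch: tensor-parallel inference across 8 x 3090s.
# Model ID and quantization below are placeholders, not recommendations;
# choose a checkpoint that fits in your total VRAM at your chosen quant.
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-72B-Instruct-AWQ",  # placeholder checkpoint
    quantization="awq",                     # must match the checkpoint
    tensor_parallel_size=8,                 # one shard per GPU
)

params = SamplingParams(temperature=0.2, max_tokens=256)
outputs = llm.generate(["Write a React hamburger menu component."], params)
print(outputs[0].outputs[0].text)
```

With 4 cards you'd drop tensor_parallel_size to 4 and pick a smaller or more aggressively quantized model.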
| ▲ | esskay 5 days ago | parent | next [-] | | I do wonder what the ongoing cost there would be. The ~$9k hardware cost is an easy thing to quantify, but a bank of very hot, power-hungry GPUs is going to rack up a hefty monthly bill in many parts of the world. I imagine there's also going to be some trouble hooking something like that up to a normal wall socket in North America? (I, like the Reddit poster, am in Europe, so on 220 V.) | | |
| ▲ | icelancer 5 days ago | parent | next [-] | | It's not too bad - I run 6x RTX 3090s on a 2nd-gen Threadripper with PCIe bifurcation cards. The energy usage is only really bad if you're training models constantly; inference is light enough. I use 208 V power, though 120 V can indeed be a challenge. That said, it's a bit of a misunderstanding of how US power works: with split-phase wiring, every house has 220-240 V on tap if needed - it's just that typical outlets are 110-120 V. | | |
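A rough electricity estimate for a box like that (every figure below is an assumption - measure your own rig and plug in your local rate):

```python
# Rough monthly electricity estimate for a 6 x 3090 inference box.
# All constants are assumptions: draw varies with power limits and
# workload, and electricity prices vary widely by region.
cards = 6
idle_w = 25            # assumed per-card idle draw, models loaded
load_w = 280           # assumed per-card draw while generating
overhead_w = 150       # CPU, board, fans, PSU losses (assumed)
duty_cycle = 0.15      # assumed fraction of the day spent generating
usd_per_kwh = 0.20     # assumed electricity price

avg_w = overhead_w + cards * (duty_cycle * load_w + (1 - duty_cycle) * idle_w)
kwh_per_month = avg_w * 24 * 30 / 1000
print(f"~{avg_w:.0f} W avg, ~{kwh_per_month:.0f} kWh/month "
      f"≈ ${kwh_per_month * usd_per_kwh:.0f}/month")
# -> ~530 W avg, ~381 kWh/month ≈ $76/month
```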
| ▲ | flashgordon 5 days ago | parent [-] | | Yeah, at this point the goal is to see how to maximize for inference. For training, it's impossible from the get-go to compete with the frontier labs anyway. I'm trying to calculate (even amortized over 2 years) the daily cost of running a rig that can get close to single-Claude-agent performance, without needing a six-figure GPU. | | |
| ▲ | icelancer 4 days ago | parent [-] | | Really the only reason to have a local setup is for 24/7 on-demand high-volume inference that can't tolerate enormous cold starts. |
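One way to make that concrete: a break-even sketch against API pricing (every constant is an assumption - substitute your own):

```python
# Break-even sketch: monthly token volume where a local rig beats the API.
# Every constant is an assumption; plug in your own numbers.
rig_usd_per_month = 333 + 76     # amortized hardware + electricity (from above)
api_usd_per_mtok = 10.0          # assumed blended API price per 1M tokens
rig_tok_per_s = 30               # assumed aggregate local throughput

breakeven_mtok = rig_usd_per_month / api_usd_per_mtok
needed_tok_per_s = breakeven_mtok * 1e6 / (30 * 24 * 3600)
print(f"Break-even ≈ {breakeven_mtok:.0f}M tokens/month, "
      f"i.e. ~{needed_tok_per_s:.1f} tok/s sustained 24/7 "
      f"(rig ≈ {rig_tok_per_s} tok/s)")
# -> Break-even ≈ 41M tokens/month, ~15.8 tok/s sustained 24/7
```

Under these made-up numbers the rig only pays off if it's generating at full tilt roughly half of every day - which is exactly the 24/7 high-volume case.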
|
| |
| ▲ | flashgordon 5 days ago | parent | prev [-] | | Yeah, this is what I was unsure about too. The hardware is a one-off cost, but how much do you have to modernize your house (wiring, cooling, electrical fire safety, etc.)? |
| |
| ▲ | flashgordon 5 days ago | parent | prev [-] | | Also, I wonder if, like in the old days, you could "try" these out somewhere first. Imagine plonking down $5-10k and nothing works (which is fine if you can get a refund, ha). |
|
|