oceanplexian a day ago:
I run models with Claude Code (using the Anthropic API feature of llama.cpp) on my own hardware, and it works every bit as well as Claude did twelve months ago. If you don't believe me and don't want to mess around with used server hardware, you can walk into an Apple Store today, pick up a Mac Studio, and try it yourself.
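For anyone who wants to try it, here's a minimal sketch of the setup. It assumes llama-server is serving its Anthropic-compatible endpoint on the default port, and that Claude Code honors its documented ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN overrides; the model filename is a placeholder.

    import os
    import subprocess

    # Start llama.cpp's llama-server; --jinja enables the model's chat
    # template, which tool calling generally needs.
    # The model filename and port are placeholders.
    server = subprocess.Popen([
        "llama-server",
        "-m", "gpt-oss-120b.gguf",
        "--port", "8080",
        "--jinja",
    ])

    # Point Claude Code at the local server. ANTHROPIC_BASE_URL is Claude
    # Code's endpoint override; the token value is arbitrary because
    # llama-server doesn't verify it by default.
    env = dict(
        os.environ,
        ANTHROPIC_BASE_URL="http://127.0.0.1:8080",
        ANTHROPIC_AUTH_TOKEN="local",
    )
    subprocess.run(["claude"], env=env)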
Eggpants a day ago:
I’ve been doing the same with GPT-OSS-120B and have been impressed. The only gotcha is that Claude Code expects a 200k context window, while that model tops out around 131k, so I have to run /compact when it gets close. I’ll have to see whether there’s a way to set the max context window in CC. I’ve been pretty happy with the results so far, as long as I keep the tasks small and self-contained.
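On the server side at least, the context length is settable; here's a minimal sketch, assuming llama-server's -c/--ctx-size flag (where 0 means "use the model's trained maximum") and a placeholder model filename:

    import subprocess

    # Serve GPT-OSS-120B at its full trained context (~131k tokens).
    # "-c 0" asks llama-server to use the model's native maximum;
    # an explicit "-c 131072" would pin it instead.
    subprocess.run([
        "llama-server",
        "-m", "gpt-oss-120b.gguf",
        "-c", "0",
        "--port", "8080",
    ])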
icedchai a day ago:
What's your preferred local model?