CuriouslyC 5 days ago
I also have a 24GB card. Local LLMs are great for a lot of things, but I wouldn't route coding questions to them; the time/$ tradeoff isn't worth it. Also, don't use LiteLLM, it's just bad; Bifrost is the way. You can use an LLM router to direct questions to an optimal model on a price/performance Pareto frontier. I have a plugin for Bifrost that does this, Heimdall (https://github.com/sibyllinesoft/heimdall). It's very beta right now, but the test coverage is good; I just haven't paved the integration pathway yet. I've also got a number of products in the works to manage context automatically, enrich/tune RAG, and provide enhanced code search. Most of them are public, so you can poke around and see what I'm doing. I plan on doing a number of launches soon, but I like to build rock-solid software, and rapid agentic development creates a large manual QA/acceptance-eval burden.
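To make the Pareto-frontier routing idea concrete, here's a minimal hypothetical sketch in Python. This is not Heimdall's or Bifrost's actual code; the model names, quality/cost numbers, and the prompt-difficulty heuristic are all made-up placeholders, just to show the shape of the technique (drop dominated models, then pick the cheapest one that clears a difficulty-derived quality bar):

```python
# Hypothetical sketch of Pareto-frontier model routing (not Heimdall's real implementation).
from dataclasses import dataclass

@dataclass
class ModelOption:
    name: str        # provider/model identifier (placeholder names below)
    quality: float   # estimated capability score in [0, 1] (assumed, e.g. from evals)
    cost: float      # assumed $ per 1K output tokens

MODELS = [
    ModelOption("small-local", 0.55, 0.0),
    ModelOption("mid-hosted", 0.75, 0.6),
    ModelOption("frontier", 0.92, 5.0),
]

def pareto_frontier(models):
    """Keep only models not dominated by another (strictly higher quality at equal or lower cost)."""
    return [
        m for m in models
        if not any(o.quality > m.quality and o.cost <= m.cost for o in models)
    ]

def route(prompt: str) -> ModelOption:
    # Toy difficulty heuristic: longer or code-heavy prompts demand more capability.
    difficulty = min(1.0, len(prompt) / 2000 + (0.3 if "```" in prompt else 0.0))
    required_quality = 0.5 + 0.4 * difficulty
    candidates = [m for m in pareto_frontier(MODELS) if m.quality >= required_quality]
    # Cheapest model that clears the bar; fall back to the most capable one otherwise.
    return min(candidates, key=lambda m: m.cost) if candidates else max(MODELS, key=lambda m: m.quality)

print(route("Explain this stack trace ...").name)  # a short prompt routes to the cheap local model
```

A real router would score difficulty with something better than prompt length (a classifier, embeddings, or task tags) and would track live pricing, but the selection step stays the same: filter to the Pareto frontier, then trade cost against the quality the task seems to need.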
all2 5 days ago
So there's no place for a local LLM in code dev. Bummer. I was hoping to get past the 5-hour limits on Claude Code with local models.