CuriouslyC 5 days ago

My Rust agent is closed source (at least right now; we'll see), but I'm happy to discuss details of how stuff works to get you going in the right direction.

all2 5 days ago

I'd be glad to hear more. I'm not certain what I would even ask, as the space is really fuzzy (prompting and all that).

I've got an Ollama instance (24GB VRAM) that I want to leverage to reduce my dependency on Claude Code. Even the tech stack seems unapproachable. I've considered LiteLLM, router agents, micro-agents (the smallest slice of functionality possible), etc. I haven't fully wrapped my head around it all, though.

Ideally, it would be something like:

    UI <--> LiteLLM
               ^
               |
               v
            Agent Shim
Where the UI is probably aider or something similar. Claude Code muddies the distinction between UI and agent (with all the built-in system-prompt injection). I imagine I'd move system-prompt injection and agent CRUD into the agent shim.
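
To make that concrete, here's a rough sketch of what the shim could be (purely illustrative; it assumes an OpenAI-compatible LiteLLM proxy on its default port 4000, and the endpoint and prompt here are made up):

    # Hypothetical agent shim between the UI and LiteLLM: it takes
    # OpenAI-style chat requests, injects its own system prompt, and
    # forwards the rest to the LiteLLM proxy unchanged.
    import httpx
    from fastapi import FastAPI, Request

    LITELLM_URL = "http://localhost:4000/v1/chat/completions"  # assumed proxy address
    SYSTEM_PROMPT = "You are a focused coding agent. Prefer small, verifiable edits."

    app = FastAPI()

    @app.post("/v1/chat/completions")
    async def shim(request: Request):
        body = await request.json()
        # Replace any UI-supplied system prompt so the UI stays agent-agnostic.
        messages = [m for m in body.get("messages", []) if m.get("role") != "system"]
        body["messages"] = [{"role": "system", "content": SYSTEM_PROMPT}] + messages
        async with httpx.AsyncClient(timeout=120) as client:
            resp = await client.post(LITELLM_URL, json=body)
        return resp.json()

The UI would then point at the shim instead of at LiteLLM directly.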

I'm just spitballing here.

Thoughts? (my email is in my profile if you would prefer to continue there)

CuriouslyC 5 days ago

I also have a 24GB card. Local LLMs are great for a lot of things, but I wouldn't route coding questions to them; the time/$ tradeoff isn't worth it. Also, don't use LiteLLM (it's just bad); Bifrost is the way.

You can use an LLM router to direct questions to an optimal model on a price/performance Pareto frontier. I have a plugin for Bifrost that does this, Heimdall (https://github.com/sibyllinesoft/heimdall). It's very beta right now, but the test coverage is good; I just haven't paved the integration pathway yet.
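
To illustrate the routing idea only (this is not Heimdall's actual logic; the models, quality scores, and prices below are invented):

    # Toy Pareto-style router: pick the cheapest model whose expected
    # quality clears a task-dependent bar. All numbers are made up.
    from dataclasses import dataclass

    @dataclass
    class Model:
        name: str
        quality: float       # rough benchmark-derived score, 0..1
        usd_per_mtok: float  # blended $/million tokens

    MODELS = [
        Model("small-local", 0.62, 0.00),
        Model("mid-hosted", 0.75, 0.60),
        Model("frontier", 0.92, 6.00),
    ]

    def route(prompt: str) -> Model:
        # Crude difficulty proxy: long or code-heavy prompts demand more quality.
        hard = len(prompt) > 2000 or "def " in prompt or "fn " in prompt
        bar = 0.85 if hard else 0.60
        eligible = [m for m in MODELS if m.quality >= bar]
        return min(eligible, key=lambda m: m.usd_per_mtok)

    print(route("rename this variable").name)  # -> small-local

The hard part is estimating difficulty and quality well; the selection step itself is trivial.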

I've got a number of products in the works to manage context automatically, enrich/tune RAG, and provide enhanced code search. Most of them are public, and you can poke around and see what I'm doing. I plan on doing a number of launches soon, but I like to build rock-solid software, and rapid agentic development creates a large manual QA/acceptance-eval burden.

all2 5 days ago

So there's no place for a local LLM in code dev. Bummer. I was hoping to get past the 5-hour limits on Claude Code with local models.

CuriouslyC 5 days ago

Your best bet is the new DeepSeek; it's Claude Code compatible. Just use the Anthropic-compatible URL; they have instructions online.

all2 4 days ago

For the curious, here are the relevant docs: https://api-docs.deepseek.com/guides/anthropic_api
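
Following that guide, the setup amounts to pointing an Anthropic-style client at DeepSeek's endpoint. A minimal Python sketch (the base URL and model name come from the linked docs; verify them there before relying on this):

    # Sketch: Anthropic SDK pointed at DeepSeek's Anthropic-compatible API.
    import os
    from anthropic import Anthropic

    client = Anthropic(
        base_url="https://api.deepseek.com/anthropic",  # per the linked guide
        api_key=os.environ["DEEPSEEK_API_KEY"],
    )

    message = client.messages.create(
        model="deepseek-chat",  # model name per DeepSeek's docs
        max_tokens=1024,
        messages=[{"role": "user", "content": "Summarize this diff in one line."}],
    )
    print(message.content[0].text)

Claude Code itself picks up the same settings from environment variables (ANTHROPIC_BASE_URL and an API key), which is what the guide walks through.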