Remix.run Logo
jimbo808 5 days ago

I'd like to give these a try - what's your way of using them? I mostly use Claude because of Claude Code. Not sure what agentic coding tools people are using these days with OSS models. I'm not a big fan of manually uploading files into a web UI.

reissbaker 5 days ago | parent | next [-]

The most private way is to use them on your own machine; a Mac Studio maxed out to 512GB RAM can run GLM-4.5 at FP8 with fairly long context, for example.

If you don't have the hardware to run it locally, let me shill my own company for a minute: Synthetic [1] has a $20/month subscription to most of the good open-weight coding LLMs, with higher rate limits than Claude's $20/month sub. And our $60/month sub has higher rate limits than the $200/month maxed-out version of the Claude Max plan.

You can still use Claude Code by using LiteLLM or similar tools that convert Anthropic-style API requests to OpenAI-style API requests; once you have one of those running locally, you override the ANTHROPIC_BASE_URL env var to point to your locally-running proxy. We'll also be shipping an Anthropic-compatible API this week to work with Claude Code directly. Some other good agentic tools you could use instead include Cline, Roo Code, KiloCode, OpenCode, or Octofriend (the last of which we maintain).

1: https://synthetic.new

sheepscreek 5 days ago | parent | next [-]

Very impressed with what you're doing. It's not immediately clear how the prompts and the data is used on the site. Your terms mention a 14 day API retention, but it's not clear if that applies to Octo/the CLI agent and any other forms of subscription usage (not through UI).

If you can find a way to secure the requests even during the 14 day period, or anonymize them while allowing the developers to do their job, you can have my money today. I think privacy/data security is the #1 concern for me, especially if the agents will be supporting me in all kinds of personal tasks.

reissbaker 4 days ago | parent [-]

FWIW the 14 day retention is just to cover accidental log statements being deployed — we don't intentionally store API request prompts or completions after processing at all. We'll probably change our stated policy to no-store since in practice that's what we do (and we get this feedback a lot!)

IgorPartola 5 days ago | parent | prev | next [-]

Is there a possibility of my work leaning to others? Does your staff have the ability to view prompts and responses? Is tenancy shared with other users, or entities other than your company?

This looks really promising since I have also been having all sorts of issues with Claude.

reissbaker 4 days ago | parent [-]

We never train on your prompts or completions, and for the API we don't store longer than 14 days (in fact, we don't ever intentionally store API prompts or completions at all, the 14 day policy was originally just to cover accidental log statements being deployed; we'll probably change it to no-store since it's confusing to say 14 days when we actually don't intentionally store). For the web UI we do have to store, since otherwise we couldn't show you your message history.

In terms of tenancy: we have our own dedicated VMs for our Kubernetes cluster via Azure, although I suspect a VM is not equivalent to an entire hardware node. We use Supabase for our Postgres DB, and Redis for ephemeral data; while we don't share access to that to any other company, we don't create a new DB for every user of our service, so there is user multitenancy there. Similarly, the same GPUs may serve many customers — otherwise we'd need to charge enormous amounts for inference. But, the requests themselves aren't intermingled; i.e. if you make a request, it doesn't affect someone else's.

AlecSchueler 5 days ago | parent | prev [-]

How do you store/view the data I send you?

reissbaker 4 days ago | parent [-]

For API prompts or completions, we don't store after we return the completion to your prompt (our privacy policy allows us to store for a maximum of 14 days, just to cover accidental log statement deploys). For the web UI we store them in Postgres, since the web UI lets you view your message history and we wouldn't be able to serve that to you without storing it.

AlecSchueler 4 days ago | parent [-]

https://developer.mozilla.org/en-US/docs/Web/API/Window/loca...

reissbaker 3 days ago | parent [-]

Yeah, localStorage-only doesn't do things like sync across devices or persist if you lose your phone. But since we expose an OpenAI-compatible endpoint, if you don't care about those things there are plenty of LLM clients that will keep your data 100% on-device that you can use instead of the web UI.

billyjobob 5 days ago | parent | prev | next [-]

Both of those models have Anthropic API compatible endpoints, so you just set an environmental variable pointing to them before you run Claude Code.

5 days ago | parent | prev [-]
[deleted]