neya 3 days ago

I am a paying customer with credits and the API endpoints rate-limited me to the point where it's actually unusable as a coding assistant. I use a VS Code extension and it just bailed out in the middle of a migration. I had to revert everything it changed and that was not a pleasant experience, sadly.

square_usual 3 days ago | parent | next [-]

When working with AI coding tools, "commit early, commit often" becomes essential advice. I like that aider makes every change its own commit. I can always manicure the commit history later; I'd rather not lose anything when the AI can make destructive changes to code.
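A minimal sketch of that safety net using plain git driven from Python's subprocess (no aider required); the helper names and the "checkpoint:" message format here are my own, not anything aider does verbatim:

```python
# Sketch: wrap each AI edit in its own git commit so any single bad
# change can be reverted in isolation.
import subprocess

def git(*args, cwd="."):
    """Run a git command in the repo and return its stdout."""
    return subprocess.run(
        ["git", *args], cwd=cwd, check=True,
        capture_output=True, text=True,
    ).stdout.strip()

def checkpoint(message, cwd="."):
    """Commit the whole working tree as one checkpoint; no-op when clean."""
    if git("status", "--porcelain", cwd=cwd):  # anything staged or dirty?
        git("add", "-A", cwd=cwd)
        git("commit", "-m", f"checkpoint: {message}", cwd=cwd)

def revert_last(cwd="."):
    """Undo just the most recent checkpoint commit."""
    git("revert", "--no-edit", "HEAD", cwd=cwd)
```

Because every AI step is its own commit, `revert_last` rolls back exactly one step instead of forcing you to revert the whole migration by hand.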

webstrand 3 days ago | parent [-]

I can recommend https://github.com/tkellogg/dura for making auto-commits without polluting main branch history, if your tool doesn't support it natively

teaearlgraycold 3 days ago | parent | prev | next [-]

Why not just continue the migration manually?

htrp 3 days ago | parent | prev | next [-]

Control your own inference endpoints.

its_down_again 3 days ago | parent [-]

Could you explain more about how to do this? E.g., if I am using the Claude API in my service, how would you suggest I go about setting up and controlling my own inference endpoint?

handfuloflight 3 days ago | parent | next [-]

You can't. He means using open-source models.

datavirtue 3 days ago | parent | prev [-]

Run a local LLM tuned for coding in LM Studio. It has a built-in server that provides OpenAI-compatible endpoints.
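A sketch of talking to that server from Python, assuming LM Studio's default local address (`http://localhost:1234/v1`); the `"local-model"` name is a placeholder, since LM Studio serves whichever model you have loaded:

```python
# Sketch: point an OpenAI-style chat-completions request at LM Studio's
# local server instead of a hosted API. No API key is needed locally.
import json
import urllib.request

BASE_URL = "http://localhost:1234/v1"  # LM Studio's default; port is configurable

def build_chat_request(prompt, model="local-model", temperature=0.2):
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,  # placeholder; LM Studio routes to the loaded model
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }

def ask_local_llm(prompt):
    """POST the payload to the local server and return the reply text."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the protocol matches OpenAI's, most coding extensions that let you override the API base URL can be pointed at this endpoint unchanged.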

datavirtue 3 days ago | parent | prev [-]

You aren't running against a local LLM?

TeMPOraL 3 days ago | parent | next [-]

That's like asking if they aren't paying the neighborhood drunk with wine bottles for doing house remodeling, instead of hiring a renovation crew.

rybosome 3 days ago | parent | next [-]

That’s funny, but open weight, local models are pretty usable depending on the task.

TeMPOraL 3 days ago | parent [-]

You're right, but that's also subject to compute costs and the time value of money. The calculus is different for companies trying to exploit language models than for individuals like me, who have to feed the family before splurging on a new GPU or setting up cloud servers, when I can get better value by paying OpenAI or Anthropic a few dollars and using their SOTA models until those dollars run out.

FWIW, I am a strong supporter of local models, and play with them often. It's just that for practical use, the models I can run locally (RTX 4070 TI) mostly suck, and the models I could run in the cloud don't seem worth the effort (and cost).

alwayslikethis 3 days ago | parent | next [-]

For the money for a 4070 Ti you could have bought a 3090, which, although less efficient, can run bigger models like Qwen2.5-Coder 32B. Apparently it performs quite well for code.

rjh29 3 days ago | parent | prev [-]

I guess the cost model doesn't work because you're buying a GPU that you use for about 0.1% of the day.
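A back-of-envelope version of that utilization point; every number here is an illustrative assumption (GPU price, lifetime), with only the 0.1% figure taken from the comment above:

```python
# Sketch: amortized cost per hour of actual use for a rarely-used GPU.
GPU_PRICE = 800.0            # assumed one-time cost in USD, not a quoted price
LIFETIME_YEARS = 3           # assumed useful lifetime
DAILY_USE_FRACTION = 0.001   # "about 0.1% of the day"

hours_total = LIFETIME_YEARS * 365 * 24          # 26,280 hours owned
hours_used = hours_total * DAILY_USE_FRACTION    # ~26 hours actually used
cost_per_used_hour = GPU_PRICE / hours_used
print(f"{hours_used:.1f} used hours -> ${cost_per_used_hour:.2f} per used hour")
```

At that utilization the effective price per hour of inference is dozens of dollars, which is why pay-per-token APIs win for light individual use.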

neumann 3 days ago | parent | prev [-]

That's what my grandma did in the village in Hungary. But with schnapps. And the drunk was also the professional renovation crew.

rty32 3 days ago | parent | prev [-]

Not everyone has a 4090 or M4 Max at home.