coffeefirst 3 hours ago

The other thing is that, as far as I can tell, the power-tool style modes (where you use AI to boost focus and do quick research) are both much cheaper and, once you account for all the damage control, prompt tuning, and other externalities, roughly as fast as fully agentic use.

I suspect that with prices going up, that realization is going to be pretty appealing.

jorvi 2 hours ago | parent | next [-]

Prices are not going up. DeepSeek V4 Pro is 5-10x cheaper than Claude 4.7.

As some of us have been predicting, model capability has already mostly plateaued, and Chinese labs have relentlessly pushed costs down and will continue to do so. Chinese models will be used for 95% of things, with nation-native models for security- and sovereignty-sensitive workloads. Eventually (5+ years from now), efficiency gains and hardware progress will make running local models the dominant way of doing things.

And yes, that puts the investors of Claude and OpenAI in quite a pickle.

robotswantdata 2 hours ago | parent | next [-]

I want frontier-on-prem to be true, but IMO it's not good enough yet unless the workload is async.

What started as an all-you-can-eat $50 buffet has quietly become a $6k bill; frontier models that don't ship your codebase to Beijing don't come cheap anymore.

jorvi 2 hours ago | parent [-]

> I want the frontier on prem to be true

> that don’t ship your codebase to Beijing

DeepSeek is open weight, open source.

beej71 an hour ago | parent [-]

> DeepSeek is open weight, open source.

Sure, but I can't currently afford the hardware to run the frontier model.
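To make the hardware point concrete, here is a back-of-envelope sketch of the memory needed just to hold a large open-weight model's parameters. The parameter count and quantization levels below are illustrative assumptions, not the specs of any particular model, and the estimate ignores KV cache, activations, and runtime overhead.

```python
# Rough memory footprint for serving a large open-weight model.
# Numbers are illustrative assumptions, not real model specs.

def weight_memory_gb(params_billion: float, bits_per_param: float) -> float:
    """GB needed just to store the weights (ignores KV cache and overhead)."""
    bytes_total = params_billion * 1e9 * (bits_per_param / 8)
    return bytes_total / 1e9

# A hypothetical ~670B-parameter frontier-class model:
print(weight_memory_gb(670, 16))  # fp16: 1340.0 GB
print(weight_memory_gb(670, 8))   # int8: 670.0 GB
print(weight_memory_gb(670, 4))   # 4-bit: 335.0 GB
```

Even at aggressive 4-bit quantization, that's hundreds of GB of fast memory, which is well beyond consumer hardware today.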

underdeserver 26 minutes ago | parent [-]

Okay but that hardware doesn't have to be in Beijing, it can be in a Dutch data centre.

coffeefirst 2 hours ago | parent | prev [-]

Yes, that’s basically what I’m talking about. Without subsidies, there’s a lot of incentive to use more efficient/open-weight models, but also to use them in ways that are less compute intensive: fewer tokens, shorter reasoning chains, less always-on churn, and generally tying up less hardware for a lot less time.
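The cost gap the comment describes can be sketched with simple token arithmetic: a one-shot "power tool" call pays for one small prompt, while an agentic loop re-sends a growing context on every step. All token counts and the per-token price below are made-up assumptions for illustration.

```python
# Illustrative cost comparison: one focused call vs. an agentic loop.
# Token counts and price are hypothetical, not any vendor's real rates.

PRICE_PER_1K_TOKENS = 0.01  # assumed blended $/1K tokens

def call_cost(prompt_tokens: int, completion_tokens: int) -> float:
    return (prompt_tokens + completion_tokens) / 1000 * PRICE_PER_1K_TOKENS

# One focused query: small prompt, small answer.
power_tool = call_cost(1_500, 800)

# Agentic session: 10 steps, each re-sending a context that grows by 2K tokens.
agentic = sum(call_cost(4_000 + step * 2_000, 1_200) for step in range(10))

print(f"power tool: ${power_tool:.3f}, agentic: ${agentic:.2f}")
# power tool: $0.023, agentic: $1.42
```

Under these assumptions the agentic session costs roughly 60x more, driven almost entirely by repeatedly re-sending the accumulated context.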

slopinthebag 2 hours ago | parent | prev [-]

I agree. I also think there are long-term costs to agentic coding that we won't see for a while: gradually, then suddenly.

It feels like nobody is even trying to iterate on the "power-tool style" usage of language models; everyone jumped straight to agents. It's not clear to me that removing a human from the loop is strictly more efficient, though. Imagine an editor with an embedded language model (either running locally or using cheap cloud models) that is constantly churning: analysing code, reading debug logs, offering suggestions. Refactoring is not b̴e̴g̴g̴i̴n̴g̴ asking a chat interface to make changes, but structured refactors where the language model works with the underlying AST representations to pull off much larger or more involved refactors than current IDEs can manage. Or doing codegen in your editor in a more structured (and thus repeatable) way, compared to an agent spitting out code that you then have to review.
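The structured-refactor idea above can be sketched with Python's standard-library `ast` module: the model (not shown here) would only propose the new name, while the rename itself is applied mechanically over the syntax tree, so it is repeatable and can't miss call sites the way raw text edits can. The function names and source snippet are hypothetical.

```python
# Sketch: an AST-driven rename refactor. A language model would only pick
# the target and new name; the edit itself is a deterministic tree rewrite.
import ast

class RenameFunction(ast.NodeTransformer):
    def __init__(self, old: str, new: str):
        self.old, self.new = old, new

    def visit_FunctionDef(self, node: ast.FunctionDef):
        if node.name == self.old:
            node.name = self.new
        self.generic_visit(node)  # recurse into nested defs and calls
        return node

    def visit_Name(self, node: ast.Name):
        if node.id == self.old:  # covers call sites and references
            node.id = self.new
        return node

source = """
def fetch(u):
    return u

result = fetch("x")
"""

tree = RenameFunction("fetch", "fetch_url").visit(ast.parse(source))
print(ast.unparse(tree))  # every definition and call site renamed
```

Because the rewrite operates on the tree rather than on text, it can't touch the string literal `"x"` or a variable that merely contains the substring `fetch`, which is the kind of guarantee a chat-driven text edit can't give.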