HarHarVeryFunny 2 hours ago

I don't see many companies being willing to pay 3x more for faster code generation. Cloud-based AI code generation is already extremely fast, and hardly the bottleneck for most software product development.

There can't be many normal use cases where there'd be any cost benefit.

fragmede an hour ago | parent [-]

The "traditional" way we vibe code is: human software developer prompts AI -> AI generates code -> (human checks code) -> code gets compiled/deployed/etc. -> users use the "binary". At a speed of 1000 tok/sec, that can become: user prompts obliquely -> AI vets generated code -> code deployed -> user gets response from deployed code.

It's a cute toy right now, but you can tell an LLM that it's an HTTP server, and have it respond directly to a web browser hitting it. It generates the response headers as well as the page contents. As 1000 tok/sec becomes the new normal, we will come up with newer ways to use it outside of toy fiction encyclopedias.
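A rough sketch of the "LLM as HTTP server" idea, for anyone who hasn't seen it. Everything here is an assumption for illustration: `stub_model` is a hypothetical stand-in that returns a canned response where a real model call would go, and `SYSTEM_PROMPT`, `respond` and `serve` are made-up names, not any particular product's API.

```python
import socket

# Instruction that tells the model to play the role of the web server.
SYSTEM_PROMPT = (
    "You are an HTTP/1.1 server. Reply with a complete raw HTTP response: "
    "status line, headers, a blank line, then an HTML body."
)


def stub_model(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call; returns a canned response."""
    body = "<html><body><h1>Hello from a pretend model</h1></body></html>"
    return (
        "HTTP/1.1 200 OK\r\n"
        "Content-Type: text/html; charset=utf-8\r\n"
        f"Content-Length: {len(body.encode('utf-8'))}\r\n"
        "Connection: close\r\n"
        "\r\n"
        f"{body}"
    )


def respond(raw_request: bytes, generate=stub_model) -> bytes:
    """Hand the raw request to the model and return whatever it writes back."""
    request_text = raw_request.decode("utf-8", errors="replace")
    prompt = SYSTEM_PROMPT + "\n\nRequest:\n" + request_text
    return generate(prompt).encode("utf-8")


def serve(host: str = "127.0.0.1", port: int = 8080) -> None:
    """Accept one connection at a time and let the model answer it."""
    with socket.create_server((host, port)) as server:
        while True:
            conn, _addr = server.accept()
            with conn:
                request = conn.recv(65536)
                conn.sendall(respond(request))
```

Calling `serve()` and pointing a browser at `http://127.0.0.1:8080/` would show the canned page; swapping `stub_model` for a real model call is where the tok/sec figure starts to matter, since the browser waits on every generated token.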

HarHarVeryFunny an hour ago | parent [-]

1000 tokens per sec is still massively slower than serving a normal web page - if something doesn't respond in a few seconds many people give up.
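Back-of-envelope version of that, as a sketch. The page sizes in tokens are made-up numbers for illustration, and this ignores network latency and time to first token.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Time to generate a response of the given length at a fixed rate."""
    return tokens / tokens_per_second


# Hypothetical page sizes, in tokens of generated headers + HTML.
for tokens in (500, 2000, 5000):
    seconds = generation_seconds(tokens, 1000.0)
    print(f"{tokens} tokens at 1000 tok/sec: {seconds:.1f} s")
```

So anything much beyond a couple of thousand tokens per page is already into "a few seconds" territory before the browser has the full response.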

I'm not saying there aren't any use cases for super-fast (and super-expensive) generation, but it does seem a bit niche. If it were free then sure, faster is better, but what are the mainstream use cases where people might pay 3x more for a faster version of something that is already fast?

I think it would have to be an application where it paid for itself - where the 10x faster response was actually worth more to you than the 3x cost.