motbus3 2 hours ago

Without going into details: a little bird told me of a project estimated to use several million tokens per day to automate the work of a team that was then laid off. The operation is now a mess, no one is willing to be held liable, and since the cheap model they used is about to be retired, the company is going to see at least a 4x price increase.

I have the feeling that the age of "I can't be blamed, it was the AI" will turn out to be the new "this was the computer guy's mistake" for a while.

PS: I've been using Claude Opus 4.8 and it is worse than 4.6; I'd say even Sonnet 4.6 is better. "PhD-level" software engineering, I believe! I know many PhDs who have never coded or worked anyway.

RamblingCTO 42 minutes ago | parent | next [-]

Glad I'm not the only one. Almost every factual thing with the new Opus is wrong (and it now even happens with 4.6?). I asked it about car stuff yesterday and it totally misrepresented what a car axle even looks like, fundamentally. Today I talked about my CV and it was just plain wrong. I don't know what happened; it wasn't like this a few weeks back, and I'm even considering cancelling Claude altogether. GPT 5.5 for coding is fine and way more stable, but regular work is just broken.

user_7832 an hour ago | parent | prev | next [-]

On the topic of older (Claude) models being better... does anyone know of anything close to 3.5 (or 3.6) era Sonnet? It was by far the best LLM I had ever brought my questions to. It actually explained things in a human way, not like some AI whose answers I need to re-read three times to understand.

(I've used modern Gemini 3.1 Pro & Claude too. Modern ChatGPT is just as useless; I've never heard a human speak in bullet points. The human brain never encounters that irl.)

Chu4eeno an hour ago | parent [-]

This was obviously a conscious choice by the leadership at the frontier labs, and especially OpenAI, considering how 4o turned out.

I don't think they expected the ELIZA effect [0] to explode as much as it did when they started feeding user feedback directly into post-training for the next generation, so to be safe they've likely added several regimens of synthetic data to make sure ChatGPT steers away from ELIZA-like behavior.

[0]: https://en.wikipedia.org/wiki/ELIZA_effect

2 hours ago | parent | prev | next [-]
[deleted]
prodigycorp an hour ago | parent | prev | next [-]

To me this is clearly a skill issue. Several million tokens per day is peanuts, even if uncached. gpt-5.5 is $5 per million input tokens.

Anybody doing things seriously understands how to optimize their workflows for smaller models once they start to lock in processes.
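
For a rough sense of scale, here is a back-of-the-envelope sketch. The 3 million figure is only an assumed stand-in for "several millions", the function name is made up for illustration, and output tokens are not counted at all:

```python
def daily_input_cost(tokens_per_day: float, usd_per_million: float) -> float:
    """Input-token spend per day in USD; output tokens are ignored here."""
    return tokens_per_day / 1_000_000 * usd_per_million

# Assumption: 3M tokens/day, all billed as input at the $5/M price quoted above.
print(daily_input_cost(3_000_000, 5.0))  # 15.0
```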

zozbot234 an hour ago | parent [-]

The expensive tokens are output, not input. A useful rule of thumb is that a million tokens per day works out to about 10 tok/s on a 24/7 basis.
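
The rule of thumb is just tokens per day divided by seconds per day. A minimal sketch, with a function name made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400

def tokens_per_second(tokens_per_day: float) -> float:
    """Average sustained rate needed to emit tokens_per_day over 24 hours."""
    return tokens_per_day / SECONDS_PER_DAY

print(round(tokens_per_second(1_000_000), 1))  # 11.6
```

So "several millions" per day would be a few tens of tok/s sustained around the clock.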

prodigycorp an hour ago | parent [-]

Even then, I highly doubt any sort of automation is producing on the order of several million tokens daily. The issue I see with the org in the parent comment seems to stem from management and not from any sort of token repricing.

platinumrad 2 hours ago | parent | prev [-]

I don't doubt that the operation as a whole is a disaster, but they should be able to avoid the price increase by switching to one of the many other cheap models like DeepSeek V4 Flash, right?