sophiabits 4 hours ago

> You can't fire Claude if it fucks up

What's the difference between "firing" Claude vs moving to a model from a different provider? The latter seems very analogous to firing an employee for performance and backfilling with someone new.

Re the rest, it's just not my experience that models become incapable of making good decisions in cases where input token count > the context window, but ymmv based on domain.

A very extreme example of this: a couple of years ago, when GPT-4 was state of the art and the 32K context variant was gated to design partners, I worked at an EdTech company in the college admissions space that wanted to produce quarterly reports on student progress for parents. That involved crunching a LOT of data: multiple hours of meeting transcripts per week, very detailed notes about student activities, and each student's general profile (UK and US admissions function very differently!).

It was a difficult problem, but we _did_ manage to produce these reports 4K output tokens at a time, at a level of quality that exceeded what humans could do internally, and models and the surrounding tooling have only gotten better since then.

logicchains 3 hours ago

> What's the difference between "firing" Claude vs moving to a model from a different provider? The latter seems very analogous to firing an employee for performance and backfilling with someone new.

A human may learn and improve to avoid being fired, while Claude is incapable of that.

> Re the rest, it's just not my experience that models become incapable of making good decisions in cases where input token count > the context window, but ymmv based on domain.

If they've been trained a lot on your domain (maths, coding) then they can make good decisions. But I've just started using Mythos and even it makes some awful decisions in domains it's not trained on. Of course the majority of decisions are good, but it only takes a couple bad ones to sink a project.