> Other companies were allegedly distilling the models by training on the reasoning output

In the case of makers of open-source models (which are also competition), there is no "allegedly"; they were (and still are) openly doing that.

▲

nullc a day ago | parent [-]

In the case of the closed models too... Claude would happily tell you it was deepseek-v3 if you asked in Chinese, until it caught public attention and they papered over it.

▲

dragonwriter a day ago | parent [-]

The word “openly” is in my post for a reason: the commercial models are not openly distilled from competitors, whereas many open-source models state in their model documentation that distillation was done from a dataset drawn from specific other models, including commercial models.

That distillation might be inferred from the behavior of commercial models is not the same as them openly doing it.

▲

nullc a day ago | parent [-]

Fair enough!