ur-whale a day ago

When you have no moat, you have to try and find desperate ways to manufacture one.

anuramat a day ago | parent [-]

wdym?

singron a day ago | parent | next [-]

Other companies were allegedly distilling the models by training on the reasoning output. By hiding the reasoning tokens, it makes it harder to do this. You can still try to distill the models, but you can't distill reasoning itself as well.

This could all be optics as well to try to give the appearance of a defensible moat. E.g. they can claim to investors that they are able to protect a significant chunk of their intellectual property this way. I'm not sure if anyone has a study about how significant the summarization is to distillation.

dragonwriter a day ago | parent [-]

> Other companies were allegedly distilling the models by training on the reasoning output

In the case of makers of open-source models (which are also competition), there is no allegedly, they were (and still are) openly doing that.

nullc a day ago | parent [-]

In the case of the closed models too... Claude would happily tell you it was deepseek-v3 if you asked in Chinese, until it caught public attention and they papered over it.

dragonwriter a day ago | parent [-]

The word “openly” is in my post there for a reason; the commercial models are not openly distilled from competitors, whereas many open source models state in their model documentation that distillation was done from a dataset drawn from specific other models, including commercial models.

That distillation might be inferred from the behavior of commercial models is not the same as them openly doing it.

nullc a day ago | parent [-]

Fair enough!

ur-whale a day ago | parent | prev [-]

> wdym?

https://en.wikipedia.org/wiki/Economic_moat

anuramat a day ago | parent [-]

how is summarized CoT a moat, and how is having the top 2 LLMs not a moat?

Closi a day ago | parent | next [-]

If you have the full outputs, it might make it easier for competitors to distil the model or reverse engineer the full process.

It may also be that misaligned responses can be in CoT which OpenAI does not want to show to users.

anuramat a day ago | parent [-]

but "harder to reverse engineer" isn't manufacturing, that's protecting your moat

Closi a day ago | parent [-]

What is a moat if not something used to protect the castle?

In this case it stops people copying your IP

dragonwriter a day ago | parent | prev [-]

Not revealing actual thinking traces prevents model distillation on the actual output (thinking traces are a key part of the output), which makes it harder for competitors to catch up (a moat).

Being currently in the lead in a category is not a moat; a moat is whatever creates a barrier to competitors catching up when you are in the lead. Merely being in the lead is not a moat except in a market with strong network externalities.

anuramat a day ago | parent [-]

unrestricted access to better models at compute prices = better synthetic data and faster research, so it's not just about the product imho