yorwba 5 hours ago

The amount of data Anthropic has claimed was extracted for distillation is tiny in comparison to the entire internet, which is right there for the taking and holds most of the knowledge people expect models to have.

Distilling even small amounts of data from a better model still helps, not by transferring capabilities the raw internet-trained model lacks entirely, but by identifying which of its capabilities are compatible with the servile assistant persona and suppressing the undesirable ones (e.g. trolling). A primitive version of this was the instruction-tuning datasets generated with OpenAI models, as used e.g. for Alpaca (which used text-davinci-003).

Without a clear target to emulate, competitors might have to rely more on human raters, but there are plenty of data labeling companies in China, so that's hardly a hurdle.
