| ▲ | sosodev 2 hours ago | |
I wonder if model distillation will continue to work as well as it has. Given hidden reasoning, the ever-expanding number of expected capabilities, a serious compute shortage, the looming possibility of model collapse, and dramatically higher API costs, I would guess that it's getting much harder to do. | ||
| ▲ | gck1 a minute ago | |
You should check out some Chinese forums. There are services selling gateways/proxies for all major models at a fraction of the official rates. They are likely reselling subscriptions, or relying on some other form of abuse. I've seen people posting screenshots of billions of tokens consumed where they paid next to nothing. These same gateways are likely also reselling the data to Chinese labs, because TLS has to terminate at the gateway, which means the operator sees every prompt and response in plaintext. | ||
| ▲ | sourcecodeplz an hour ago | |
Asian labs generated synthetic datasets from US labs' models but also innovated on the technology. Now it is harder to get the thinking traces AND Anthropic is reported to be poisoning them as well. Thus Asian labs will have to generate their own datasets, which, with the huuuuge usage boom from DeepSeek, MiMo, Kimi, etc., they will be able to do. | ||