▲ | adastra22 4 days ago | |
That is not how training works… | ||
▲ | echelon 4 days ago | parent [-] | |
That's how model distillation works. DeepSeek is the most notable case, but the technique has been used widely. And the foundation model companies are scraping and exfiltrating each other's data. |