|
| ▲ | sigbottle 10 hours ago | parent | next [-] |
| What would unsupervised mean, would unsupervised be something like alphago playing against itself trillions of times? Whereas self-supervised, allows learning without explicit annotation of data ; but it doesn't matter if the models already trained on the entire Internet, and it's not like a game where it can come up with effectively new training data for itself? |
| |
| ▲ | jmalicki 6 hours ago | parent [-] | | Unsupervised is basically clustering. Alphago is RL - winning or losing a game is a form of supervision. Unsupervised is something where there is no intrinsic reward signal. In pre training, predicting the next token and seeing that it matches is a reward signal, hence it is self supervised. |
|
|
| ▲ | cold_harbor 9 hours ago | parent | prev [-] |
| fair point — OpenAI's original plan literally said "solve unsupervised learning". the self-supervised distinction wasnt really standard til after BERT/GPT popularized it |
| |
| ▲ | jmalicki 5 hours ago | parent [-] | | I think it's an extremely important distinction because self supervised learning has real inherent reward signals. Something like clustering does not. |
|