wood_spirit 4 days ago
A decoder predicts the next word (token), iteratively generating a whole sentence. An encoder masks a word in the middle of a sentence and tries to predict the masked word from the surrounding context. The original transformer paper from Google was encoder-decoder, but then encoder-only BERT was hot and then decoder-only GPT was hot; now encoder-decoder is hot again! Decoders are good at generative tasks - chatbots etc. Encoders are good at summarization. Encoder-decoders are better at summarization. It’s steps towards “understanding” (quotes needed).
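To make the two objectives concrete, here’s a toy sketch (no real model, just how the training pairs are constructed - the function names and the `[MASK]` placeholder are illustrative, though `[MASK]` is what BERT actually uses):

```python
# Toy illustration of the two pretraining objectives, not a real model.

def causal_lm_examples(tokens):
    # Decoder objective: given a prefix, predict the next token.
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

def masked_lm_example(tokens, mask_index, mask_token="[MASK]"):
    # Encoder objective: hide one token, predict it from the
    # full surrounding context (both left and right).
    corrupted = list(tokens)
    target = corrupted[mask_index]
    corrupted[mask_index] = mask_token
    return corrupted, target

sentence = ["the", "cat", "sat", "on", "the", "mat"]

print(causal_lm_examples(sentence)[0])
# (['the'], 'cat')  -- decoder only ever sees the left context

print(masked_lm_example(sentence, 2))
# (['the', 'cat', '[MASK]', 'on', 'the', 'mat'], 'sat')  -- encoder sees both sides
```

That difference - left-context-only vs full-context - is why decoders generate and encoders are better at tasks that need to read the whole input first.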