| ▲ | umanwizard an hour ago | |
Please define what "predicting the next token" means. The next token according to what probability distribution? Couldn't every process that produces text (including humans writing) be modeled as predicting the next token according to some distribution? | ||