crooked-v 5 hours ago
Complicated-enough LLMs are also absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.
majormajor 4 hours ago
> Complicated-enough LLMs are also absolutely doing a lot more than "just trying to predict the next word", as Anthropic's papers investigating the internals of trained models show - there's a lot more decision-making going on than that.

Are there newer changes that are actually doing prediction of tokens out of order or such, or is this a case of immense internal model state tracking that still drives the prediction of a next token, one at a time? (Wrapped in a variety of tooling/prompts/meta-prompts to further shape what sorts of paragraphs are produced, compared to ye olden days of the GPT-3 chat completion API.)
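As far as I understand it, the serving loop is still the same shape no matter how much computation happens inside a single forward pass. A toy sketch of that loop (the model here is a made-up stand-in, not Anthropic's or anyone else's real API; real systems add sampling, KV caching, and the tooling/prompt layers mentioned above):

```python
import numpy as np

VOCAB_SIZE = 16   # toy vocabulary
EOS_TOKEN = 0     # hypothetical end-of-sequence id

def toy_model(prefix):
    """Stand-in for a transformer forward pass: whatever state it tracks
    internally, the interface is 'full prefix in, next-token logits out'."""
    seed = sum((i + 1) * t for i, t in enumerate(prefix)) % (2**32)
    return np.random.default_rng(seed).standard_normal(VOCAB_SIZE)

def generate(prompt_ids, max_new_tokens=10):
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = toy_model(ids)            # condition on everything so far
        next_id = int(np.argmax(logits))   # greedy pick; sampling has the same shape
        ids.append(next_id)                # still one token appended per step
        if next_id == EOS_TOKEN:
            break
    return ids

print(generate([3, 7, 1]))
```

However sophisticated the internals get, the output in this framing is still produced one token at a time, each conditioned on the prefix.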