Certhas · 4 hours ago
Correct and satisfying answers are not the loss function of LLMs. It's next-token prediction first.
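For concreteness, a minimal sketch of what that next-token objective looks like, assuming a toy four-word vocabulary and a hand-written probability table standing in for the model (all numbers made up for illustration):

    import math

    # Toy illustration only: a hand-written probability table stands in for
    # the model; a real LLM produces these probabilities from learned weights.
    # Given the context "the cat", suppose the model predicts:
    predicted = {"the": 0.05, "cat": 0.05, "sat": 0.8, "mat": 0.1}

    def next_token_loss(probs, actual_next):
        # Cross-entropy at one position: the negative log-probability the
        # model assigned to the token that actually came next in the text.
        # Training minimizes the average of this over huge amounts of text.
        return -math.log(probs[actual_next])

    print(next_token_loss(predicted, "sat"))  # -log(0.8) ~ 0.22, small loss
    print(next_token_loss(predicted, "mat"))  # -log(0.1) ~ 2.30, large loss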
moritzwarhier · 4 hours ago
Thanks for the correction; I know that "loss function" isn't a good term when it comes to transformer models. Since I've forgotten every sliver I ever knew about artificial neural networks and the related basics (gradient descent, even linear algebra)... what's a thorough definition of "next token prediction", though? The token space, the probabilities that determine the next token, layers, weights, feedback (or feed-forward?): I didn't mention any of these terms because I'm unable to define them properly.

I was using the term "loss function" specifically because I was thinking about post-training and reinforcement learning. But to be honest, a less technical term would have been better. I just meant the general idea of reward or "punishment", in the sense of an AI black box.
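A toy sketch of the distinction being gestured at here, assuming a made-up scalar reward and a REINFORCE-style reweighting; the names and numbers are hypothetical, not any particular lab's pipeline:

    import math

    # Pretraining signal: average next-token cross-entropy. probs holds the
    # probability the model gave to each token that actually came next.
    def pretraining_loss(probs):
        return -sum(math.log(p) for p in probs) / len(probs)

    # Post-training (RLHF-style) signal: a scalar reward for a whole sampled
    # answer (e.g. from a reward model trained on human preferences), used to
    # reweight the answer's log-probability, REINFORCE-style. Minimizing this
    # pushes up the probability of highly rewarded answers.
    def rlhf_loss(answer_log_prob, reward):
        return -reward * answer_log_prob

    print(pretraining_loss([0.8, 0.6, 0.9]))            # per-token signal
    print(rlhf_loss(answer_log_prob=-5.2, reward=0.9))  # per-answer signal

So the "reward or punishment" intuition does fit post-training; pretraining itself is just the per-token loss above.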
| ||||||||