ekunazanu 5 days ago

Depends on your definition of knowing. Sure, we know it is predicting next tokens, but do we understand why it outputs the things it does? I am not well versed in LLMs, but I assume interpretability is a big challenge even for smaller models.
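To make "predicting next tokens" concrete, here is a minimal sketch of the generation loop, written against the Hugging Face transformers API with GPT-2 (the model choice and greedy decoding are illustrative assumptions, not how any particular production system decodes):

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("gpt2")
    model = AutoModelForCausalLM.from_pretrained("gpt2")

    ids = tok("The cat sat on the", return_tensors="pt").input_ids
    for _ in range(10):
        logits = model(ids).logits         # scores over the whole vocabulary
        next_id = logits[0, -1].argmax()   # greedy: pick the most likely token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=-1)
    print(tok.decode(ids[0]))

The loop itself is trivial to understand; the interpretability question is why those logits come out the way they do.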

chongli 5 days ago | parent | next [-]

The answer is simple: the set of weights and biases comprises a mathematical function which has been specifically built to approximate the training set. The methods for building such a function are very old and well-known (from calculus).
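For a concrete (toy) instance of that calculus: gradient descent fitting a two-parameter "model" to a tiny training set. The data and learning rate here are made up for illustration:

    import numpy as np

    # Toy "training set": inputs x and targets y.
    x = np.array([0.0, 1.0, 2.0, 3.0])
    y = np.array([1.0, 3.0, 5.0, 7.0])   # secretly y = 2x + 1

    w, b = 0.0, 0.0                      # the "weights and biases"
    lr = 0.05
    for _ in range(2000):
        err = (w * x + b) - y
        # Gradients of mean squared error w.r.t. w and b (plain calculus).
        w -= lr * 2 * np.mean(err * x)
        b -= lr * 2 * np.mean(err)

    print(w, b)  # converges to roughly w=2, b=1

An LLM is the same construction scaled up to billions of parameters: the fitting procedure is well understood, even if the fitted function is not.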

There's no magic here. Most people's awestruck reactions are due to our brains' own pattern recognition abilities and our association of language use with intelligence. But there's really no intelligence here at all, just as the "face on Mars" is a random feature of a desert planet's landscape, not an intelligent life form.

lazide 5 days ago | parent | prev [-]

For any given set of model weights and inputs? Yes, we definitely do understand them (see the sketch below).

Do we understand the emergent properties of almost-intelligence they appear to present, and what that means about them and us, etc. etc.?

No.
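A toy illustration of the narrow sense in which we "understand" any given weights and inputs (the network shape and numbers here are arbitrary): every intermediate value in a forward pass is an ordinary number we can print and trace.

    import numpy as np

    rng = np.random.default_rng(0)
    W1, b1 = rng.normal(size=(4, 3)), np.zeros(3)  # layer 1 parameters
    W2, b2 = rng.normal(size=(3, 2)), np.zeros(2)  # layer 2 parameters

    x = np.array([1.0, 0.5, -0.3, 2.0])
    h = np.maximum(0.0, x @ W1 + b1)  # hidden activations: fully inspectable
    out = h @ W2 + b2                 # output: just arithmetic on the inputs
    print(h, out)

What we lack is not access to the numbers but a theory of why billions of them, arranged by training, produce behavior that looks like understanding.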

jvanderbot 5 days ago | parent [-]

Right. The machine works as designed, and it's all assembly instructions executing on gates. The values in the registers change, but the instructions don't.

And it happens to do something weirdly useful to our own minds based on the values in the registers.