dTal 3 hours ago:
I am quite certain. The output is "just tokens"; the "position encodings" and "context" are inputs to the LLM function, not outputs. The information that a token can carry is bounded by the entropy of that token. A highly predictable token (given the context) simply can't communicate anything. Again: if a tiny language model, or even a basic Markov model, would also predict the same token, it's a safe bet that token doesn't encode any useful thinking when the big model spits it out.
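(The information-theoretic point dTal is making can be sketched in a few lines. This is a minimal illustration, not from the thread: it just computes the self-information, in bits, of a token as a function of the probability the model assigns it. The probabilities used are made-up example values.)

```python
import math

def surprisal_bits(p: float) -> float:
    """Self-information of an event with probability p, in bits.

    This is the upper bound on how much information observing that
    event (e.g. a sampled token) can convey: -log2(p).
    """
    return -math.log2(p)

# A token the model predicts with near-certainty carries almost nothing:
print(f"{surprisal_bits(0.999):.4f} bits")   # ~0.0014 bits

# A 50/50 token carries exactly one bit:
print(f"{surprisal_bits(0.5):.4f} bits")     # 1.0000 bits

# A genuinely surprising token (e.g. 1-in-50,000 under the model's
# distribution) carries far more:
print(f"{surprisal_bits(1 / 50000):.4f} bits")  # ~15.6 bits
```

So on this view, tokens that any small model would also predict sit near the top of the first case: whatever "thinking" the big model did, very little of it can be read off from emitting them.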
Chance-Device 3 hours ago (parent):
I just don’t share your certainty. You may or may not be right, but if there isn’t a result showing this, then I’m not going to assume it.