hackinthebochs 2 hours ago
They are deterministic in the sense that the inference process scores every word in the vocabulary deterministically. That score map is then sampled from according to the temperature setting; the non-determinism is injected artificially, for ergonomic reasons.

> But I think there's still the question if this process is more similar to thought or a Markov chain

It's definitely far from a Markov chain. A Markov chain treats the past context as a single unit, an N-tuple with no internal structure; the next state is indexed by that tuple as a whole. LLMs exploit the internal structure of the context, which enables a large class of generalizations that Markov chains necessarily miss.
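To make the first point concrete, here's a minimal sketch in Python with a toy vocabulary and made-up logits (not any real model's API): the score map is computed deterministically, and randomness enters only at the final sampling step, gated by temperature.

    import numpy as np

    # Toy "score map" over a tiny vocabulary. In a real LLM these logits
    # come from a deterministic forward pass; these values are made up.
    vocab = ["the", "cat", "sat", "mat"]
    logits = np.array([2.0, 1.0, 0.5, -1.0])

    def sample_next_token(logits, temperature=1.0, rng=None):
        rng = rng or np.random.default_rng()
        if temperature == 0.0:
            # Greedy decoding: fully deterministic, always the argmax.
            return int(np.argmax(logits))
        # Temperature rescales logits before softmax: higher flattens
        # the distribution, lower sharpens it.
        scaled = logits / temperature
        scaled -= scaled.max()        # numerical stability
        probs = np.exp(scaled)
        probs /= probs.sum()
        # The only non-deterministic step: sampling from the distribution.
        return int(rng.choice(len(logits), p=probs))

    print(vocab[sample_next_token(logits, temperature=0.0)])  # always "the"
    print(vocab[sample_next_token(logits, temperature=1.0)])  # varies per run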
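And for the second point, a sketch of why the tuple-indexing matters (again a toy illustration, not a real implementation): a Markov chain's transition table is keyed by the entire context tuple, so an unseen tuple gets nothing, no matter how much it overlaps with seen ones.

    from collections import defaultdict

    # Second-order Markov chain: next-token distribution is looked up by
    # the *whole* context tuple. ("the", "cat") and ("a", "cat") are
    # unrelated keys; nothing learned for one transfers to the other.
    transitions = defaultdict(lambda: {"<unk>": 1.0})
    transitions[("the", "cat")] = {"sat": 0.9, "ran": 0.1}

    print(transitions[("the", "cat")])  # seen key: learned distribution
    print(transitions[("a", "cat")])    # unseen key: no generalization

An LLM, by contrast, processes the tokens inside the context individually (via attention), so "cat" contributes to the prediction regardless of which determiner precedes it.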