OvrUndrInformed 2 days ago
Markov processes are a pretty general concept that can be used to model just about anything if you let the "state" also incorporate some elements of the "history". I assume that the way transformers are used to model language (next-token prediction) can be considered a Markov process where the transition function is modeled by the LLM. A state is the sequence of [n] tokens so far (the context plus the text generated so far); the next state is that sequence with the newest generated token appended, giving [n+1] tokens; and the probability of moving to each possible successor state is the probability the LLM's output distribution assigns to the corresponding next token. Basically, I think you can consider auto-regressive LLMs as parameterized Markov processes. Feel free to correct me if I'm wrong.
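
To make the correspondence concrete, here's a minimal Python sketch (my own illustration, not anything from a real model): the Markov state is the whole token sequence, and the hypothetical next_token_dist function stands in for the LLM forward pass, which in a real model would be a softmax over the vocabulary conditioned on the entire state.

    import random

    VOCAB = ["the", "cat", "sat", "on", "mat", "<eos>"]

    def next_token_dist(state):
        # Stand-in for the LLM: returns P(next token | state).
        # A real model would condition on the full sequence in
        # `state`; here a fixed distribution keeps the example tiny.
        return [0.25, 0.2, 0.2, 0.15, 0.1, 0.1]

    def step(state):
        # One Markov transition: sample a token from the output
        # distribution and append it, moving from [n] to [n+1] tokens.
        probs = next_token_dist(state)
        token = random.choices(VOCAB, weights=probs, k=1)[0]
        return state + (token,)

    def generate(prompt, max_steps=10):
        state = tuple(prompt)  # the state is the whole sequence so far
        for _ in range(max_steps):
            state = step(state)
            if state[-1] == "<eos>":  # treat <eos> as absorbing
                break
        return state

    print(generate(["the"]))

The chain is Markov because the distribution over the next state depends only on the current state; the trick, as above, is that the state already contains the full history.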