maplethorpe 3 hours ago
Thanks so much for this! I still haven't gotten around to building my own language model, so I'm a bit fuzzy on the details, but in a thought experiment where I did all the math by hand on paper, I just couldn't see how I would end up with a different output each time given the same inputs. Finding out that the variance other people are seeing comes from server-side/hardware stuff clears that up. This is a surprisingly annoying question to Google: a lot of articles give the reason that softmax returns a probability distribution, as if the mere presence of the word "probability" means the tokens will be different every time.
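To make the pen-and-paper intuition concrete, here's a minimal sketch (the three-element logit vector is made up for illustration): softmax itself is a pure function, and greedy decoding on top of it is too, so doing the math by hand really would give the same token every time.

```python
import math

def softmax(logits):
    # Softmax is a pure function: same logits in, same probabilities out.
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.1]  # hypothetical scores for three tokens

probs_a = softmax(logits)
probs_b = softmax(logits)
assert probs_a == probs_b  # identical every run: a distribution, but no randomness yet

# Greedy decoding (always pick the most probable token) is also deterministic:
token_a = max(range(len(probs_a)), key=lambda i: probs_a[i])
token_b = max(range(len(probs_b)), key=lambda i: probs_b[i])
assert token_a == token_b
```

Randomness only enters if you deliberately sample from that distribution with an RNG, or if the serving stack perturbs the logits themselves (e.g. non-deterministic floating-point reduction order across batches or GPUs).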