hmmmmmmmmmmmmmm 6 hours ago

Hamiltonian paths and previous work by Donald Knuth are more than likely in the training data.

red75prime an hour ago

The specific sequence of tokens that comprises Knuth's problem together with an answer to it is not in the training data. A naive probability distribution based on counting the token sequences present in the training data would assign zero probability to it. The trained network represents an extremely non-naive approach to estimating the ground-truth distribution (the distribution that corresponds to what a human brain might have produced).
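To make the counting argument concrete, here is a minimal sketch of such a naive estimator. The toy corpus, the whitespace tokenization, and the function name `naive_sequence_probability` are all assumptions made up for illustration; nothing here reflects how an actual language model is trained.

```python
from collections import Counter


def naive_sequence_probability(corpus, sequence):
    """Relative frequency of `sequence` among all same-length token windows in `corpus`.

    Hypothetical helper: a pure counting estimate with no smoothing and no generalization.
    """
    n = len(sequence)
    # Count every contiguous window of n tokens across all documents.
    counts = Counter(
        tuple(doc[i:i + n])
        for doc in corpus
        for i in range(len(doc) - n + 1)
    )
    total = sum(counts.values())
    return counts[tuple(sequence)] / total if total else 0.0


# Toy "training data" (made up for this example).
corpus = [
    "the knight visits every square".split(),
    "a hamiltonian path visits every vertex".split(),
]

seen = "visits every".split()
unseen = "the knight visits every vertex".split()

print(naive_sequence_probability(corpus, seen))    # nonzero: the window occurs in the corpus
print(naive_sequence_probability(corpus, unseen))  # 0.0: this exact window never occurs
```

Every token of the unseen sequence appears in the corpus, yet the exact sequence does not, so the counting estimate is exactly zero. Any model that assigns it a nonzero probability has to be generalizing beyond raw sequence counts.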