Remix.run Logo
ngrislain 4 days ago

Thank you! The number is the the sum of the logprobs from the token constituting the individual values. So it does represent the likelihood of seeing this value. So yes OpenAI is super-biased as a random number generator. We sampled other values from OpenAI and got other die roll values, but with much lower probs (5 has 8% chances ).

ngrislain 4 days ago | parent [-]

More precisely it represents the likelihood of seeing this value conditional on the tokens before it.

elcritch 3 days ago | parent | next [-]

Even without other tokens before it the LLM is probably showing the probability of dice rolls based on its training data. I’d guess humans tend to prefer “3” or “4” as it’s nearer the avg/median and feels fairer.

AFAICT, the LLMs aren’t creating new mental mappings of “dice are a symmetric and should give equal probability to land on any side followed by using that info to infer they should use a RNG.”

radarsat1 3 days ago | parent | prev [-]

and i guess includes other possibilities than numbers, like 'f' which could lead to four or five. There's probably a separate probability for 'fi' and 'fo' too.