jmuguy 2 hours ago:
Are you arguing that the output of an LLM isn’t random?

mpyne 2 hours ago (parent):
It is random if you configure it to be (temperature != 0, etc.). It is not random if you don't use random sampling when generating the output. If the whole thing were actually stochastic, prompt caching would be impossible: a cache of what the previous tokens transformed into, kept around to speed up future generation, would be invalidated by the missing random state. Look at llama.cpp: you can see which samplers are adjustable, and for the samplers that employ randomness you can see which settings disable the random sampling. Or you can keep the randomness but fix the seed to get reproducible results.
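To make the point concrete, here is a minimal sketch (plain Python with numpy, not llama.cpp's actual code) of the final sampling step. The logits the model produces for a given prefix are deterministic; randomness only enters if you choose to sample from them instead of taking the argmax, and even then a fixed seed makes the choice reproducible:

    import numpy as np

    def pick_token(logits, temperature=0.0, seed=None):
        logits = np.asarray(logits, dtype=np.float64)
        if temperature == 0.0:
            # Greedy decoding: no RNG involved, same prefix -> same token.
            return int(np.argmax(logits))
        # Temperature sampling: random, but reproducible with a fixed seed.
        probs = np.exp(logits / temperature)
        probs /= probs.sum()
        rng = np.random.default_rng(seed)
        return int(rng.choice(len(probs), p=probs))

    logits = [2.0, 1.0, 0.1]
    print(pick_token(logits))                            # always token 0
    print(pick_token(logits, temperature=0.8, seed=42))  # same result every run
    print(pick_token(logits, temperature=0.8))           # varies run to run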