Centigonal | 4 days ago
LLMs produce a probability distribution over what the next token might be. The actual word that gets printed is then picked from that distribution using a sampling approach[1]. If your sampling approach is "select the next word randomly from among the top 4 possibilities" and you flip a > sign, you end up selecting from among the 4 *least* likely possibilities instead, which could produce the behavior described in the OP.

[1] Here is an example of two common approaches: https://www.reddit.com/r/AIDungeon/comments/1eppgyq/can_some...
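To make that concrete, here's a minimal top-k sampler sketch in Python (the function and variable names are mine, not any real library's API). Flipping one comparison turns "keep the k most likely tokens" into "keep the k least likely tokens", and the tail of the distribution is where the gibberish lives:

    import random

    def top_k_sample(token_probs, k=4, flipped=False):
        # token_probs: list of (token, probability) pairs.
        # Sorting ascending instead of descending is the moral
        # equivalent of a flipped > in a comparator.
        ordered = sorted(token_probs, key=lambda tp: tp[1], reverse=not flipped)
        candidates = ordered[:k]  # the top k -- or, when flipped, the bottom k
        tokens = [t for t, _ in candidates]
        weights = [p for _, p in candidates]
        return random.choices(tokens, weights=weights, k=1)[0]

    probs = [("the", 0.45), ("a", 0.25), ("this", 0.15),
             ("that", 0.10), ("aardvark", 0.04), ("zygote", 0.01)]
    print(top_k_sample(probs))                # usually "the" or "a"
    print(top_k_sample(probs, flipped=True))  # samples from the junk tail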
jjmarr | 4 days ago
The next word can also be selected with weighted randomization, and "temperature" controls how much weight lower-probability tokens get.

I've honestly gotten the best results in creative writing by ignoring top_k/top_p and simply tuning temperature. Restricting the output to only common words makes everything feel generic. But DeepSeek constantly breaks into Chinese/gibberish/ZALGO! when I push the temperature to 1.14.

This isn't related to the "recent issues", but I feel like it's useful advice for anyone trying out AI story creation.
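For anyone curious what tuning temperature actually does, here's a rough sketch of the standard formulation (the token scores below are made up for illustration). The logits get divided by T before the softmax, so T > 1 flattens the distribution and gives tail tokens a real chance of being drawn, which is how a high setting can tip over into another language or gibberish:

    import math, random

    def sample_with_temperature(logits, temperature=1.0):
        # logits: dict mapping token -> raw model score
        scaled = {tok: score / temperature for tok, score in logits.items()}
        m = max(scaled.values())  # subtract the max for numerical stability
        exps = {tok: math.exp(s - m) for tok, s in scaled.items()}
        total = sum(exps.values())
        tokens = list(exps)
        weights = [exps[t] / total for t in tokens]
        return random.choices(tokens, weights=weights, k=1)[0]

    logits = {"hello": 4.0, "there": 2.5, "world": 1.0,
              "你好": -2.0}  # a Chinese token sitting in the tail
    print(sample_with_temperature(logits, temperature=0.7))   # almost always "hello"
    print(sample_with_temperature(logits, temperature=1.14))  # tail tokens show up more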