reactordev 6 hours ago
Turn down the temperature and you’ll see fewer “simpler” shortcuts.
smokel 5 hours ago
For the uninitiated: interestingly, it is not advisable to take this to the extreme and set the temperature to 0. That might seem logical, since the model then always picks the most likely token and the output is, in principle, deterministic. It turns out, though, that occasionally picking a less likely token can lead to a better answer in the long run. Allowing a little bit of noise also gives the model room to talk itself out of a suboptimal path.
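
To make the temperature knob concrete, here is a rough sketch of how it is commonly applied when picking the next token: the logits are divided by the temperature before the softmax, and temperature 0 is treated as plain argmax. The names `sample_token` and `token_probabilities` are made up for illustration, and real inference stacks differ in the details (top-p, top-k, and so on), so read this as an assumption-laden toy rather than how any particular model does it.

```python
import math
import random


def token_probabilities(logits, temperature):
    """Softmax over logits divided by a positive temperature (illustrative helper)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature, rng=random):
    """Return the index of the chosen token (hypothetical helper, not a library API)."""
    if temperature <= 0:
        # Assumption: temperature 0 means greedy decoding, i.e. always the top token.
        return max(range(len(logits)), key=lambda i: logits[i])
    probs = token_probabilities(logits, temperature)
    # Weighted random draw: lower temperature concentrates mass on the top token.
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


if __name__ == "__main__":
    logits = [1.0, 3.0, 2.0]  # made-up scores for three candidate tokens
    for t in (0.2, 0.7, 1.5):
        print(t, [round(p, 3) for p in token_probabilities(logits, t)])
    print("greedy pick:", sample_token(logits, 0))
```

At a low but nonzero temperature the top token still dominates, yet the runner-up keeps a small chance of being picked, which is the "little bit of noise" described above.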