pelario 6 hours ago
From the article: "As temperature approaches zero from the negative side, the model output will again be deterministic — but this time, the least likely tokens will be output." I take this to mean that a negative temperature far from zero is also quite random (just with a distribution that favors unlikely tokens).
-_- 4 hours ago
Yep! Very large negative temperatures and very large positive temperatures have essentially the same distribution. This is clearer if you consider thermodynamic beta, where T = ±∞ corresponds to β = 0.
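
To make the β = 0 point concrete, here is a minimal sketch that samples nothing and just prints the token distribution at a few temperatures. It assumes the usual convention that probabilities are softmax(logits / T); the function name `softmax_with_temperature` and the three logits are made up for illustration and do not come from the article.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Return softmax(logits / temperature).

    temperature may be negative but must not be zero.
    """
    beta = 1.0 / temperature  # thermodynamic beta; beta -> 0 as T -> +inf or -inf
    scaled = [beta * x for x in logits]
    m = max(scaled)  # subtract the max before exponentiating, for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits, ordered from most likely to least likely token.
logits = [2.0, 1.0, 0.0]

for t in (1e6, -1e6, 0.01, -0.01):
    probs = softmax_with_temperature(logits, t)
    print(f"T = {t:>10}: {[round(p, 4) for p in probs]}")
```

With these made-up logits, T = 1e6 and T = -1e6 both come out close to uniform, T = 0.01 puts nearly all the mass on the first (most likely) token, and T = -0.01 puts nearly all of it on the last (least likely) one.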