| ▲ | eru 7 hours ago | |||||||
No. That's wrong. LLMs don't output the highest-probability token: they sample randomly from the predicted distribution. | ||||||||
| ▲ | storus 7 hours ago | parent | next [-] | |||||||
This was obviously a simplification, which holds at zero temperature. Top-p sampling will of course add some randomness, but the probability of an unexpected sequence goes asymptotically to zero pretty quickly as the sequence gets longer. | ||||||||
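To make the distinction concrete, here is a minimal sketch of the two decoding modes being discussed: greedy decoding at zero temperature versus temperature plus top-p (nucleus) sampling. The function names (`softmax`, `top_p_filter`, `sample_token`, `off_path_run_probability`) are hypothetical and only for illustration; real inference stacks differ in details. The last helper assumes, purely for illustration, that the top token has the same probability at every step and that steps are independent, which is not true of a real model.

```python
import math
import random


def softmax(logits, temperature):
    """Turn logits into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose mass reaches top_p."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    total = sum(probs[i] for i in kept)
    # Renormalize so the kept probabilities sum to 1.
    return {i: probs[i] / total for i in kept}


def sample_token(logits, temperature=1.0, top_p=1.0, rng=random):
    """Zero temperature means greedy argmax; otherwise sample from the nucleus."""
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    nucleus = top_p_filter(softmax(logits, temperature), top_p)
    ids = list(nucleus)
    return rng.choices(ids, weights=[nucleus[i] for i in ids], k=1)[0]


def off_path_run_probability(p_top, length):
    """Chance of `length` consecutive non-top tokens.

    Illustrative assumption: the same p_top at every step, independent steps.
    """
    return (1.0 - p_top) ** length
```

Under those simplifying assumptions, the chance of a run of non-top tokens shrinks geometrically with its length, which is one way to read the "goes to zero pretty quickly" claim. With temperature above zero, though, each individual token is still a random draw, so both comments describe real behavior in different settings.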