fancyfredbot 2 hours ago
Wow, that is terrible. In my memory GPT-2 was more interesting than that. I remember thinking it could pass a Turing test, but that output is barely better than a Markov chain. I guess I was using the large model?
sillysaurusx 27 minutes ago
There’s an art to GPT sampling. You have to use temperature 0.7. People never believe it makes such a massive difference, but it does.
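Roughly, sampling code divides the logits by the temperature before the softmax, so T < 1 sharpens the next-token distribution and T > 1 flattens it. A minimal sketch in plain numpy, not any particular library's API (the toy logits are made up):

    import numpy as np

    def sample_with_temperature(logits, temperature=0.7, rng=None):
        # Divide logits by T before softmax: T < 1 concentrates
        # probability on the top tokens, T > 1 spreads it out.
        rng = rng or np.random.default_rng()
        scaled = np.asarray(logits, dtype=np.float64) / temperature
        scaled -= scaled.max()          # numerical stability
        probs = np.exp(scaled)
        probs /= probs.sum()
        return int(rng.choice(len(probs), p=probs))

    # Toy next-token distribution over a 3-token vocabulary.
    print(sample_with_temperature([2.0, 1.0, 0.1], temperature=0.7))

With Hugging Face transformers the same knob is generate(..., do_sample=True, temperature=0.7).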
wat10000 an hour ago
Probably a much better prompt, too. I just literally pasted in the top part of my comment and let fly to see what would happen.
daveguy 2 hours ago
Here is the XL model, the largest GPT-2 (about 4x the size of the medium model). Still just 1.5B parameters, but on the bright side it was trained pre-wordslop.
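For anyone who wants to reproduce this, a minimal sketch using the Hugging Face transformers pipeline with the gpt2-xl checkpoint; the prompt and generation settings here are illustrative, not what the parent used:

    from transformers import pipeline

    # GPT-2 XL, the largest public GPT-2 checkpoint (1.5B parameters).
    generator = pipeline("text-generation", model="gpt2-xl")

    out = generator(
        "In my memory GPT-2 was more interesting than that.",
        max_new_tokens=80,
        do_sample=True,
        temperature=0.7,  # the sampling temperature discussed upthread
    )
    print(out[0]["generated_text"])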