| ▲ | categoricalrift 4 hours ago | |
How about the very last "Kept Improvement" in the plot? It's titled "random seed 42 -> 137". I do think this project is quite conceptually interesting, but the model literally choosing a different random seed to achieve lower loss feels pretty far removed from the flowery sci-fi writing at the top of the readme. | ||
| ▲ | eternauta3k 2 hours ago | |
It shows that both Karpathy and the LLM have good taste in random seeds: the answer to life, the universe and everything, and ~1/(the fine structure constant) | ||
| ▲ | aix1 3 hours ago | |
The 42 -> 137 also jumped out at me. On the face of it, the associated improvement sure does sound like overfitting to the eval set. | ||