river_otter 10 hours ago
One thing from the podcast that jumped out at me was the statement that in pretraining "you don't have to think closely about the data". I guess the success of pretraining supports the point somewhat, but it feels slightly at odds with Karpathy talking about how a large percentage of pretraining data is complete garbage. I would hope that more work on cleaning the pretraining data would result in stronger and more coherent base models.