DGoettlich 2 hours ago

Also one of our fears. What we've done so far is drop docs where the data source was doubtful about the date of publication; if there are multiple possible dates, we take the latest to be conservative. During training, we validate that the model learns pre- but not post-cutoff facts. https://github.com/DGoettlich/history-llms/blob/main/ranke-4...
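
To make the filtering rule concrete, here is a rough sketch. It is not the code from the repo: the `Doc` fields, `effective_date`, and `filter_pre_cutoff` are hypothetical names, the example cutoff is made up, and treating the cutoff as a strict "before" comparison is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class Doc:
    # Hypothetical record layout, for illustration only.
    text: str
    # Every publication date the data source considers possible.
    candidate_dates: List[date] = field(default_factory=list)
    # True when the data source itself flags the date as doubtful.
    date_uncertain: bool = False


def effective_date(doc: Doc) -> Optional[date]:
    """Return the date used for filtering, or None if the doc should be dropped."""
    if doc.date_uncertain or not doc.candidate_dates:
        return None
    # Conservative choice: with several possible dates, assume the latest one.
    return max(doc.candidate_dates)


def filter_pre_cutoff(docs: List[Doc], cutoff: date) -> List[Doc]:
    """Keep only docs whose effective date is strictly before the cutoff."""
    kept = []
    for doc in docs:
        d = effective_date(doc)
        if d is not None and d < cutoff:
            kept.append(doc)
    return kept


if __name__ == "__main__":
    cutoff = date(1914, 1, 1)  # made-up example cutoff
    docs = [
        Doc("a", [date(1900, 5, 1)]),
        Doc("b", [date(1905, 1, 1), date(1920, 1, 1)]),   # latest date is post-cutoff
        Doc("c", [date(1890, 1, 1)], date_uncertain=True),  # doubtful date
    ]
    print([d.text for d in filter_pre_cutoff(docs, cutoff)])
```

Taking the latest candidate date means a doc with any possible post-cutoff date is excluded, which trades some pre-cutoff data for a lower risk of leakage.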

If you have other ideas or think that's not enough, I'd be curious to know! (history-llms@econ.uzh.ch)