> Humanity did exactly that though
No, it mostly didn't. It continued (and continues, since every human is constantly interleaving “training” and “inference”) to train on large volumes of ground truth over a very long time, including both natural and synthetic data; it didn't reason its way to everything from just some basic training on first principles.
At a minimum, something that looks broadly like one of today's AI models would need either a method of continuously finetuning its own weights with a suitable evaluation function or, if it were going to rely on in-context learning, a context many orders of magnitude larger than any model has today.
And that's not a “this is enough to likely work” bar, but a “this is the minimum for there to even be a plausible mechanism to incorporate the information necessary for it to work” one.
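For concreteness, the first option amounts to a loop roughly like the sketch below (purely illustrative; `TinyModel`, `evaluate`, and the data stream are made up for the example, and the “evaluation function” here is just a trivial held-out loss, which is exactly the part nobody knows how to do well for a real model):

```python
import copy
import torch
import torch.nn as nn

class TinyModel(nn.Module):
    # Stand-in for "something that looks broadly like one of today's AI models".
    def __init__(self, dim=16):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, dim))

    def forward(self, x):
        return self.net(x)

def evaluate(model, held_out):
    # The "suitable evaluation function": did the last update actually help?
    x, y = held_out
    with torch.no_grad():
        return nn.functional.mse_loss(model(x), y).item()

def continual_loop(model, experience_stream, held_out, lr=1e-3):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.MSELoss()
    for x, y in experience_stream:
        before = evaluate(model, held_out)
        snapshot = copy.deepcopy(model.state_dict())
        opt.zero_grad()
        loss_fn(model(x), y).backward()        # interlace "inference" with "training"
        opt.step()
        if evaluate(model, held_out) > before:
            model.load_state_dict(snapshot)    # reject updates the evaluation says hurt

if __name__ == "__main__":
    torch.manual_seed(0)
    model = TinyModel()
    held_out = (torch.randn(64, 16), torch.randn(64, 16))
    stream = [(torch.randn(8, 16), torch.randn(8, 16)) for _ in range(100)]
    continual_loop(model, stream, held_out)
```

Even this toy version makes the gap obvious: the hard part isn't the weight update, it's having an evaluation signal good enough to decide which updates to keep.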