The only thing missing is for the agents to publish and peer-review their research.

woadwarrior01 7 hours ago | parent | next [-]

The first half of this is already happening to a certain extent. I first noticed it in a submission[1] to Dimitris Papailiopoulos' AdderBoard[2], a code-golf competition for training the smallest transformer that can add two 10-digit numbers. Most submissions to it are fully AI-generated.

The report in the linked repo was generated by Claude Code.

[1]: https://github.com/rezabyt/digit-addition-491p

[2]: https://github.com/anadim/AdderBoard

karpathy 7 hours ago | parent | prev | next [-]

Cool idea!…

karpathy 6 hours ago | parent [-]

So I think it works to just use GitHub CLI and Discussions, e.g. my agent just posted this one:

https://github.com/karpathy/autoresearch/discussions/32

Other agents could be instructed to read Discussions and post their own reports that mimic the style.
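A minimal sketch of the "read Discussions" half, assuming the agent shells out to `gh api graphql` and that the repo has Discussions enabled. The helper names (`gh_graphql_args`, `list_discussions`) are made up for illustration and are not from the autoresearch repo.

```python
import json
import subprocess

# GraphQL query for the most recent discussions in a repository.
LIST_QUERY = """
query($owner: String!, $name: String!, $first: Int!) {
  repository(owner: $owner, name: $name) {
    discussions(first: $first) {
      nodes { number title url body }
    }
  }
}
"""


def gh_graphql_args(query, variables):
    """Build the argv for `gh api graphql` (hypothetical helper).

    `-f` sends a raw string field, `-F` sends a typed field, so
    integers go through `-F` to arrive as GraphQL Int values.
    """
    args = ["gh", "api", "graphql", "-f", f"query={query}"]
    for key, value in variables.items():
        flag = "-F" if isinstance(value, int) else "-f"
        args += [flag, f"{key}={value}"]
    return args


def list_discussions(owner, name, first=10):
    """Return a list of discussion dicts; needs an authenticated `gh`."""
    variables = {"owner": owner, "name": name, "first": first}
    result = subprocess.run(
        gh_graphql_args(LIST_QUERY, variables),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["data"]["repository"]["discussions"]["nodes"]
```

An agent could call `list_discussions("karpathy", "autoresearch")` and feed the bodies into its context as style examples. Posting a report would go through the GraphQL `createDiscussion` mutation, which also needs the repository ID and a category ID, so that part is left out here.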

vessenes 6 hours ago | parent [-]

I have mine reading yours right now. Unfortunately(?) I mentioned LeCun to it, and it says it's adding a "causal world-state mixer" to nanograd; not sure how this will work out, but it wasn't nervous about doing it. Model: GPT 5.4 xhigh.

EDIT: Not a good fit for nanograd. But my agent speculates that's because it spent so much more time on compute.

ting0 9 hours ago | parent | prev [-]

That's a great idea.

whattheheckheck 8 hours ago | parent [-]

Then you get a statistical mess of crap that takes more energy to dive into and refute...

laichzeit0 5 hours ago | parent [-]

Well, not if you have AI reviewers… It’s LLMs all the way down.