armcat · 4 hours ago
This is awful but hardly surprising. Someone mentioned reproducible code with the papers — but there is a high likelihood of the code being partially or fully AI generated as well, i.e. AI generates the hypothesis → AI produces code to implement and test the hypothesis → AI generates the paper based on the hypothesis and the code. Also: there were 15,000 submissions rejected at NeurIPS; it would be very interesting to see what % of those rejected were partially or fully AI generated/hallucinated. Are the ratios comparable?
blackbear_ · 3 hours ago | parent
Whether the code is AI generated or not is not important; what matters is that it really works. Sharing code enables others to validate the method on a different dataset. Even before LLMs came around, there were lots of methods that looked good on paper but turned out not to work outside of the accepted benchmarks.