| ▲ | sajithdilshan 2 hours ago | |
It would have been more interesting to test LLMs on questions whose direct solutions are not publicly available (and so not in the training data). In that case, I wonder how much hallucination would occur, or whether the model would try to connect the dots from what's publicly available and come up with a direct solution. | ||
| ▲ | christianstump an hour ago | |
I don't understand why you expect that an answer known to the researcher, but never published, would be in the training data. You may be misunderstanding what these problems look like. We made them all publicly available on the website, so please have a look: https://math.sciencebench.ai/benchmarks/benchmarks-in-leipzi... | ||