mdp2021 4 days ago
But when an LLM fails despite having all the time in the world, you can be fairly certain it has hit a wall. So, in a way, you have defined a good indicator of a limit in a certain area.
sigmoid10 4 days ago
There is not enough sampling here to reach this conclusion. Remember, you can crank things like o3 pretty high on tasks like ARC-AGI if you're willing to spend thousands of dollars on inference-time compute. But that's obviously not in the budget for an enthusiast site like this.