Thats because humans have "understanding" they can use to assess quality, without understanding "trying harder" just means spending more "effort" distilling an average result, at best over a larger sample size.