mpesce 12 hours ago
AI evals are becoming intractable for the same reasons that measuring human general intelligence has been intractable for 150 years: the constructs resist decomposition, the benchmarks saturate, and the goalposts move. The reasons are the same because the thing being measured is the same. The difficulty is the evidence.