ehnto a day ago

I think you might be misunderstanding the article, actually: it's about AI solving tasks as measured by how long it takes a human to solve the same task. The AI could potentially solve it much faster, but using "human time to solve" is an attempt to create a metric that reveals long-horizon complexity (as I understand it, anyway).

It's interesting because, as the article notes, AI is really smashing benchmarks, but actual usefulness in automating thought work is proving much more elusive. I think that collective experience of AI just not being that useful, or not as useful as the benchmarks suggest it should be, is captured in this metric.

rishabhaiover 14 hours ago | parent

I've maintained a healthy skepticism of the recent boom, but I can't see why the long-horizon time wouldn't stretch to 8 hours or a week's worth of effort by next year. After Opus-4.5, governments and organizations should really figure out a path out of this storm, because we're in it now.

theptip 5 hours ago | parent

Doubling time has been 7 months for a while, so you should expect 8 hours next year, not 1 week.
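
A rough back-of-the-envelope, assuming the current horizon is around 2 hours (an assumed figure for illustration, not one stated in this thread):

    # Sketch: extrapolate the task horizon under a constant doubling time.
    # The ~2-hour starting horizon is an assumption for illustration only.
    def horizon_after(months, start_hours=2.0, doubling_months=7.0):
        """Horizon in hours after `months`, doubling every `doubling_months`."""
        return start_hours * 2 ** (months / doubling_months)

    print(round(horizon_after(12), 1))  # ~6.6 h: closer to 8 h than to a 40 h work week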

rishabhaiover an hour ago | parent | next

Predictions over historical data in a landscape with fragile priors don't seem like a strong metric to me (they're a useful approximation at best).

dwohnitmok 2 hours ago | parent | prev

The doubling time has significantly accelerated to 4 months since the beginning of 2025, which puts 1 week within reach if things stay on trend. But yes, 7 months is the more reliable long-term trend.
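
Same rough arithmetic with a 4-month doubling time (again taking a ~2-hour starting horizon as an assumption, not a figure from this thread):

    # With a 4-month doubling time, 12 months is 12/4 = 3 doublings, an 8x factor.
    start_hours = 2.0                    # assumed current horizon, illustration only
    print(start_hours * 2 ** (12 / 4))   # 16.0 h: a couple of working days, so a 40 h week needs the trend to hold a bit longer or a higher starting point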