pton_xd 8 hours ago

I think the point is it smells like a hack, just like "think extra hard and I'll tip you $200" was a few years ago. It increases benchmarks a few points now but what's the point in standardizing all this if it'll be obsolete next year?

mbesto 7 hours ago

I think this tweet sums it up correctly, doesn't it?

   A +6 jump on a 0.6B model is actually more impressive than a +2 jump on a 100B model. It proves that 'intelligence' isn't just parameter count; it is context relevance. You are proving that a lightweight model with a cheat sheet beats a giant with amnesia. This is the death of the 'bigger is better' dogma
Which is essentially the bitter lesson that Richard Sutton talks about?
Der_Einzige 3 hours ago

Nice ChatGPT-generated response in that tweet. Anyone too lazy to deslop their tweet shouldn't be listened to.

9dev 7 hours ago

Standards have to start somewhere if they're going to gain traction and outlast the current moment.

Plus, as has been mentioned multiple times here, standard skills are much more about letting different harnesses load skills into the context window consistently and programmatically. Not every AI workload is a local coding agent.
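
For what it's worth, the "programmatic loading" point is easy to picture: a harness reads skill files, matches their metadata against the task at hand, and injects only the relevant ones into the prompt. Here's a minimal sketch; the SKILL.md-style frontmatter layout and the naive keyword matching are my own assumptions for illustration, not any particular spec:

```python
# Hypothetical skill file layout: "---"-delimited frontmatter
# (name/description) followed by the skill's instructions.

def parse_skill(text: str) -> dict:
    """Split frontmatter metadata from the skill body."""
    lines = text.splitlines()
    assert lines[0] == "---", "expected frontmatter delimiter"
    end = lines.index("---", 1)
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    return {"meta": meta, "body": "\n".join(lines[end + 1:]).strip()}

def load_into_context(skills: list[dict], query: str) -> str:
    """Inject only the skills whose description overlaps the query --
    the 'cheat sheet' a small model gets instead of more parameters."""
    words = query.lower().split()
    relevant = [s for s in skills
                if any(w in s["meta"]["description"].lower() for w in words)]
    return "\n\n".join(f"## Skill: {s['meta']['name']}\n{s['body']}"
                       for s in relevant)

skill_md = """---
name: csv-cleanup
description: normalize csv headers and types
---
When handed a CSV, lowercase headers and strip whitespace first."""

skill = parse_skill(skill_md)
context = load_into_context([skill], "clean this csv file")
```

The point of standardizing the format is exactly so that step one (parsing) is identical across harnesses, while step two (selection and injection) is free to differ per product.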