Remix clone Hacker News


	▲	dakolli 3 hours ago
		Reminder: the only benchmark that really matters is the one that measures a model's ability to do real-world tasks that someone would pay for on Upwork and that would take a human ~12 hrs. The best model has a < 5% pass rate. These are incredibly simple jobs that you wouldn't pay much for, and these things fail miserably at them. Stop falling for this dumb marketing; these things are legitimately useless in the real world unless you love mediocrity and have no standards. https://labs.scale.com/leaderboard/rli Stop frying your brain with these useless tools and reducing your output to the mean. You people are betting your competency on the quality and quantity of tokens you'll have access to, which, guess what, will be the same as everyone else's. There are handmade watchmakers in Switzerland, and mass manufacturers of watches in Asia. Who is more valuable as an individual: the guy who knows how to push the buttons on a conveyor belt in Vietnam, or the guy who makes one watch a month in Switzerland? Your vibe-coded slop isn't impressive either, sorry. None of it.
	▲	jhatemyjob an hour ago
		I agree with your sentiment, but I think a fairer comparison would be: > Who is more valuable as an individual, the owner of a watch factory in Vietnam or the guy who makes one watch a month in Switzerland? With that framing, I'm not sure what the answer is. I suppose it depends on your priorities.