Remix clone Hacker News

	▲	shadeslayer_ 5 hours ago
		Do these benchmarks even add any value at this point? This one is basically Cursor saying that their model is as good as the frontier ones at a fraction of the price. The independent benchmarks are probably part of the training data now, and the models are pattern-matching against them all the time. The final test of a model (and the harness, probably) is how well it works FOR YOU. Since most of the models can handle most of our daily tasks, it boils down to which one has the least friction to use.