Remix clone Hacker News


	▲	moritz 4 hours ago
		https://arxiv.org/abs/2508.09101 In this benchmark, models correctly solve Rust problems 61% of the time on the first pass, a far cry from other languages such as C# (88%) or Elixir (a “buggy dynamic language”), where they perform best (97%). I wonder why that is; it’s quite surprising. Obviously the details of their benchmark design matter, but this study doesn’t support your claims.
	▲	squeegmeister 3 hours ago
		This is great, but Aug 2025 is almost a lifetime ago given how fast these models are improving. Opus 4.5 came out in November 2025, fwiw.