	▲	the_harpia_io 3 hours ago
		honestly the harness thing is way more important than people realize. I've been working on code security tools, and the gap between what a model generates raw vs. with better structure is massive, way bigger than the difference between model versions. take the security bugs I see in AI code: half of them are just because the prompt didn't include enough context or the edit format was wonky. the benchmark overselling isn't the point though. the point is that we're barely using these things right. most people still chat with them like it's 2023. what happens when you combine this with actual review flows, not just 'beat swe-bench'? idk, I think everyone's too focused on the model when tooling matters more, since that's something you can actually control