Why LLM-as-judge fails for code evaluation. Here's what works. (navigara.medium.com)
2 points by alienll 10 hours ago