Remix clone Hacker News
Why LLM-as-judge fails for code evaluation. Here's what works. (navigara.medium.com)
2 points by alienll 10 hours ago