purple-leafy 4 hours ago
Benchmarks are great, but I feel like there’s a better way; this seems quite subjective. What you really need is an objective benchmark.
eli 4 hours ago
I actually really like subjective benchmarks, so long as it's a human (ideally me) grading the results. LLM as judge never made much sense.
echelon 4 hours ago
> What you really need is an objective benchmark

"When are all the software engineers unemployed?"