VibeBench: Measuring 1k Engineers' Opinions of New Models (vibebench.standardagents.ai)
12 points by jpschroeder 2 days ago | 4 comments
mhi3 2 days ago
"Published benchmarks are gamed, optimized, and overfit, and no longer yield a useful signal." Is this true? But I love this concept!
ramon156 a day ago
Love the idea! Page is incredibly slow on mobile, probably the avatars
memoryleakgame 2 days ago
800 commits in a year...