fulafel
2 days ago
So GDPval is OpenAI's own benchmark. PDF link:
https://arxiv.org/pdf/2510.04374