▲ | energy123 4 days ago | |
Basic prose is a saturated bench. You can't go above 100% so by definition progress will stall on such benchmarks. | ||
▲ | mattw1810 4 days ago | parent [-] | |
All the same they choose to highlight basic prose (and internal knowledge, for that matter) in their marketing material. They’ve achieved a lot to make recent models more reliable as a building block & more capable of things like math, but for LLMs, saturating prose is to a degree equivalent to saturating usefulness. |