NiloCK 19 hours ago
I think the headline oversells this a little? The reported variance in Sonnet 4.6's estimates here is actually quite low, and in general terms it's not so bad across models. Damn paella. This does seem like a task well suited to a for-purpose training run against a bunch of labelled data. Is there any reason they wouldn't improve at it?