lazarus01 2 days ago

In response to your direct question, see: https://gail.wharton.upenn.edu/research-and-insights/tech-re...

“This indicates that while CoT can improve performance on difficult questions, it can also introduce variability that causes errors on “easy” questions the model would otherwise answer correctly.”

As for the strawberry example: there are 25,000 people employed globally who repair broken responses and create training data, a big whack-a-mole effort to remediate embarrassing errors.

CamperBob2 2 days ago

(Shrug) Ancient models are ancient. Please provide specific examples that back up your point, not obsolete .PDFs to comb through.