trouve_search 4 hours ago

A lot of benchmarks are set up to punish false negatives (missing the snippet being looked for) but not false positives (irrelevant answers or extra text).

If you benchmaxx on those, you end up with answer bloat and/or hallucination, since padding the answer costs nothing and can only help the score.
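
A rough sketch of what I mean, not taken from any particular benchmark: the function names and the toy token lists are made up for illustration. A recall-only scorer cannot tell a concise answer from a padded one, while something precision-aware like F1 can.

```python
def recall_only_score(answer_tokens, gold_tokens):
    """Fraction of gold tokens found in the answer. Extra tokens cost nothing."""
    gold = set(gold_tokens)
    if not gold:
        return 0.0
    return len(gold & set(answer_tokens)) / len(gold)


def f1_score(answer_tokens, gold_tokens):
    """Harmonic mean of precision and recall. Extra tokens lower precision."""
    answer, gold = set(answer_tokens), set(gold_tokens)
    true_positives = len(answer & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(answer)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical example data: one gold snippet, one tight answer, one padded answer.
gold = ["paris"]
concise = ["paris"]
bloated = ["paris", "london", "berlin", "rome"]

print(recall_only_score(concise, gold), recall_only_score(bloated, gold))
print(f1_score(concise, gold), f1_score(bloated, gold))
```

Under the recall-only scorer both answers get the same score, so there is no pressure against the padded one. Under F1 the padded answer drops because three of its four tokens are wrong.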