sync 4 days ago
I'm doing coreference resolution, and this model (w/o thinking) performs at the Gemini 2.5 Pro level (w/ thinking_budget set to -1) at a fraction of the cost.
antman 3 days ago
Nice point. How did you test for coreference resolution? Specific prompt or dataset?
dr_dshiv 4 days ago
Strong claim there! |