Remix clone Hacker News

	▲	sync 4 days ago
		I'm doing coreference resolution and this model (w/o thinking) performs at the Gemini 2.5-Pro level (w/ thinking_budget set to -1) at a fraction of the cost.
	▲	antman 3 days ago
		Nice point. How did you test for coreference resolution? Specific prompt or dataset?
	▲	dr_dshiv 4 days ago
		Strong claim there!