dpoloncsak a day ago

Yeah, on one of my first projects, one of my buddies asked, "Why aren't you using [ChatGPT 4.0] nano? It's 99% of the effectiveness at 10% of the price."

I've been using the smaller models ever since. Nano/mini, flash, etc.

sixtyj a day ago | parent | next [-]

Yup.

I found out recently that Grok-4.1-fast has similar pricing (in cents) but a 10x larger context window (2M tokens versus the ~128-200k of gpt-4.1-nano), and a ~4% hallucination rate, the lowest in blind tests on LLM Arena.
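The "10% of the price" trade-off above is just token arithmetic. A minimal sketch, where the per-million-token prices and token counts are placeholders for illustration, not real rates:

```python
def request_cost(input_tokens: int, output_tokens: int,
                 price_in_per_m: float, price_out_per_m: float) -> float:
    """Dollar cost of one request, given per-million-token prices."""
    return (input_tokens * price_in_per_m + output_tokens * price_out_per_m) / 1_000_000

# Hypothetical rates: a "full" model at 10x the price of its nano tier.
full = request_cost(5_000, 1_000, price_in_per_m=2.00, price_out_per_m=8.00)
nano = request_cost(5_000, 1_000, price_in_per_m=0.20, price_out_per_m=0.80)
print(f"full: ${full:.4f}  nano: ${nano:.4f}  ratio: {nano / full:.0%}")
# → full: $0.0180  nano: $0.0018  ratio: 10%
```

At 10x cheaper per token, the small model only has to clear your quality bar, not match the big one.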

verdverm a day ago | parent [-]

[flagged]

rudhdb773b a day ago | parent [-]

Grok is the best general purpose LLM in my experience. Only Gemini is comparable. It would be silly to ignore it, and xAI is less evil than Google these days.

verdverm a day ago | parent [-]

[flagged]

phainopepla2 a day ago | parent | prev | next [-]

I have been benchmarking many of my use cases, and the GPT Nano models have fallen completely flat on every single one except very short summaries. I would call them 25% effectiveness at best.
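Benchmarking "effectiveness" per use case can be as simple as a pass rate over task-specific checks. A minimal sketch; the `fake_nano` stand-in (which just truncates its input) and the cases are made up for illustration, not any real model or API:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Case:
    name: str
    prompt: str
    check: Callable[[str], bool]  # did the model's output pass?

def score(model: Callable[[str], str], cases: list[Case]) -> float:
    """Fraction of cases whose output passes its check."""
    return sum(1 for c in cases if c.check(model(c.prompt))) / len(cases)

# Stand-in "model": returns only the first 50 characters, like a model
# that handles short summaries but drops details from longer tasks.
def fake_nano(prompt: str) -> str:
    return prompt[:50]

cases = [
    Case("short-summary", "Summarize: cats sleep a lot.",
         lambda out: "cats" in out),
    Case("extraction",
         "Extract the founding year from this text: the company was founded in 1998.",
         lambda out: "1998" in out),
]
print(f"pass rate: {score(fake_nano, cases):.0%}")  # → pass rate: 50%
```

Running each candidate model over the same cases gives you a comparable number per tier instead of a vibe.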

verdverm a day ago | parent [-]

Flash is not a small model; it's still over 1T parameters. It's a sparse MoE, as I understand it.

I have yet to go back to small models. I'm waiting on an upstream feature, and my GPU provider has been seeing capacity issues, so I am sticking with the Gemini family for now.

walthamstow a day ago | parent | prev [-]

Flash Lite 2.5 is an unbelievably good model for the price