| ▲ | verdverm a day ago |
| I'd second this wholeheartedly. Since building a custom agent setup to replace Copilot, adopting/adjusting the Claude Code prompts, and giving it basic tools, gemini-3-flash is my go-to model unless I know it's a big and involved task. The model is really good at 1/10 the cost of Pro, super fast by comparison, and some basic A/B testing shows little to no difference in output on the majority of tasks I use it for. Cut all my subs, spend less money, don't get rate limited. |
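| As a concrete illustration of that kind of setup, here is a minimal sketch using the google-genai Python SDK with one basic tool. The prompt file path is a hypothetical stand-in for the adapted Claude Code prompts, and "gemini-3-flash" is the model id as named above, so check availability before relying on it. |

    # Minimal agent sketch: one Gemini Flash call with a single basic tool.
    # Assumes the google-genai SDK and a GEMINI_API_KEY in the environment.
    from google import genai
    from google.genai import types

    def read_file(path: str) -> str:
        """Basic tool: return the contents of a file for the model to inspect."""
        with open(path) as f:
            return f.read()

    client = genai.Client()  # picks up GEMINI_API_KEY from the environment

    response = client.models.generate_content(
        model="gemini-3-flash",  # model id as named in the comment
        contents="Summarize what src/main.py does and flag anything suspicious.",
        config=types.GenerateContentConfig(
            # hypothetical path holding an adopted/adjusted Claude Code-style prompt
            system_instruction=open("prompts/system.md").read(),
            tools=[read_file],  # the Python SDK runs the function-call loop itself
        ),
    )
    print(response.text)

| One nice property of this shape: swapping "gemini-3-flash" for a pro-tier model is a one-line change, which makes the kind of A/B comparison described above cheap to run. |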
|
| ▲ | dpoloncsak a day ago | parent | next [-] |
| Yeah, on one of my first projects, one of my buddies asked "Why aren't you using [GPT-4.1] nano? It's 99% the effectiveness at 10% the price." I've been using the smaller models ever since. Nano/mini, flash, etc. |
| |
| ▲ | sixtyj a day ago | parent | next [-] | | Yup. I recently found out that Grok-4.1-fast has similar pricing (in cents) but a 10x larger context window (2M tokens instead of the ~128-200k of gpt-4.1-nano). And a ~4% hallucination rate, the lowest in blind tests on LMArena. | | |
| ▲ | verdverm a day ago | parent [-] | | [flagged] | | |
| ▲ | rudhdb773b a day ago | parent [-] | | Grok is the best general purpose LLM in my experience. Only Gemini is comparable. It would be silly to ignore it, and xAI is less evil than Google these days. | | |
|
| |
| ▲ | phainopepla2 a day ago | parent | prev | next [-] | | I have been benchmarking many of my use cases, and the GPT Nano models have fallen completely flat on every single one except very short summaries. I would call them 25% effective at best. | | |
| ▲ | verdverm a day ago | parent [-] | | Flash is not a small model; it's still over 1T parameters. It's a highly sparse MoE, AIUI. I have yet to go back to small models: I'm waiting on an upstream feature, and the GPU provider has been seeing capacity issues, so I am sticking with the Gemini family for now |
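| For what that distinction means in practice, here is a toy sparse mixture-of-experts layer in PyTorch; it is purely illustrative and not Gemini's actual architecture. Total parameters grow with the number of experts, but each token only runs through a top-k few. |

    # Toy sparse MoE layer: many experts exist, but each token activates only
    # top_k of them. Illustrative sketch only; not Gemini's actual design.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class SparseMoE(nn.Module):
        def __init__(self, dim=512, num_experts=64, top_k=2):
            super().__init__()
            self.experts = nn.ModuleList(
                [nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
                 for _ in range(num_experts)]
            )
            self.router = nn.Linear(dim, num_experts)  # learned gating network
            self.top_k = top_k

        def forward(self, x):  # x: (num_tokens, dim)
            weights, idx = self.router(x).topk(self.top_k, dim=-1)
            weights = F.softmax(weights, dim=-1)  # normalize over chosen experts
            out = torch.zeros_like(x)
            for k in range(self.top_k):
                for e, expert in enumerate(self.experts):
                    mask = idx[:, k] == e  # tokens routed to expert e in slot k
                    if mask.any():
                        out[mask] += weights[mask, k, None] * expert(x[mask])
            return out

| With 64 experts and top_k=2, only ~1/32 of the expert parameters are active per token, which is how a model can be over 1T parameters on paper yet fast and cheap to serve; parameter count is a poor proxy for inference cost. |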
| |
| ▲ | walthamstow a day ago | parent | prev [-] | | Flash Lite 2.5 is an unbelievably good model for the price |
|
|
| ▲ | r_lee a day ago | parent | prev | next [-] |
| Plus I've found that overall with "thinking" models, the benefit is more like extra working memory than an actual performance boost. It might even be worse, because if the model goes even slightly wrong in the "thinking" part, it'll then commit to that mistake in the actual response |
| |
| ▲ | verdverm a day ago | parent [-] | | For sure, the difference in the most recent model generations makes them far more useful for many daily tasks. This is the first gen with thinking as a significant mid-training focus, and it shows: gemini-3-flash stands well above gemini-2.5-pro |
|
|
| ▲ | PunchyHamster 3 hours ago | parent | prev | next [-] |
| The LLM bubble will burst the second investors figure out how much a well-managed local model can do |
|
| ▲ | dingnuts a day ago | parent | prev [-] |
| [dead] |