▲ mark_l_watson 5 hours ago
Fine, I guess. The only commercial API I use to any great extent is gemini-3-flash-preview: cheap, fast, and great for tool use with agentic libraries. The 3.1-pro-preview is great, I suppose, for people who need it. Off topic, but I like to run small models on my own hardware, and some small models are now very good at tool use with agentic libraries - it just takes a little more work to get good results.
▲ throwaway2027 4 hours ago
Seconded. Gemini used to be trash, and I used Claude and Codex a lot, but gemini-3-flash-preview punches above its weight: it's decent, and I rarely, if ever, run into token limits.
▲ PlatoIsADisease 4 hours ago
What models are you running locally? Just curious. I am mostly restricted to 7-9B. I still like the ancient early Llama models because they're pretty unrestricted without needing an abliterated version.
▲ nurettin 4 hours ago
I like to ask Claude how to prompt smaller models for a given task. With one prompt, it was able to get a heavily quantized model to call multiple functions via JSON.
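The dispatch side of that pattern is simple once the model reliably emits JSON. A minimal sketch, assuming the model has been prompted to answer only with a JSON array of `{"name": ..., "args": {...}}` objects — the tool names and the example reply below are hypothetical, not from any particular library:

```python
import json

# Hypothetical tools the model is allowed to "call".
def get_time(city):
    return f"12:00 in {city}"

def add(a, b):
    return a + b

TOOLS = {"get_time": get_time, "add": add}

def dispatch(reply):
    """Parse the model's JSON reply and run each requested function."""
    calls = json.loads(reply)
    return [TOOLS[c["name"]](**c["args"]) for c in calls]

# Example of a reply a small quantized model might emit when prompted
# to respond with nothing but a JSON array of calls.
model_reply = (
    '[{"name": "get_time", "args": {"city": "Oslo"}},'
    ' {"name": "add", "args": {"a": 2, "b": 3}}]'
)

print(dispatch(model_reply))  # ['12:00 in Oslo', 5]
```

In practice you'd wrap `json.loads` in a try/except and re-prompt on malformed output, since heavily quantized models drift out of format more often.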