Respectfully, in my experience, after consuming a few billion tokens, some open-source models really are strong and useful. Specifically, StepFun's Step-3.5-Flash: https://github.com/stepfun-ai/Step-3.5-Flash

I'm working on a pretty complex Rust codebase right now, with hundreds of integration tests and nontrivial concurrency, and stepfun powers through.

I have no relation to StepFun; I'm saying this purely out of deep respect for the team that managed to pack this performance into a 196B-total/11B-active parameter envelope.

aappleby 2 hours ago

What are you running that model on?

FuckButtons 37 minutes ago

A 3-bit quant will run on a 128 GB MacBook Pro, and it works pretty well.
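
For a rough sense of why that fits, here is a back-of-the-envelope sketch of the weight footprint at different bit widths. It assumes the 196B total parameter count quoted upthread and counts weights only; real quant formats add per-block scale metadata, and the KV cache and runtime need memory on top, so treat the numbers as a lower bound. The helper name `weights_gb` is made up for illustration.

```python
def weights_gb(total_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in decimal gigabytes (weights only)."""
    # bits -> bytes (divide by 8), then bytes -> GB (divide by 1e9)
    return total_params * bits_per_weight / 8 / 1e9


TOTAL_PARAMS = 196e9  # total parameter count quoted upthread (assumption)

for bits in (3, 4, 8, 16):
    print(f"{bits:>2}-bit: ~{weights_gb(TOTAL_PARAMS, bits):.1f} GB")
```

At 3 bits that is roughly 73.5 GB of weights, which leaves some headroom on a 128 GB machine. 4 bits is about 98 GB, and 8 bits no longer fits.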

	nl 32 minutes ago
		A 3-bit quant is quite a lot weaker than the OpenRouter version the OP is using.

kir-gadjello 2 hours ago

I just use OpenRouter; it's free for now. But I would pay $30-100 to use it 24/7.

	aappleby 2 hours ago
		Ah, I thought you meant you were running it locally.