simonw 4 days ago

> On paper, this looks like a success. In practice, the time spent crafting a prompt, waiting for the AI to run and fixing the small issue that came up immensely exceeds the 10 minutes it would have taken me to edit the file myself. I don’t think coding that way would lead me to a massive performance improvement for now.

The models used in this experiment - deepseek-r1:8b, mistral:7b, qwen3:8b - are tiny. It's honestly a miracle that they produce anything that looks like working code at all!

I'm not surprised that the conclusion was that writing without LLM assistance would be more productive in this case.

giantrobot 4 days ago | parent | next [-]

Yeah, those small models can work for SillyTavern or some basic rubber-ducking, but they're not nearly large enough for coding. I've had no luck with serious coding work below 30B models, though I've found 13B models to be not terrible for boilerplate and code completion. 8B seems way too dumb for the task.

troyvit 3 days ago | parent | prev | next [-]

Weird how this story came out a ~few hours later~ at about the same time: https://news.ycombinator.com/item?id=44723316

That isn't an open source model, but a quantized version of GLM-4.5, an open-weight model. I'd say there's hope yet for small, powerful open models.

mattmanser 4 days ago | parent | prev [-]

Yeah, the truth is avoiding the big players is silly right now. It's not that small models won't eventually work either; we have no idea how much they can be compressed in the future, especially with people trying to get the mixture-of-experts approach working.

Right now, you need the bigger models for good responses, but in a year's time?

So the whole exercise was a bit of a waste of his time; the target moves too quickly at the moment. This isn't the time to be clutching your pearls about running your own models unless you want to do something shady with AI.

And just as video streaming was pushed forward by the porn industry, a lot of people are watching the, um, "thirsty" AI enthusiasts for the big advances in small models.

Mars008 4 days ago | parent [-]

That's too simplified IMHO. Local models can do a lot: sorting texts, annotating images, text-to-speech, speech-to-text. They're much cheaper when they work. Software development is not on that list because the quality of the output determines how much time developers spend prompting and fixing; it's just faster and cheaper to use a big model.
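As a minimal sketch of the "sorting texts" use case above: a local model served by Ollama (the default tool for models like mistral:7b) exposes an HTTP endpoint at `localhost:11434/api/generate`. The label set, prompt wording, and choice of mistral:7b here are illustrative assumptions, not from the thread.

```python
# Sketch: classify short texts with a local model via the Ollama HTTP API.
# Assumes an Ollama server on localhost:11434 with mistral:7b already pulled.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # default Ollama endpoint

# Hypothetical label set for illustration.
LABELS = ["bug report", "feature request", "question"]

def build_prompt(text: str) -> str:
    """Constrain the model to answer with exactly one known label."""
    return (
        "Classify the following text into exactly one of these categories: "
        + ", ".join(LABELS)
        + ". Reply with the category name only.\n\nText: "
        + text
    )

def parse_label(reply: str) -> str:
    """Map a free-form model reply back onto a known label (or 'unknown')."""
    reply = reply.strip().lower()
    for label in LABELS:
        if label in reply:
            return label
    return "unknown"

def classify(text: str, model: str = "mistral:7b") -> str:
    """Send one non-streaming generate request and parse the reply."""
    payload = json.dumps({
        "model": model,
        "prompt": build_prompt(text),
        "stream": False,  # one JSON object instead of a token stream
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return parse_label(json.load(resp)["response"])

if __name__ == "__main__":
    print(classify("The app crashes when I click save."))
```

The point of `parse_label` is the "when it works" caveat: small models drift from the requested format, so the cheap fix is to constrain the prompt and tolerantly match the reply rather than trust it verbatim.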