digitaltrees 7 hours ago

I wonder if it really needs to be worse. I am playing with the idea of fine-tuning a model on my exact stack and coding patterns. I suspect I could get better performance by training "taste" into a model rather than breadth.

andy_ppp 3 hours ago | parent | next [-]

Fine-tuning these models (at least with PPO or an equivalent RL method) requires even more VRAM than inference does, potentially 2-3 times more.

epicureanideal 3 hours ago | parent | prev [-]

I also wonder about JS-only, Python-only, etc. models.

Maybe the future is a selection of local, specific stack trained models?

andy_ppp 3 hours ago | parent [-]

These models' ability to generalise at coding will likely get worse if you remove high-quality training data like all of the Python corpus.