Quant choice depends on your VRAM, use case, how much speed you need, etc. For coding I would not go below Q4_K_M (though at Q4, unsloth XL or ik_llama IQ quants are usually better at the same size). Q5 or even Q6 is preferable if you can fit it.
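To make the VRAM side of that concrete, here is a rough back-of-the-envelope sketch, not an exact calculator. The bits-per-weight figures are approximate values for llama.cpp K-quants and vary by model and version, and the `overhead_gib` reserve for KV cache and buffers is a made-up placeholder you should adjust for your context length. The function names are hypothetical.

```python
# Approximate bits per weight (assumption: ballpark figures, they vary by model).
APPROX_BPW = {
    "Q4_K_M": 4.85,
    "Q5_K_M": 5.7,
    "Q6_K": 6.6,
    "Q8_0": 8.5,
}


def weights_gib(params_billion: float, bpw: float) -> float:
    """Estimate the size of the quantized weights in GiB."""
    return params_billion * 1e9 * bpw / 8 / 2**30


def best_quant_that_fits(params_billion: float, vram_gib: float,
                         overhead_gib: float = 2.0):
    """Return the highest-bpw quant whose weights fit in VRAM, or None.

    overhead_gib is a hypothetical reserve for KV cache and compute buffers.
    """
    budget = vram_gib - overhead_gib
    fitting = [
        (bpw, name)
        for name, bpw in APPROX_BPW.items()
        if weights_gib(params_billion, bpw) <= budget
    ]
    # max() compares on bpw first, so it picks the least aggressive quant.
    return max(fitting)[1] if fitting else None


if __name__ == "__main__":
    for size in (7, 32, 70):
        print(f"{size}B on 24 GiB:", best_quant_that_fits(size, 24))
```

This only tells you what fits fully on the GPU. It says nothing about quality, so the Q4_K_M floor for coding still applies, and partial offload to system RAM changes the picture entirely.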