Check out the GLM models; they are excellent.
MiniMax M2.1 rivals GLM 4.7 and fits in 128GB with 100k context at 3-bit quantization.
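
For anyone who wants to sanity-check the 128GB claim, here is a rough back-of-envelope sketch. The parameter count (about 230B total) is my assumption and is not stated in this thread, so check the model card. Real 3-bit quant formats also tend to use somewhat more than 3 bits per weight on average, and the KV cache for 100k context has to fit in whatever is left over.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of quantized weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Assumption: ~230e9 total parameters. Not from this thread; verify on the model card.
ASSUMED_PARAMS = 230e9
TOTAL_MEMORY_GB = 128  # figure quoted in the comment above

weights_gb = weight_memory_gb(ASSUMED_PARAMS, 3.0)
headroom_gb = TOTAL_MEMORY_GB - weights_gb  # left for KV cache, activations, OS

print(f"weights ~ {weights_gb:.2f} GB, headroom ~ {headroom_gb:.2f} GB")
```

Under those assumptions the weights take roughly two thirds of the 128GB, which is at least consistent with the claim; the exact context length you can fit depends on the KV cache layout and the actual quant used.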