| ▲ | Imustaskforhelp 2 days ago | |
Sure, I would like to have it. I know that DeepSeek as a model is easier to run inference for, but I am not sure how much the pre-training has helped. It's my understanding that GLM 5.1 or, in my personal experience, Kimi K2 are some nice open-source models, so I am interested to hear your thoughts on it and why you picked DeepSeek for the fine-tuning instead. | ||
| ▲ | gr00ve 2 days ago | |
Code is “HN5OFF”. I picked DeepSeek 3.2 because I was impressed with how they developed R1, and I have continued to be satisfied with their capability improvements as well as their algorithmic ones. I think they cut their cost in half recently because of the efficiency gains, which was a big factor. | ||