sbinnee 9 hours ago
> What’s improved? Language consistency: fewer CN/EN mix-ups & no more random chars.

It's good that they made this improvement. But are there any advantages at this point to using DeepSeek over Qwen?
twotwotwo 4 hours ago
The fast Cerebras thing got me to try the Qwen3 models. I couldn't get them working all that well: they had trouble using the required output format and following instructions. On the other hand, benchmarks say they should be great, and it sounds like maybe some people use them OK via different tools. I'm curious if my experience was unusual (it very much could be!) and I'd be interested to hear from anyone who's used both.
IgorPartola 8 hours ago
I wish there were some easy resource to keep up with the latest models. The best I have come up with so far is asking one model to research the others. Realistically, I want to know the latest versions, best use cases, performance (in terms of speed) relative to some baseline, and the hardware requirements to run each one.
comrade1234 8 hours ago
MIT license that lets you run it on your own hardware and make money off of it.
coder543 7 hours ago
They seem fairly competitive with each other. You would have to benchmark them for your specific use case. |