warangal | 4 hours ago
I may be wrong here, but blog-post seems AI written, with repetition of sequences like "the inference pipeline was rebuilt using architecture-aware fused kernels, optimized scheduling, and dis-aggregated serving". I don't know what that means without some code and proper context. Also they claim 3-6x inference throughput compared to Qwen3-30B-A3B, without referring back to some code or paper, all i could see in the Hugging Face repo is usage of standard inference stack like vLLM . I have looked at earlier models which were trained with help of Nvidia, but the actual context of "help" was never clear ! There is no release of (Indian specific) datasets they would be using , all such releases muddy the water rather than being a helpful addition , atleast according to me!
simianwords | 2 hours ago | parent
Disagree, the post makes punctuation mistakes that only an Indian can make. So does your own comment.