warangal | 4 hours ago
I may be wrong here, but blog-post seems AI written, with repetition of sequences like "the inference pipeline was rebuilt using architecture-aware fused kernels, optimized scheduling, and dis-aggregated serving". I don't know what that means without some code and proper context. Also they claim 3-6x inference throughput compared to Qwen3-30B-A3B, without referring back to some code or paper, all i could see in the Hugging Face repo is usage of standard inference stack like vLLM . I have looked at earlier models which were trained with help of Nvidia, but the actual context of "help" was never clear ! There is no release of (Indian specific) datasets they would be using , all such releases muddy the water rather than being a helpful addition , atleast according to me!
simianwords | 2 hours ago | parent
Disagree, the post makes punctuation mistakes that only an Indian can make. So does your own comment.