| ▲ | fibonacci112358 7 hours ago |
Sadly for them, Nvidia didn't stand still in the meantime and created the next generation of CUDA: CuTile for Python, and soon for C++, via CUDA Tile IR (using a similar MLIR-based compiler stack). Even though it's not portable, it will likely see far greater usage than Mojo just by being heavily promoted by Nvidia, integrated into dev tools, and working alongside existing CUDA code. Tile IR was more likely a response to the threat of Triton than to Mojo, at least from the pov of how easy it is to write a decently performing LLM kernel.
| ▲ | pjmlp 4 hours ago | parent | next [-] |
And so as not to fall behind, Intel and AMD are making similar efforts, and then we have the whole CPython JIT finally happening after so many attempts, not to mention efforts like GraalPy and PyPy. All of these work today on Windows, which is quite relevant in companies where that is the device assigned to most employees, even if the servers run Linux distros. I keep wondering if this isn't going to be another Swift for TensorFlow kind of outcome.
| ▲ | melodyogonna 6 hours ago | parent | prev | next [-] |
People keep mistaking Mojo for merely nicer syntax for writing GPU code, and so imagine that Nvidia's Python frameworks already do the same thing. But... would CuTile work on AMD GPUs or Apple Silicon? Whatever Nvidia does will still have vendor lock-in.
| ▲ | brcmthrowaway 7 hours ago | parent | prev [-] |
Interesting. How big an impact is CuTile having?