latemedium 2 days ago

I think part of the reason so few people write custom CUDA / Triton kernels is that it's really hard to do well. Languages like Mojo aim to make that much easier, so hopefully more people will be able to write them (and do other interesting things with GPUs that are too technically challenging right now).

dogma1138 a day ago

The only question is whether there will be a benefit, especially in terms of performance, to writing your own kernels in something like Mojo rather than skipping that part altogether and using the primitives that frameworks like torch provide, which are already backed by highly optimized kernels.