jauntywundrkind 15 hours ago

It will be interesting to see whether Nvidia and others have any interest & energy in getting this used more widely, and whether an ecosystem actually forms around it.

Google leading XLA & IREE, with awesome intermediate representations used by lots of hardware platforms, backing really excellent JAX & PyTorch implementations, and having tools for layout & optimization that folks can share: they have really built an amazing community.

There's still so much room for planning/scheduling, so much hardware we have yet to target. RISC-V has really interesting vector instructions, for example, and it seems like there's so much exploration / work to do to better leverage that.

Nvidia has partners everywhere now. NVLink is used by Intel, AWS Trainium, others. And yesterday, the Groq exclusive license that Nvidia paid for?! Seeing how and when CUDA Tiles emerges will be interesting. Moving from fabric partnerships up, up, up the stack.

pjmlp 14 hours ago | parent | next [-]

For NVidia it suffices that this is a Python JIT that lets you write CUDA compute kernels directly in Python instead of C++, yet another way in which Intel and AMD, alongside the Khronos APIs, lag behind in developer experience for GPU compute programming.

Ah, and Nsight debugging also supports Python CUDA Tiles debugging.

https://developer.nvidia.com/blog/simplify-gpu-programming-w...

saagarjha 8 hours ago | parent | next [-]

Nsight does not have a debugger.

dahart 3 hours ago | parent | next [-]

What do you mean? Are you unaware of Nsight VSE? https://developer.nvidia.com/nsight-visual-studio-edition

saagarjha an hour ago | parent [-]

I was aware of their Visual Studio plugins but I did not know that they called their debugger support for Visual Studio “Nsight” as well.

pjmlp 3 hours ago | parent | prev [-]

Yes, it does; apparently you have never used it.

Q6T46nT668w6i3m 13 hours ago | parent | prev [-]

Slang is a fantastic developer experience.

Conscat 9 hours ago | parent | next [-]

I work at Nvidia, and my team is using Slang for all of our (numerous and non-trivial) kernels because its automatic differentiation type system is so nice.

pjmlp 12 hours ago | parent | prev [-]

Especially when using the tooling from NVIDIA, who created it before offering it to Khronos as a GLSL replacement.

Moosdijk 14 hours ago | parent | prev | next [-]

> There's still so much room for planning/scheduling, so much hardware we have yet to target

This is nicely illustrated by this recent article:

https://news.ycombinator.com/item?id=46366998

saagarjha 8 hours ago | parent [-]

Wrong type of scheduling.

nl 6 hours ago | parent | prev | next [-]

> Groq exclusive license

non-exclusive license actually.

turtletontine 14 hours ago | parent | prev | next [-]

On the RISC-V vector instructions, could you elaborate? Are the vector extensions substantially different from those in ARM or x86?

adgjlsfhk1 14 hours ago | parent [-]

It's fairly similar to Arm's SVE2, but very different from the x86 side in that the vectors are variable length rather than fixed.
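
To make the variable-length point concrete, here is a rough sketch in plain Python. It is not real RVV or SVE2 code, and `vla_add` and `vlen` are made-up names for illustration. The idea is that the loop asks how many lanes it may process on each pass (loosely what `vsetvli` does on RISC-V), so the same code works whatever the hardware vector length is, with no separate scalar tail loop.

```python
def vla_add(a, b, vlen):
    """Element-wise add, processed in chunks of at most `vlen` lanes.

    `vlen` stands in for the hardware vector length, which the code
    is not supposed to know in advance (hypothetical parameter).
    """
    if vlen <= 0:
        raise ValueError("vlen must be positive")
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")

    n = len(a)
    out = [0] * n
    i = 0
    while i < n:
        # Ask for as many lanes as remain, capped by the hardware length.
        vl = min(vlen, n - i)
        # One "vector instruction" operating on vl lanes.
        out[i:i + vl] = [x + y for x, y in zip(a[i:i + vl], b[i:i + vl])]
        i += vl
    return out


if __name__ == "__main__":
    data = list(range(10))
    ones = [1] * 10
    # Same source, different "hardware" vector lengths, same result.
    print(vla_add(data, ones, 4))
    print(vla_add(data, ones, 16))
```

With fixed-width SIMD such as SSE or AVX, the chunk size is baked into the instructions, so you typically write a main loop for full-width chunks plus a remainder loop.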

almostgotcaught 7 hours ago | parent | prev [-]

> Google leading XLA & IREE

IREE hasn't been at G for >2 years.