dragonwriter | 14 days ago

The mistake you seem to be making is confusing the existing product (which has been available for many years) with the upcoming new features for that product just announced at GTC, which are not addressed at all on the page for the existing product, but are addressed in the article about the GTC announcement.
almostgotcaught | 14 days ago

> The mistake you seem to be making is confusing the existing product

i'm not making any such mistake - i'm just able to actually read and comprehend what i'm reading rather than perform hype:

> Over the last year, NVIDIA made CUDA Core, which Jones said is a "Pythonic reimagining of the CUDA runtime to be naturally and natively Python."

so the article is about cuda-core, not whatever you think it's about - so i'm responding directly to what the article is about.

> CUDA Core has the execution flow of Python, which is fully in process and leans heavily into JIT compilation.

this is bullshit/hype about Python's new JIT which womp womp womp isn't all that great (yet). this has absolutely nothing to do with any other JIT, e.g. the cutile kernel driver JIT (which also has absolutely nothing to do with what you think it does).
dragonwriter | 14 days ago

> i'm just able to actually read and comprehend what i'm reading rather than perform hype:

The evidence of that is lacking.

> so the article is about cuda-core, not whatever you think it's about

cuda.core (a relatively new, rapidly developing library whose entire API is experimental) is one of several things (NVMath is another) mentioned in the article, but the newer, as-yet-unreleased piece mentioned in the article and the GTC announcement, and a key part of the "Native Python" in the headline, is the CuTile model [0]: "The new programming model, called CuTile interface, is being developed first for Pythonic CUDA with an extension for C++ CUDA coming later."

> this is bullshit/hype about Python's new JIT

No, as is fairly explicit in the next line after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than shelling out to out-of-process command-line compilers for CUDA code.

[0] The article has only a fairly vague qualitative description of what CuTile is, but (without having to watch the whole talk from GTC) one can look at this tweet for a preview of what Python code using the model is expected to look like when it is released: https://x.com/blelbach/status/1902113767066103949?t=uihk0M8V...
almostgotcaught | 14 days ago

> No, as is fairly explicit in the next line after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than shelling out to out-of-process command-line compilers for CUDA code.

my guy, what i am able to read, which you are not, is the source and release notes. i do not need to read tweets and press releases because i know what these things actually are. here are the release notes:

> Support Python 3.13

> Add bindings for nvJitLink (requires nvJitLink from CUDA 12.3 or above)

> Add optional dependencies on CUDA NVRTC and nvJitLink wheels

https://nvidia.github.io/cuda-python/latest/release/12.8.0-n...

do you understand what "bindings" and "optional dependencies on..." mean? it means there's nothing happening in this library - these are just bindings to existing libraries. specifically, that means you cannot jit python using this thing (except via the python 3.13 jit interpreter) and can only do what you've always been able to do with e.g. cupy (compile and run C/C++ CUDA code).

EDIT: y'all realize that 1. calling a compiler on your entire source file and 2. loading and running that compiled code is not at all a JIT? y'all understand that, right?
squeaky-clean | 14 days ago

> my guy, what i am able to read, which you are not, is the source and release notes. i do not need to read tweets and press releases because i know what these things actually are. here are the release notes

Those aren't the release notes for the native python thing being announced. CuTile has not been publicly released yet. Based on what the devs are saying on Twitter, it probably won't be released before the SciPy 2025 conference in July.
musicale | 13 days ago

JIT as an adjective means just-in-time, as opposed to AOT, ahead-of-time. What Nvidia discussed at GTC was a software stack that will let you generate new CUDA kernels dynamically at runtime using Python API calls. It is a just-in-time (runtime, dynamic) compiler system rather than an ahead-of-time (pre-runtime, static) one.
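The runtime-vs-static distinction musicale describes can be sketched GPU-free in plain Python: generate source specialized to values only known at runtime, compile it, then call it. This is only an illustration of the idea, not Nvidia's API; `jit_scale_kernel` and everything in it is made up for the sketch.

```python
def jit_scale_kernel(factor: float):
    """Generate and compile a 'kernel' specialized to `factor` at runtime.

    The point of the sketch: the source text does not exist until the
    program is already running (JIT), unlike AOT where all code is
    compiled before the program starts.
    """
    src = f"""
def kernel(xs):
    # `factor` is baked into the generated code, like a template parameter
    return [x * {factor} for x in xs]
"""
    namespace = {}
    exec(compile(src, "<jit>", "exec"), namespace)  # the runtime compile step
    return namespace["kernel"]

scale3 = jit_scale_kernel(3.0)  # compiled now, not before the program ran
print(scale3([1, 2, 3]))        # [3.0, 6.0, 9.0]
```

A real kernel JIT does the same dance with CUDA C++ or an IR instead of Python source, and specializes on things like dtypes and tile shapes rather than a literal constant.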
saagarjha | 14 days ago

cuTile is basically Nvidia's Triton competitor (no, not that Triton - OpenAI's Triton). It takes your Python code and generates kernels at runtime. CUTLASS has a new Python interface that does the same thing.
wahnfrieden | 14 days ago

[flagged]
|
squeaky-clean | 14 days ago

Isn't the main announcement of the article CuTile? Which has not been released yet. Also, the cuda-core JIT stuff has nothing to do with Python's new JIT; it's referring to integrating nvJitLink with python, which you can see an example of in cuda_core/examples/jit_lto_fractal.py
|
ashvardanian | 14 days ago

In case someone is looking for performance examples & testimonials: even on an RTX 3090 vs. a 64-core AMD Epyc/Threadripper, even a couple of years ago, CuPy was a blast. I have a couple of recorded sessions with roughly identical slides/numbers:

- San Francisco Python meetup in 2023: https://youtu.be/L9ELuU3GeNc?si=TOp8lARr7rP4cYaw
- Yerevan PyData meetup in 2022: https://youtu.be/OxAKSVuW2Yk?si=5s_G0hm7FvFHXx0u

Among the more remarkable results:

- 1000x sorting speedup switching from NumPy to CuPy.
- 50x performance improvement switching from Pandas to cuDF on the New York taxi-rides queries.
- 20x GEMM speedup switching from NumPy to CuPy.

cuGraph is also definitely worth checking out. At that time, Intel wasn't in as bad a position as they are now and was trying to push Modin, but the difference in performance and quality of implementation was mind-boggling.
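Part of why switches like the ones above are cheap is that CuPy deliberately mirrors the NumPy API, so the same array code can target either backend. A minimal sketch of that pattern (not one of the benchmarks above); it falls back to NumPy when CuPy and a GPU aren't available, so only the speed changes, not the code:

```python
import numpy

# Pick a backend: CuPy's array API is intentionally NumPy-compatible,
# so downstream code can be written once against `xp`.
try:
    import cupy as xp   # GPU-backed arrays, if installed
except ImportError:
    xp = numpy          # CPU fallback with the same API

a = xp.asarray([5.0, 1.0, 4.0, 2.0, 3.0])
print(xp.sort(a).tolist())  # [1.0, 2.0, 3.0, 4.0, 5.0]
```

On the CuPy path, `xp.sort` dispatches to a GPU sort (this is where the large speedups on big arrays come from); `.tolist()` copies the result back to the host on either backend.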
ladberg | 14 days ago

The main release highlighted by the article is cuTile, which is certainly about jitting kernels from Python code.
almostgotcaught | 14 days ago

> main release

there is no release of cutile (yet). so the only substantive thing the article can be describing is cuda-core - which it does describe, and which is a recent/new addition to the ecosystem. man, i can't fathom glazing a random blog this hard just because it's tangentially related to some other thing (NV GPUs) that people clearly only vaguely understand.
throwaway314155 | 12 days ago

christ man, lighten the fuck up. there's zero need to be _so_ god damn patronizing and disrespectful.
|
yieldcrv | 14 days ago

I just want to see benchmarks. is this new one faster than CuPy or not?