almostgotcaught 14 days ago

> The mistake you seem to be making is confusing the existing product

i'm not making any such mistake - i'm just able to actually read and comprehend what i'm reading rather than perform hype:

> Over the last year, NVIDIA made CUDA Core, which Jones said is a “Pythonic reimagining of the CUDA runtime to be naturally and natively Python.”

so the article is about cuda-core, not whatever you think it's about - so i'm responding directly to what the article is about.

> CUDA Core has the execution flow of Python, which is fully in process and leans heavily into JIT compilation.

this is bullshit/hype about Python's new JIT, which (womp womp womp) isn't all that great (yet). it has absolutely nothing to do with any other JIT, e.g. the cuTile kernel driver JIT (which also has absolutely nothing to do with what you think it does).

dragonwriter 14 days ago | parent | next [-]

> i'm just able to actually read and comprehend what i'm reading rather than perform hype:

The evidence of that is lacking.

> so the article is about cuda-core, not whatever you think it's about

cuda.core (a relatively new, rapidly developing library whose entire API is experimental) is one of several things mentioned in the article (NVMath is another), but the newer, as-yet-unreleased piece mentioned in both the article and the GTC announcement, and a key part of the “Native Python” in the headline, is the CuTile model [0]:

“The new programming model, called CuTile interface, is being developed first for Pythonic CUDA with an extension for C++ CUDA coming later.”

> this is bullshit/hype about Python's new JIT

No, as is fairly explicit in the line right after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than relying on shelling out to out-of-process command-line compilers for CUDA code.
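To make that distinction concrete, here is a pure-Python analogy (explicitly not the actual CUDA toolchain, just an illustration of the two execution flows): "shelling out" spawns a separate compiler process and talks to it through files or pipes, while "in-process" compilation calls compiler machinery from inside the running program.

```python
# Pure-Python analogy (NOT the CUDA toolchain): out-of-process vs
# in-process compilation of source text that only exists at runtime.
import subprocess
import sys

src = "result = 6 * 7"

# Out-of-process: hand the source to a separate interpreter process
# and read the answer back over stdout.
proc = subprocess.run(
    [sys.executable, "-c", src + "\nprint(result)"],
    capture_output=True, text=True, check=True,
)
out_of_process = int(proc.stdout)

# In-process: compile and execute the same source without ever
# leaving the current process.
namespace = {}
exec(compile(src, "<generated>", "exec"), namespace)
in_process = namespace["result"]

assert out_of_process == in_process == 42
```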

[0] The article only has fairly vague qualitative description of what CuTile is, but (without having to watch the whole talk from GTC), one could look at this tweet for a preview of what the Python code using the model is expected to look like when it is released: https://x.com/blelbach/status/1902113767066103949?t=uihk0M8V...

almostgotcaught 14 days ago | parent [-]

> No, as is is fairly explicit in the next line after the one you quote, it is about the Nvidia CUDA Python toolchain using in-process compilation rather than relying on shelling out to out-of-process command-line compilers for CUDA code.

my guy, what i am able to read, which you are not, is the source and release notes. i do not need to read tweets and press releases because i know what these things actually are. here are the release notes:

> Support Python 3.13

> Add bindings for nvJitLink (requires nvJitLink from CUDA 12.3 or above)

> Add optional dependencies on CUDA NVRTC and nvJitLink wheels

https://nvidia.github.io/cuda-python/latest/release/12.8.0-n...

do you understand what "bindings" and "optional dependencies on..." mean? it means there's nothing happening in this library and these are... just bindings to existing libraries. specifically, that means you cannot JIT python using this thing (except via the python 3.13 JIT interpreter) and can only do what you've always been able to do with e.g. cupy (compile and run C/C++ CUDA code).

EDIT: y'all realize that

1. calling a compiler for your entire source file

2. loading and running that compiled code

is not at all a JIT? y'all understand that, right?

squeaky-clean 14 days ago | parent | next [-]

> my guy what i am able to read, which you are not, is the source and release notes. i do not need to read tweets and press releases because i know what these things actually are. here are the release notes

Those aren't the release notes for the native python thing being announced. CuTile has not been publicly released yet. Based on what the devs are saying on Twitter it probably won't be released before the SciPy 2025 conference in July.

musicale 13 days ago | parent | prev | next [-]

JIT as an adjective means just-in-time, as opposed to AOT, ahead-of-time. What Nvidia discussed at GTC was a software stack that will enable you to generate new CUDA kernels dynamically at runtime using Python API calls. It is a just-in-time (runtime, dynamic) compiler system rather than an ahead-of-time (pre-runtime, static) compiler.
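A minimal language-agnostic sketch of that distinction, in plain Python (nothing here is CUDA-specific): an AOT compiler translates source that is fixed before the program runs, whereas a JIT generates and compiles code specialized to values only known at runtime.

```python
# Runtime code generation in plain Python, as an analogy for JIT
# compilation: the function's source does not exist until runtime,
# when it is specialized to a value supplied by the caller.
def make_scaler(factor):
    # Build source text at runtime, baking in `factor` as a constant...
    src = f"def scale(x):\n    return x * {factor}\n"
    namespace = {}
    # ...then compile and load it just in time, inside this process.
    exec(compile(src, "<jit>", "exec"), namespace)
    return namespace["scale"]

triple = make_scaler(3)
assert triple(14) == 42
```

An AOT flow, by contrast, would have required `scale` to be written and compiled before the program ever started, with no way to fold a runtime `factor` into the generated code.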

saagarjha 14 days ago | parent | prev | next [-]

cuTile is basically Nvidia’s Triton (no, not that Triton, OpenAI’s Triton) competitor. It takes your Python code and generates kernels at runtime. CUTLASS has a new Python interface that does the same thing.

wahnfrieden 14 days ago | parent | prev [-]

[flagged]

squeaky-clean 14 days ago | parent | prev [-]

Isn't the main announcement of the article CuTile? Which has not been released yet.

Also, the cuda-core JIT stuff has nothing to do with Python's new JIT; it's referring to integrating nvJitLink with Python, which you can see an example of in cuda_core/examples/jit_lto_fractal.py