almostgotcaught 14 days ago
it's funny - people around here really do not have a clue about the GPU ecosystem even though everyone is always talking about AI:
> The article is about the next wave of Python-oriented JIT toolchains
the article is content marketing (for whatever) but the actual product literally has nothing to do with kernels or jitting or anything:
https://github.com/NVIDIA/cuda-python
literally just cython bindings to CUDA runtime and CUB.
for once CUDA is aping ROCm:
dragonwriter 14 days ago
The mistake you seem to be making is confusing the existing product (which has been available for many years) with the upcoming new features for that product just announced at GTC, which are not addressed at all on the page for the existing product, but are addressed in the article about the GTC announcement.
ashvardanian 14 days ago
In case someone is looking for performance examples and testimonials: even a couple of years ago, on an RTX 3090 vs a 64-core AMD Epyc/Threadripper, CuPy was a blast. I have a couple of recorded sessions with roughly identical slides/numbers:
Of the more remarkable results:
CuGraph is also definitely worth checking out. At that time, Intel wasn't in as bad of a position as they are now and was trying to push Modin, but the difference in performance and quality of implementation was mind-boggling.
ladberg 14 days ago
The main release highlighted by the article is cuTile, which is certainly about jitting kernels from Python code.
yieldcrv 14 days ago
I just want to see benchmarks. Is this new one faster than CuPy or not?