btown 14 days ago
The GTC 2025 announcement session mentioned in this article has video here: https://www.nvidia.com/en-us/on-demand/session/gtc25-s72383/

It's a holistic approach to all levels of the stack, from high-level frameworks to low-level bindings; some of it highlights existing libraries, and some of it is newly announced.

One of the big things seems to be a brand new Tile IR, at the level of PTX and supported by a driver-level JIT compiler, designed for Python-first semantics via a new cuTile library.

https://x.com/JokerEph/status/1902758983116657112 (without login: https://xcancel.com/JokerEph/status/1902758983116657112 )

Example of proposed syntax: https://pbs.twimg.com/media/GmWqYiXa8AAdrl3?format=jpg&name=...

Really exciting stuff, though the new IR further widens the gap that projects like https://github.com/vosen/ZLUDA and AMD's own tooling are trying to bridge. But vendor lock-in isn't something we can complain about when it arises from the vendor continuing to push the boundaries of developer experience.
skavi 14 days ago
I'm curious what advantage is derived from this existing independently of the PTX stack. I.e., why doesn't cuTile produce PTX via a bundled compiler like Triton or (iirc) Warp? Even if there is some impedance mismatch, could PTX itself not have been updated?