Remix clone Hacker News

	▲	chickenzzzzu a day ago
		I do have an idea of how wrong I am. Nvidia's own people are the ones who have brought Vulkan performance so close to CUDA's. AMD is behind, but the data shows that its performance gap is proportional to the cost of the device. If AMD implemented cooperative matrix 2 (coopmat2), it would close that gap. 99.9% of people who use PyTorch and similar frameworks could get good-enough performance from a "simple Vulkan backend" for whatever Python code they're used to writing. That would strip out millions of lines of code. The reason nobody has done this, outside of a few GitHub projects that Nvidia itself has contributed to, is that there isn't much money in iterative performance gains when better algorithmic approaches are being invented nearly every month.