nirw4nna 5 days ago
I'm working on DSC, a tensor library I wrote from scratch in C++ with a PyTorch-like API. Right now it runs on both CPU and GPU (AMD and NVIDIA) and can run LLMs like Qwen. I'm currently implementing a native profiler to trace CPU and GPU kernels, and after that I'll work on speed. The goal is to be competitive with PyTorch eager mode by the end of the year.

Source code: https://github.com/nirw4nna/dsc

My original HN post: https://news.ycombinator.com/item?id=44310678