Remix.run Logo
selinkocalar 19 hours ago

The memory safety guarantees in Rust are probably useful here given how easy it is to have buffer overflows in transformer implementations. CUDA kernels are still going to dominate performance though. Curious about the tokenization approach - are you implementing BPE from scratch too or using an existing library?