quc1k 4 hours ago

I really appreciate the focus on interpretability. Usually, super-optimizers give you a blob of assembly that runs fast but is impossible to debug or maintain. By forcing the model to output a natural language 'Plan' first, you essentially get documentation for free. If the code breaks later, you can look at the plan to understand why the loop was unrolled or why the memory was laid out that way. That makes this actually usable in a production CI/CD pipeline, unlike most black-box ML optimizations.

kap901 3 hours ago

manually writing tiling logic for systolic arrays is the absolute worst. if this actually works it saves me so much headache.
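
For anyone who hasn't had to write it by hand, here's a rough sketch of what tiling logic looks like for a plain matrix multiply. This is only an illustration in pure Python, not the project's output and not real systolic-array code; `tiled_matmul` and the `tile` parameter are made-up names, and the tile size stands in for whatever the hardware array dimension would be.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply a (n x k) by b (k x m), one tile-sized block at a time.

    Hypothetical helper for illustration; `tile` is an assumed block size.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    # Outer loops step over tiles: each (i0, j0, p0) triple is one block
    # of work small enough to fit a fixed-size compute array.
    for i0 in range(0, n, tile):
        for j0 in range(0, m, tile):
            for p0 in range(0, k, tile):
                # Inner loops stay inside the tile; min() handles edges
                # where the matrix size is not a multiple of the tile.
                for i in range(i0, min(i0 + tile, n)):
                    for j in range(j0, min(j0 + tile, m)):
                        acc = c[i][j]
                        for p in range(p0, min(p0 + tile, k)):
                            acc += a[i][p] * b[p][j]
                        c[i][j] = acc
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
print(tiled_matmul(a, b))
```

Even in this toy version, most of the code is index bookkeeping and edge handling rather than arithmetic, which is the part that gets painful once real memory layouts and array dimensions are involved.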