ants_everywhere 5 days ago

It depends on how you define the math involved.

Locally it's all just linear algebra with an occasional nonlinear function. That is all straightforward. And by straightforward I mean you'd cover it in an undergrad engineering class -- you don't need to be a math major or anything.
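To make "linear algebra with an occasional nonlinear function" concrete, here is a rough sketch of a single layer in plain Python. The names (`matvec`, `relu`, `layer`) and the numbers are made up for illustration, and ReLU is just one common choice of nonlinearity; real models use larger matrices and optimized libraries, but the local operation has the same shape.

```python
def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def relu(v):
    """Elementwise nonlinearity: negative entries become 0."""
    return [max(0.0, a) for a in v]


def layer(W, b, x):
    """One layer: linear map W x + b followed by the nonlinearity."""
    z = [zi + bi for zi, bi in zip(matvec(W, x), b)]
    return relu(z)


# Illustrative values only (hypothetical weights, bias, and input).
W = [[1.0, -2.0], [0.5, 0.5]]
b = [0.0, -3.0]
x = [3.0, 1.0]

y = layer(W, b, x)
print(y)
```

Stacking many of these, plus a few other simple pieces, is roughly what "locally straightforward" means here.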

Similarly, CPUs are composed of simple logic operations that are each easy to understand. I'm willing to believe that designing a CPU requires more math than understanding the operations, and likewise that designing an LLM could require more math. In practice, though, I haven't seen any difficult math in LLM research papers yet. It's mostly trial and error plus the linear algebra above.

apwell23 5 days ago | parent [-]

Yeah, I would love to see what complicated math all this came out of. I thought rigorous math was actually an impediment to AI progress. Did any math actually predict or prove that scaling data would create current AI?

ants_everywhere 5 days ago | parent [-]

I was thinking more about the everyday use of more advanced math to solve "boring" engineering challenges. Like finite math to lay out chips or kernels. Or improvements to Strassen's algorithm for matrix multiplication. Or improving the transformer KV cache, etc.
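For anyone who hasn't seen it, here is a minimal sketch of the base case of Strassen's algorithm: multiplying two 2x2 matrices with 7 scalar multiplications instead of the usual 8. The function name `strassen_2x2` is just for illustration, and this shows only the base step, not the recursive version for large matrices or any of the later improvements.

```python
def strassen_2x2(A, B):
    """Multiply two 2x2 matrices using Strassen's 7 products."""
    (a11, a12), (a21, a22) = A
    (b11, b12), (b21, b22) = B

    # The seven products (the naive method needs eight).
    m1 = (a11 + a22) * (b11 + b22)
    m2 = (a21 + a22) * b11
    m3 = a11 * (b12 - b22)
    m4 = a22 * (b21 - b11)
    m5 = (a11 + a12) * b22
    m6 = (a21 - a11) * (b11 + b12)
    m7 = (a12 - a22) * (b21 + b22)

    # Recombine the products into the entries of C = A B.
    return [
        [m1 + m4 - m5 + m7, m3 + m5],
        [m2 + m4, m1 - m2 + m3 + m6],
    ]


A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(strassen_2x2(A, B))
```

Using the algorithm is easy to follow line by line; finding these seven products, or proving how far the count can be pushed down, is the harder math.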

The math you would use to, for example, prove that a search algorithm is optimal will generally be harder than the math needed to understand the search algorithm itself.