steve-atx-7600 2 days ago

Inference from an LLM is O(tokens^2)

halJordan 2 days ago

Only in the naive implementations of attention
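To make the scaling concrete, here is a rough sketch that just counts query-key score computations. It assumes causal attention, and it uses sliding-window attention as one example of a non-naive variant; the comment above doesn't say which implementations it has in mind, so that choice and the function names `full_causal_pairs` and `windowed_causal_pairs` are illustrative assumptions, not anyone's actual implementation.

```python
def full_causal_pairs(n_tokens: int) -> int:
    """Score computations when each token attends to itself and every earlier token."""
    # Token t (0-indexed) attends to t + 1 positions, so the total is n(n+1)/2.
    return sum(t + 1 for t in range(n_tokens))


def windowed_causal_pairs(n_tokens: int, window: int) -> int:
    """Score computations when each token attends to at most `window` recent positions.

    Sliding-window attention is an assumed example of a sub-quadratic variant.
    """
    # Early tokens see fewer than `window` positions; later ones are capped at `window`.
    return sum(min(t + 1, window) for t in range(n_tokens))


if __name__ == "__main__":
    for n in (1_000, 2_000, 4_000):
        print(n, full_causal_pairs(n), windowed_causal_pairs(n, window=128))
```

Doubling the token count roughly quadruples the full-attention count but only roughly doubles the windowed one. This counts attention scores only and ignores everything else in the model.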