Legend2440 3 days ago

They study it because it already has a known solution.

The point is to see how LLMs implement algorithms internally, starting with this simple, easily understood algorithm.

catgary 3 days ago

This is an interesting direction, but I think step 2 would be to formulate some conjectures about the geometry of other LLMs, or testable hypotheses about how information flows wrt character counting. Even checking some intermediate training weights of Haiku would be interesting, so they’d still be working off of the same architecture.

The biology metaphor they make is interesting, because I think a biologist would be the first to tell you that you need more than one datapoint.

Rygian 3 days ago

That makes sense; however, it does not seem like they check the LLM outputs against the known solution. Maybe I missed that in the article.