catgary 3 days ago
I think this is an interesting direction, but step 2 should be to formulate some conjectures about the geometry of other LLMs, or testable hypotheses about how information flows with respect to character counting. Even checking some intermediate training checkpoints of Haiku would be interesting, since they'd still be working with the same architecture. The biology metaphor they make is apt, because I think a biologist would be the first to tell you that you need more than one datapoint.