highfrequency a day ago

> posed a fascinating question: How can a relatively modest embedding space of 12,288 dimensions (GPT-3) accommodate millions of distinct real-world concepts?

Because there is a large number of combinations of those 12k dimensions? You don’t need a whole dimension for “evil scientist” if you can have a high loading on “evil” and “scientist.” There is quickly a combinatorial explosion of expressible concepts.
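A minimal sketch of the idea (assuming, purely for illustration, that attribute directions behave like random high-dimensional vectors — real learned embeddings aren't random, but the near-orthogonality intuition is the same):

```python
import numpy as np

rng = np.random.default_rng(0)
d = 12_288  # GPT-3 embedding width

# Hypothetical attribute directions; in a real model these are learned.
evil      = rng.standard_normal(d)
scientist = rng.standard_normal(d)
banana    = rng.standard_normal(d)  # an unrelated concept for contrast

def unit(v):
    return v / np.linalg.norm(v)

def cos(a, b):
    return float(unit(a) @ unit(b))

# Compose "evil scientist" as a sum of the two attribute directions.
evil_scientist = unit(evil + scientist)

print(cos(evil_scientist, evil))       # ~0.71: high loading on "evil"
print(cos(evil_scientist, scientist))  # ~0.71: high loading on "scientist"
print(cos(evil_scientist, banana))     # ~0.0: essentially unrelated
```

In 12k dimensions, random directions are nearly orthogonal, so a composite vector stays close to each of its components while staying far from everything else — and the number of such combinations blows up combinatorially.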

I may be missing something, but it doesn’t seem like we need any fancy math to resolve this puzzle.