soVeryTired 4 hours ago

No way is vocab size Zipfian. Word counts within a corpus follow Zipf's law, but vocab sizes themselves don't.

If vocab sizes were Zipfian, the most common vocab size would be one.
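To make the distinction concrete, here is a rough sketch in Python. The corpus string and the `rank_frequency` helper are made up for illustration. The thing Zipf's law describes is the per-word count at each rank, while vocab size is just one number for the whole corpus.

```python
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    counts = Counter(text.lower().split())
    # most_common() sorts by count, descending; ties keep first-seen order.
    ranked = counts.most_common()
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, start=1)]


# Toy corpus, invented for this example. It is far too small to show a real
# Zipfian curve; it only shows which quantity the law is about.
corpus = "the cat and the dog and the bird saw the fish"

table = rank_frequency(corpus)
vocab_size = len(table)  # a single number per corpus, not a distribution

for rank, word, count in table:
    print(rank, word, count)
print("vocab size:", vocab_size)
```

On a real corpus, the counts in the third column are what fall off roughly like 1/rank. The vocab size on the last line is a single summary number, so there is nothing for Zipf's law to apply to unless you are comparing many corpora or tokenizers.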