It's just a long winded way of saying "tied embeddings"[1]. IIRC, GPT-2, BERT, Gemma 2, Gemma 3, some of the smaller Qwen models and many more architectures use weight tied input/output embeddings.
[1]: https://arxiv.org/abs/1608.05859