erichocean 2 days ago
Could be. By "meaning" I mean (heh) that transformers are able to distinguish tokens (and prompts) in a consequential ("causal") way, and that they do so at various levels of detail ("abstractions"). I think that's the usual understanding of how transformer architectures work, at the level of math.
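To make that concrete, here's a minimal sketch (mine, not from the thread) of both ideas in PyTorch, with arbitrary toy dimensions: a stack of causally masked attention layers, where each layer re-mixes the previous layer's outputs (the "abstractions" part). Perturbing the last token changes only the last position's output, while earlier positions are untouched; that asymmetry is the "causal" part.

    import torch
    import torch.nn as nn

    torch.manual_seed(0)
    d_model, n_heads, n_layers, seq_len = 32, 4, 3, 5  # toy values

    layers = nn.ModuleList(
        nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        for _ in range(n_layers)
    )

    # Upper-triangular mask: True = "may not attend here" (future positions),
    # so token i only ever sees tokens 0..i.
    causal_mask = torch.triu(
        torch.ones(seq_len, seq_len, dtype=torch.bool), diagonal=1
    )

    def run(x):
        # Each layer re-mixes the previous layer's outputs: layer 0 sees raw
        # token embeddings, layer 1 sees mixtures of those, and so on.
        for attn in layers:
            out, _ = attn(x, x, x, attn_mask=causal_mask, need_weights=False)
            x = x + out  # residual connection, as in a standard transformer block
        return x

    x = torch.randn(1, seq_len, d_model)  # stand-in for token embeddings
    base = run(x)

    x_perturbed = x.clone()
    x_perturbed[0, -1] += 1.0  # change only the LAST token
    delta = (run(x_perturbed) - base).abs().sum(dim=-1)
    print(delta)  # nonzero only at the last position; earlier tokens unaffected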