The context window is 16 characters. Talking about tokens per second is meaningless.
It's not meaningless. There could be use cases like spell correction.
It is only interesting as an academic exercise in EDA design, just like microGPT. Advertising perf for something with n^2 complexity is clickbait.