The context window is 16 characters. Talking about tokens per second is meaningless.

It's not meaningless. There could be use cases like spell correction.

	genxy an hour ago
		It is only interesting as an academic exercise in EDA design, just like microGPT. Advertising perf for something with n^2 complexity is clickbait.