Remix clone Hacker News


	Aurornis 5 hours ago
		Cool visualization, but most of the token generation in my sessions doesn't go to output code or even to the text I see. Reasoning tokens make up most of the output, and they can only be generated after the input files and context have been processed. For non-trivial work I go through hundreds of thousands of tokens (combined prefill + tg, of course) before even getting to any useful text output. I mostly use LLMs for exploration and studies, rarely for code generation, and prefill matters heavily for this. Even at prefill rates in the high hundreds or low thousands of tokens per second, I spend a lot of time waiting on the LLM (doing other things, not twiddling my thumbs).
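To put rough numbers on that waiting time, here is a back-of-the-envelope sketch. The function name `session_seconds` and every figure passed to it are hypothetical, chosen only to illustrate the arithmetic. It also assumes prefill and generation each run at a constant rate and that the context is processed once, which real sessions with repeated tool calls or cache misses may not match.

```python
def session_seconds(prefill_tokens, gen_tokens, prefill_rate, gen_rate):
    """Estimate wall-clock time for one session.

    Rates are in tokens per second. Returns a tuple
    (prefill_seconds, generation_seconds, total_seconds).
    """
    if prefill_rate <= 0 or gen_rate <= 0:
        raise ValueError("rates must be positive")
    prefill_s = prefill_tokens / prefill_rate
    gen_s = gen_tokens / gen_rate
    return prefill_s, gen_s, prefill_s + gen_s


if __name__ == "__main__":
    # Hypothetical session: 300k tokens of input context, 20k generated
    # tokens (mostly reasoning), 1,000 tok/s prefill, 40 tok/s generation.
    prefill_s, gen_s, total_s = session_seconds(300_000, 20_000, 1_000, 40)
    print(f"prefill:    {prefill_s / 60:.1f} min")
    print(f"generation: {gen_s / 60:.1f} min")
    print(f"total:      {total_s / 60:.1f} min")
```

With these made-up inputs, prefill alone takes several minutes before any reasoning token appears, so the prefill rate can make up a large share of the wait even when it sounds fast on paper.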