Remix clone Hacker News

	▲	danielvaughn 21 hours ago
		I'm not an expert in LLMs, but I don't think character length matters. Text is deterministically tokenized into byte sequences before being fed as context to the LLM, so in theory `mySuperLongVariableName` uses the same number of tokens as `a`. Happy to be corrected here.
	▲	fragmede 9 hours ago
		Running it through https://platform.openai.com/tokenizer, "mySuperLongVariableName" takes 5 tokens and "a" takes 1. "mediumvarname" is 3, though. "though" is 1.
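
To illustrate why the counts differ, here is a rough sketch of subword tokenization in Python. It uses a greedy longest-match split over a tiny vocabulary, which is a simplification: the real OpenAI tokenizers use byte-pair encoding with a far larger learned vocabulary. The `VOCAB` set and the `tokenize` helper are hypothetical, and the vocabulary was picked by hand so the counts match the ones quoted above; the actual splits the real tokenizer produces may well be different.

```python
# Hypothetical toy vocabulary, NOT the real OpenAI vocabulary.
# Chosen by hand so the token counts match those quoted in the thread.
VOCAB = {"my", "Super", "Long", "Variable", "Name",
         "medium", "var", "name", "though", "a"}


def tokenize(text, vocab=VOCAB):
    """Greedy longest-match split into subword tokens.

    Characters that match no vocabulary entry become one-character tokens.
    """
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first and shrink until something matches.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                break
        else:
            j = i + 1  # no match: fall back to a single character
        tokens.append(text[i:j])
        i = j
    return tokens


for word in ["mySuperLongVariableName", "a", "mediumvarname", "though"]:
    pieces = tokenize(word)
    print(f"{word!r}: {len(pieces)} token(s) {pieces}")
```

The point is only that a token is a chunk the vocabulary already knows, so a common word like "though" can be a single token while a long camelCase identifier gets split into several pieces.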