Remix clone Hacker News


	▲	dist-epoch 3 days ago
		It's not strictly a counting task. The LLM sees uniformly sized tokens, but each token corresponds to a variable number of characters, and that number is not fed directly into the model. It's like the difference between Unicode code points and UTF-8 bytes: you can't just count UTF-8 bytes to know how many code points you have.
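		A minimal sketch of the code-point/byte analogy, in Python. The sample string and the helper name `count_code_points` are made up for illustration; the point is only that the byte count and the code-point count diverge, and that recovering one from the other requires looking at how each byte is structured rather than just counting units.

```python
# Illustrative only: the sample string is an assumption, not taken from the thread.
# Escapes are used so the code points are unambiguous (precomposed ï and é).
text = "na\u00efve caf\u00e9 \u65e5\u672c\u8a9e"

code_points = len(text)                 # Python str length counts code points
utf8_bytes = len(text.encode("utf-8"))  # length of the encoded form counts bytes


def count_code_points(data: bytes) -> int:
    """Count code points in valid UTF-8 by skipping continuation bytes (10xxxxxx)."""
    return sum(1 for b in data if (b & 0xC0) != 0x80)


print(code_points, utf8_bytes, count_code_points(text.encode("utf-8")))
```

		The two counts differ because ASCII characters take one byte each while the accented and CJK characters take two or three, which is the same kind of mismatch as tokens versus characters.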
	▲	omnicognate 3 days ago
		There's an aspect of figuring out what to count, but that doesn't make this task visual/spatial in any sense I can make out.