fooofw 17 hours ago

The tokenizer can represent uncommon words with multiple tokens. Feeding your example into https://platform.openai.com/tokenizer (GPT-4o) gives me (tokens separated by "|"):

    lower|case|un|se|parated|name
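
You can reproduce this locally; a minimal sketch, assuming the tiktoken package is installed and that GPT-4o maps to the o200k_base encoding:

    # Sketch: show how GPT-4o's tokenizer splits a string into token pieces
    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-4o")  # resolves to the o200k_base encoding
    tokens = enc.encode("lowercaseunseparatedname")
    # Decode each token id back to its byte sequence, then join the pieces with "|"
    pieces = [enc.decode_single_token_bytes(t).decode("utf-8") for t in tokens]
    print("|".join(pieces))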