	xodn348 5 hours ago
		Really interesting approach — attacking token efficiency at the encoding level is more fundamental than what I did. Even without retraining BPE from scratch, starting with YUTF-8 and measuring how existing tokenizers handle it would already be a worthwhile experiment. Hope you find the time to build it, good luck!