Remix clone Hacker News

	chuckadams 2 hours ago
		> It would have been expensive, but all characters should have been fixed size 64bit values.
		It would have been a non-starter, and then we'd all be dealing with Shift-JIS, BIG5, and FSM knows how many different codepages to this day. UTF-8 is about as elegant as it gets, though Java and JS still managed to fuck that up too (they both encode every codepoint outside the BMP as surrogate pairs in UTF-8).
	chrismorgan an hour ago
		> Java and JS […] both encode every codepoint outside the BMP as surrogate pairs in UTF-8
		I can’t comment on Java, but JS I know reasonably well and I can’t think of any place it uses CESU-8.
	dasyatidprime an hour ago
		That's called CESU-8. https://www.unicode.org/reports/tr26/tr26-4.html
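
For anyone following along, here is a minimal Python sketch of the difference between standard UTF-8 and the surrogate-pair encoding that TR26 calls CESU-8. `cesu8_encode` is a hypothetical helper written only for illustration (Python ships no CESU-8 codec as far as I know); it assumes the TR26 scheme of splitting each code point above U+FFFF into a UTF-16 surrogate pair and encoding each half as its own three-byte sequence. It says nothing about where Java or JS actually do this.

```python
def cesu8_encode(text: str) -> bytes:
    """Illustrative CESU-8 encoder (hypothetical helper, not a stdlib codec)."""
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if cp >= 0x10000:
            # Split the supplementary code point into a UTF-16 surrogate pair.
            v = cp - 0x10000
            high = 0xD800 + (v >> 10)
            low = 0xDC00 + (v & 0x3FF)
            # Encode each surrogate as a 3-byte sequence; "surrogatepass"
            # lets the utf-8 codec emit bytes for surrogate code points.
            for surrogate in (high, low):
                out += chr(surrogate).encode("utf-8", "surrogatepass")
        else:
            # BMP code points are encoded exactly as in UTF-8.
            out += ch.encode("utf-8", "surrogatepass")
    return bytes(out)


emoji = "\U0001F600"  # a code point outside the BMP
print(emoji.encode("utf-8").hex())  # f09f9880      (4 bytes, standard UTF-8)
print(cesu8_encode(emoji).hex())    # eda0bdedb880  (6 bytes, CESU-8)
```

For text that stays inside the BMP the two encodings produce identical bytes, so the difference only shows up with supplementary characters such as emoji.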