Seb-C a day ago

> I'm curious if anyone can think of any other non-alphabetic characters used in legal names around the world, in other scripts?

Latin characters are NOT allowed in official names for Japanese citizens. The name must be written in Japanese characters only.

For foreigners living in Japan, it's quite common for their official name in Latin characters to fail the validation rules of many online forms, either because of forbidden characters or because it's too long, since Japanese names (family name + first name) are typically only 4 characters long.

Also, when you get a visa for Japan, you have to bend and distort the pronunciation of your name to make it fit into the (limited) Japanese syllabary.

Funnily enough, they even had to register a whole new Unicode range at some point, because old administrative documents sometimes contain characters that were deprecated more than a century ago.

https://ccjktype.fonts.adobe.com/2016/11/hentaigana.html

crazygringo a day ago

Very interesting about Japan!

To be clear, I wasn't thinking about any specific country, though.

More like, what is the set of all characters that are allowed in legal names across the world?

You know, to eliminate things like emoji, mathematical symbols, and so forth.

Seb-C a day ago

Ah, I see.

I don't know, but I would bet that the sum of all corner cases and exceptions in the world would make it pretty hard to confidently eliminate any "obvious" characters.

From a technical standpoint, Unicode emoji are probably safe to exclude, but on the other hand, some scripts like Chinese characters are fundamentally pictograms, which is semantically not so different from an emoji.

Maybe after centuries of evolution we will end up with a legit universal language based on emoji, and people with names written in it.

crazygringo a day ago

Chinese characters are nothing like emoji. They are more akin to syllables. There is no semantic similarity to emoji at all, even if they were originally derived from pictorial representations.

And they belong to the {Alphabetic} Unicode class.

I'm mostly curious if Unicode character classes have already done all the hard work.
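
As a rough way to explore that question, here is a minimal Python sketch that filters name characters by Unicode general category. It is an approximation under stated assumptions, not a vetted rule: Python's standard `unicodedata` module exposes the general category (such as `Lo` or `So`) but not the derived Alphabetic property itself, so letters (`L*`) plus combining marks (`M*`) stand in for it here. The `EXTRA_ALLOWED` set and both function names are hypothetical, chosen only for illustration, and the set is certainly not exhaustive.

```python
import unicodedata

# Hypothetical allow-list of non-letter characters that appear in names.
# This is an assumption for illustration and is not exhaustive.
EXTRA_ALLOWED = {" ", "-", "'", "\u2019", "."}


def is_plausible_name_char(ch: str) -> bool:
    """Return True for letters, combining marks, and the assumed separators."""
    category = unicodedata.category(ch)  # two-letter code, e.g. "Lu", "Lo", "Mn", "So"
    return category[0] in ("L", "M") or ch in EXTRA_ALLOWED


def is_plausible_name(name: str) -> bool:
    """Return True if the name is non-empty and every character passes the check."""
    return bool(name) and all(is_plausible_name_char(ch) for ch in name)


if __name__ == "__main__":
    for sample in ["O'Brien", "Anne-Marie", "José", "山田太郎", "😀", "x²"]:
        print(repr(sample), is_plausible_name(sample))
```

With this check, Chinese characters pass because they fall in category `Lo` (Letter, other), while emoji are rejected because they fall in `So` (Symbol, other). Whether the categories cover every legal name in the world is exactly the open question in this thread, so a filter like this is better suited to flagging input for review than to rejecting it outright.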