weinzierl 4 days ago

Yes, exactly, human languages are complex, and in my opinion Unicode used to be on a good track to tackle these complexities. I just think that nowadays they are not doing enough to help people around the world solve these problems.

pas 4 days ago | parent [-]

can you describe a few examples? what are you missing? or maybe are you aware of something they rejected that would be useful?

weinzierl 4 days ago | parent [-]

The elephant in the room is Han Unification but there are plenty of other issues. Here is one of my favourites from another thread just two days ago.

https://news.ycombinator.com/item?id=44971254

This is the rejected proposal.

https://www.unicode.org/L2/L2003/03215-n2593-umlaut-trema.pd...

If you read the thread linked above, you will find more examples from other people.

pas 3 days ago | parent [-]

thanks! very interesting!

ah, and now I understand what the hell people mean when they put dots on "coordinate"! (but they are obviously wrong, they should use the flying point from Catalan :)

... hm, so this issue is easily more than 20 years old. and since then there's been no solution (or the German libraries consider the problem "solved" and ... no one else is making proposals to the WG about this nowadays)?

also, technically - since there are already more than 150K allocated code points - adding a different combining mark seems the correct way to go, right?

or is it now universally accepted that people who want to type ambigüité need to remember to type U+034F (the combining grapheme joiner) between the u and its diaeresis? (... or, of course, it's up to their editor/typesetter software to offer this distinction)
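
for what it's worth, here is a rough sketch of what that workaround looks like at the code point level. it assumes (as far as I understand the recommendation in the Unicode Standard) that the CGJ goes between the base letter and the combining diaeresis to mark a tréma, while a bare diaeresis is read as an umlaut. the names `nfc`, `strip_cgj`, `word_plain` and `word_marked` are made up for illustration, not taken from any library.

```python
import unicodedata

CGJ = "\u034f"        # COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # COMBINING DIAERESIS (one code point for umlaut and trema)


def nfc(text: str) -> str:
    """Normalize to NFC, the form most text ends up stored in."""
    return unicodedata.normalize("NFC", text)


def strip_cgj(text: str) -> str:
    """Drop the CGJ marker and renormalize (hypothetical helper).

    This throws the umlaut/trema distinction away again.
    """
    return nfc(text.replace(CGJ, ""))


# Without CGJ: u + diaeresis composes to the precomposed U+00FC.
word_plain = nfc("ambig" + "u" + DIAERESIS + "ité")

# With CGJ: the joiner has combining class 0, so it blocks composition
# and the sequence stays u, CGJ, diaeresis even after NFC.
word_marked = nfc("ambig" + "u" + CGJ + DIAERESIS + "ité")

for label, word in [("plain", word_plain), ("marked", word_marked)]:
    print(label, " ".join(f"U+{ord(ch):04X}" for ch in word))
```

so the marked spelling survives normalization as a distinct code point sequence, but nothing visible changes, and a plain `==` comparison or search against the unmarked spelling fails unless the software strips the CGJ first (which is what `strip_cgj` sketches).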

regarding Han unification, is there some kind of effort to "fix" that? (adding language-start/language-end markers perhaps? or virtual code points for languages, to avoid the need to search strings for the begin/end markers?)