Muromec a year ago

I can be called whatever I want, and in fact I have a perfectly reasonable name that fits neither ASCII nor the FN+LN convention. The thing is, your website accepting whatever UTF-8 blob my name can be serialized to today, without actually understanding it, makes my life worse, not better.

hobs a year ago | parent [-]

No, it allows an exact representation of your name, it doesn't do anything to your life.

If you don't like your name, either change it or go complain to your parents. They might tell you that your cultural reference point is more important than some person being able to read your name off of a computer screen.

If you want to store a phonetic name for the destination speaker that's not a bad idea, but a name is a name is a name. It is your unique identifier, do not munge it.

Muromec a year ago | parent [-]

But it does affect my life in a way you refuse to understand. That's the problem: there isn't a true canonical representation of a name (any name, really) that fits all practical purposes. Storing a bag of bytes to display back to the user is the easiest of those purposes, and suggesting a practice that solves only that is worse than rejecting Stępień. It's a refusal to understand the complexities, which eventually leads to doing the wrong thing and failing your user without even telling them.

>It is your unique identifier, do not munge it.

It's not a good identifier either. Nobody uses names as identifiers at any scale that matters for computers. You can't assume they don't have collisions, and you can't tell whether two bags of bytes identify the same person or two different people. They aren't even immutable, and sometimes they are mutated by accident.
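To make the "two bags of bytes" point concrete, here is a minimal Python sketch using only the standard `unicodedata` module. The helper names (`same_bytes`, `same_after_nfc`, `ascii_fold`) are made up for illustration. It only shows that one visible name can arrive as different code point sequences, and that stripping diacritics to "fix" this merges names that may belong to different people.

```python
import unicodedata

# The same visible name, "Stępień", in two Unicode forms.
composed = "St\u0119pie\u0144"                        # precomposed ę and ń
decomposed = unicodedata.normalize("NFD", composed)   # e + combining ogonek, n + combining acute


def same_bytes(a: str, b: str) -> bool:
    """Naive comparison: treat each name as an opaque UTF-8 blob."""
    return a.encode("utf-8") == b.encode("utf-8")


def same_after_nfc(a: str, b: str) -> bool:
    """Compare after canonical composition, which removes this one ambiguity."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


def ascii_fold(s: str) -> str:
    """Hypothetical 'munging' step: drop combining marks after decomposition."""
    return "".join(
        c for c in unicodedata.normalize("NFD", s) if not unicodedata.combining(c)
    )


print(same_bytes(composed, decomposed))       # the blobs differ
print(same_after_nfc(composed, decomposed))   # the normalized strings match
print(ascii_fold(composed) == ascii_fold("Stepien"))  # distinct spellings now collide
```

Normalization settles only the encoding question. It says nothing about whether two equal strings are the same person, or whether a transliterated record elsewhere refers to this one.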

soco a year ago | parent | next [-]

Then where is the problem? If the support can read Polish they will pronounce your name properly, if they're from India they will mess it up, so why should we have different expectations? Nobody will identify you by name anyway. They will ask what to call you (chatbots do this already) and then use all kinds of IDs and PINs and whatnot for proper identification. So we are talking here about a complexity that nobody actually needs, not even you. So let the name be saved and displayed in the nice native way, and you as the programmer make sure you don't go Bobby Tables with the strings.
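For the "don't go Bobby Tables" part, a minimal sketch with Python's built-in `sqlite3` module is below. The `users` table and the `save_name` helper are assumptions made up for this example. The name is passed as a bound parameter, so it is stored exactly as entered and never interpreted as SQL.

```python
import sqlite3


def save_name(conn: sqlite3.Connection, name: str) -> None:
    """Store the name verbatim; the ? placeholder keeps it out of the SQL text."""
    conn.execute("INSERT INTO users (name) VALUES (?)", (name,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

hostile = "Robert'); DROP TABLE users;--"   # the classic Bobby Tables input
save_name(conn, "Stępień")
save_name(conn, hostile)

# Both values come back unchanged and the table still exists.
rows = [r[0] for r in conn.execute("SELECT name FROM users ORDER BY id")]
print(rows)
```

This covers safe storage and display only, which is the narrow case being argued for here.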

Muromec a year ago | parent [-]

>if they're from India they will mess it up

Or not able to read at all.

>Then where is the problem?

Since you don't indicate for what purpose my name is stored, which may actually be display only, any of the following can happen:

A name as entered in your system is compared to a name entered in a different system or when you interface (maybe indirectly and unknowingly) with a system using different constraints or a different script, maybe imposed by their jurisdiction. As a result, the intended operation does not go through.

This may happen indirectly and invisibly to you: e.g. you produce an artifact, say an invoice, or issue a payment card using $script a, which I will only later figure out I can't use because it's expected to be in $script b or, even worse, to be in $script a and presumed to match the $script b they have on record. One of the non-obvious ways it can fail is when you try to determine whether two names in the same script are actually the same, to infer a family relationship or something else that you should not do anyway.

It may also happen within your system, in a way your CSR will deny is possible.

That's on the more severe side, and it means I will not try to use my name in any rendering that doesn't match the MRZ of my identity document, which is probably the opposite of what you intended by allowing an arbitrary bag of bytes to be entered. No, this is not a problem I made up because I'm bored; it's a real thing.

On the less severe side, not understanding names is a failure in the i18n department, because you can't support my language properly without understanding how my name should change when you address me, when you simply show it next to a user icon, and when you describe relations between me and objects and people. If you can't do proper i18n and a different provider can, you may lose me as a customer, because your attitude is presumed to be "everyone can just use ASCII and English". Yes, people exist that actually get it right because they put effort into this human aspect.

On the mildly annoying but inconsequential side, people also have a habit of trying to infer gender from names despite having gender clearly marked in their system.

hobs a year ago | parent | next [-]

Managing the canonical representation of your name in my system is one of the few things you are responsible for.

So many times I have had people ask me to customize name rendering, capitalize things, or build phonetic maps, all to avoid data entry or confusion. All these requests do is prove that you can't have a general solution to human names. You can hit a big percentage within a cultural context, but there are always exceptions and edge cases to the problem we're actually solving, which can be described as "please tell me your name when you call or whatever so I can pronounce it right".

soco a year ago | parent | prev [-]

>Or not able to read at all.

"Hello, how should we address you?". Not everything must be done in code.

>when you interface (maybe indirectly and unknowingly) with a system using different constraints

I have yet to encounter a system that recognizes assets and makes automatic decisions based on a name. It would already fail if the user swapped first and last name.

>people exist that actually get it right

You could have started by explaining this right way and we'd all be smarter.

hobs a year ago | parent | prev [-]

There's no such thing as a correct data structure that fits "all practical purposes".

There's no wrong thing - this is the best representation we can make given the system of record for the person's name.

They are definitely mutable, context dependent, and effectively data you cannot make assumptions about because of all those things.

If you want to do more than that, you need a highly constrained use case, and it's going to fail for "all practical purposes".