Remix.run Logo
squeaky-clean 6 hours ago

The easiest and most robust way to deal with email is to have 2 fields. string email, bool isValidated. (And you'll need some additional way to handle a time based validation code). Accept the user's string, fire off an email to it and require them to click a validation link or enter a code somewhere.

Email is weird and ultimately the only decider of a valid email is "can I send email to this address and get confirmation of receipt".

If it's a consumer website you can so some clientside validation of ".@.\\..*" to catch easy typos. That will end up rejecting a super small amount of users but they can usually deal with it. Validating against known good email domains and whatnot will just create a mess.

lock1 5 hours ago | parent | next [-]

In the spirit of "Parse, Don't Validate", rather than encode "validation" information as a boolean to be checked at runtime, you can define `Email { raw: String }` and hide the constructor behind a "factory function" that accepts any string but returns `Option<Email>` or `Result<Email,ParseError>`.

If you need a stronger guarantee than just a "string that passes simple email regex", create another "newtype" that parses the `Email` type further into `ValidatedEmail { raw: String, validationTime: DateTime }`.

While it does add some "boilerplate-y" code no matter what kind of syntactical sugar is available in the language of your choice, this approach utilizes the type system to enforce the "pass only non-malformed & working email" rule when `ValidatedEmail` type pops up without constantly remembering to check `email.isValidated`.

This approach's benefit varies depending on programming languages and what you are trying to do. Some languages offer 0-runtime cost, like Haskell's `newtype` or Rust's `repr(transparent)`, others carry non-negligible runtime overhead. Even then, it depends on whether the overhead is acceptable or not in exchange for "correctness".

squeaky-clean 2 hours ago | parent [-]

I would still usually prefer email as just a string and validation as a separate property, and they both belong to some other object. Unless you really only want to know if XYZ email exists, it's usually something more like "has it been validated that ABC user can receive email at XYZ address".

Is the user account validated? Send an email to their email string. Is it not validated? Then why are we even at a point in the code where we're considering emailing the user, except to validate the email.

You can use similar logic to what you described, but instead with something like User and ValidatedUser. I just don't think there's much benefit to doing it with specifically the email field and turning email into an object. Because in those examples you can have a User whose email property is a ParseError and you still end up having to check "is the email property result for this user type Email or type ParseError?" and it's very similar to just checking a validation bool except it's hiding what's actually going on.

mejutoco 4 hours ago | parent | prev [-]

My preferred solution would be:

You have 2 types

UnvalidatedEmail

ValidatedEmail

Then ValidatedEmail is only created in the function that does the validation: a function that takes an UnvalidatedEmail and returns a ValidatedEmail or an error object.

squeaky-clean 4 hours ago | parent [-]

That can work in some situations. One thing I won't like about it in some other situations is that you now have 2 nullable fields associated with your user, or whatever that email is associated with. It's annoying or even impossible in a lot of systems to have a guaranteed validation that user.UnvalidatedEmail or user.ValidatedEmail must exist but not both.

mejutoco 2 hours ago | parent [-]

I see. In my example they would be just types and internally a newtype string.

So an object could have a field

email: UnvalidatedEmail | ValidatedEmail

Nothing would be nullable there in that case. You could match on the type and break if not all cases are handled.