petalmind 3 hours ago

The imperative mood of "normalize" assumes that you had something non-normalized before you received that instruction. It's not useful when your table design strategy is already normalization-preserving, such as the most basic textbook strategy (a table per anchor, a column per attribute or 1:N link, a 2-column table per M:N link).
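A toy sketch of that textbook strategy, purely illustrative (made-up table names, SQLite via Python's sqlite3): one table per anchor, a column per attribute, and a 2-column table for the M:N link, so the schema is normalized by construction.

```python
import sqlite3

# One table per anchor (student, course), a column per attribute,
# and a 2-column table for the M:N enrolment link. Each fact lives
# in exactly one place, so nothing is left to "normalize" later.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT);   -- anchor
    CREATE TABLE course  (id INTEGER PRIMARY KEY, title TEXT);  -- anchor
    CREATE TABLE enrolment (                                    -- M:N link
        student_id INTEGER REFERENCES student(id),
        course_id  INTEGER REFERENCES course(id),
        PRIMARY KEY (student_id, course_id)
    );
    INSERT INTO student VALUES (1, 'Ada'), (2, 'Edgar');
    INSERT INTO course  VALUES (10, 'Databases');
    INSERT INTO enrolment VALUES (1, 10), (2, 10);
""")
# The course title is stored once; the link table holds only key pairs.
names = [n for (n,) in conn.execute("""
    SELECT s.name FROM student s
    JOIN enrolment e ON e.student_id = s.id
    WHERE e.course_id = 10 ORDER BY s.name
""")]
print(names)  # ['Ada', 'Edgar']
```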

And this is basically the main point of my critique of 4NF and 5NF. Both traditionally present an unexplained table that is supposed to be normalized. But it's not clear where this original structure comes from. Why are its own authors not aware of the (arguably, quite simple) concept of normalization?

It's like saying that, in order to implement an algorithm, you have to remove bugs from its original implementation. But where does this implementation come from?

The other side of this coin is that lots of real-world designs have a lot of denormalized representations that are often reasonably well engineered.

Because of that, if you, as a novice, look at a typical production schema armed with this "thou shalt normalize" instruction, you'll be confused.

This is my big teaching pet peeve.

Quarrelsome 3 hours ago | parent [-]

> But it's not clear where this original structure comes from. Why are its own authors not aware of the (arguably, quite simple) concept of normalization?

I find the bafflement expressed in both the article and the one linked extremely attractive. It made both a joy to read.

Were I to hazard a guess: might it be a consequence of the lack of disk space in those early decades, resulting in developers being cautious about defining new tables and failing to rationalise that the duplication in their tragic designs would waste more space?

> The other side of this coin is that lots of real-world designs have a lot of denormalized representations that are often reasonably well engineered.

Agreed, but as the OP comment stated, they usually started out normalised and then pushed out denormalised representations for nice contiguous reads.

As a victim of maintaining a stack on top of an EAV schema once upon a time, I have great appreciation for contiguous reads.
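(For anyone who has escaped that fate, a toy illustration of why EAV reads hurt, with made-up names: every attribute of every entity is its own row, so one logical record has to be reassembled from scattered rows instead of being read contiguously.)

```python
import sqlite3

# A toy EAV (entity-attribute-value) table: attributes of the same
# entity are spread across rows, interleaved with other entities.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE eav (entity INTEGER, attr TEXT, value TEXT)")
conn.executemany("INSERT INTO eav VALUES (?, ?, ?)", [
    (1, "name", "widget"), (2, "name", "gadget"),
    (1, "colour", "red"),  (2, "colour", "blue"),
])
# Reading entity 1 means collecting and pivoting its scattered rows,
# where a conventional table would hand back one row.
row = dict(conn.execute(
    "SELECT attr, value FROM eav WHERE entity = 1").fetchall())
print(row)  # {'name': 'widget', 'colour': 'red'}
```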

petalmind 3 hours ago | parent [-]

> Might it be a consequence of lack of disk space in those early decades

A plausible explanation of "normalization as a process" was actually found in https://www.cargocultcode.com/normalization-is-not-a-process... ("So where did it begin?").

I hope someday to find a technical report from around that time on migrating to a relational database.

Quarrelsome 3 hours ago | parent [-]

> Normalization-as-process makes sense in a specific scenario: When converting a hierarchical database model into a relational model.

That makes much more sense as reasoning.

If I can also hazard a second guess: I used to work in embedded in the 2000s, and it was absolutely insane how almost all of the elderly architects and developers would readily accept some fixed-width file format for data storage over a sensible solution like SQLite that offered transactionality and relational modelling out of the box. This creates a mindset where each datastore is effectively siloed and must contain all the information needed to perform its operation, potentially leading to these denormalised designs.

A bit weird, given that this was the waterfall era, which implies that "Big Design Up Front" wasn't actually doing any real thinking about modelling up front. But I've been in that room, and I think a lot of it was cargo cult. To deal with the insanity of simple file I/O as the data layer, I had to write a rudimentary atomicity system from scratch to fix the dumb corruption issues of their design, when I would have got that for free with SQLite.
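To make the "for free" part concrete, here's a toy sketch (made-up account table, Python's sqlite3): a transaction either commits in full or rolls back entirely, which is exactly the guarantee my hand-rolled atomicity layer had to fake over flat files.

```python
import sqlite3

# SQLite transactions are atomic: a crash mid-write can't leave the
# half-updated state that a partial write to a fixed-width file can.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE balance (account TEXT PRIMARY KEY, amount INTEGER)")
conn.execute("INSERT INTO balance VALUES ('a', 100), ('b', 0)")
conn.commit()

try:
    with conn:  # commits on success, rolls back on an exception
        conn.execute(
            "UPDATE balance SET amount = amount - 60 WHERE account = 'a'")
        raise RuntimeError("simulated crash mid-transfer")
except RuntimeError:
    pass

# The debit was rolled back; no torn write survived.
print(conn.execute(
    "SELECT amount FROM balance WHERE account = 'a'").fetchone()[0])  # 100
```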