| ▲ | jerf 7 hours ago | |||||||
In a roundabout way this article captures well why I don't really like thinking in terms of "normal forms", especially as a numbered list like that. The key insights are really 1. Avoid redundancy and 2. This may involve synthesizing relationships that don't immediately obviously exist from a human perspective. Both of those can be expanded on at quite some length, but I never found much value in the supposedly-blessed intermediate points represented by the nominally numbered "forms". I don't find them useful either for thinking about the problem or for communicating about it. Someone, somewhere writing down a list and that list being blessed with the imprimatur of Academic Approval (TM) doesn't mean it is actually useful... sometimes it just means that it made it easy to write multiple choice test questions. (e.g., "What does Layer 2 of the OSI network model represent? A: ... B: ... C: ... D: ..." to which the most appropriate real-world answer is "Who cares?") | ||||||||
| ▲ | da_chicken 44 minutes ago | parent | next [-] | |||||||
I still see value in the numbering. Breaking 1NF is essentially always incorrect. You're fundamentally limiting your system, and making it so that you will struggle to perform certain queries. Only break 1NF when you're absolutely 100% certain that nobody anywhere will ever need to do anything even slightly complex with the data you're looking at. And then, probably still apply 1NF anyways. Everyone that ever has to use your system is going to hate you when they find this table because you didn't think of the situation that they're interested in. "Why does this query use 12 CTEs and random functions I've never heard of and take 5 minutes to return 20,000 rows?" "You broke 1NF." 2NF is usually incorrect to break. Like it's going to be pretty obnoxious to renormalize your data using query logic, but it won't come up nearly as frequently. If it's really never going to come up that often in practical terms, then okay. 3NF and BCNF are nice to maintain, but the number of circumstances where they're just not practical or necessary starts to feel pretty common. Further, the complexity of the query to undo the denormalization will not be as obnoxious as it is for 1NF or 2NF. But if you can do it, you probably should normalize to here. 4NF and higher continue along the same lines, but increasingly gets to what feels like pretty arbitrary requirements or situations where the cost you're paying in indexes is starting to become higher than the relational algebra benefits. Your database disk usage by table report is going to be dominated by junction tables, foreign key constraints, and indexes, and all you're really buying with that disk space is academic satisfaction. | ||||||||
| ||||||||
| ▲ | sgarland 39 minutes ago | parent | prev | next [-] | |||||||
What always frustrates me is that when people on here discuss deeply technical and/or meta-aspects of programming (e.g. type theory), it's taken at face value, but the same is not true of databases. They are generally treated as a dumb data store that you can throw anything into, and when someone explains why that's a bad idea, or why an academic concept like normal form is still applicable, it's met with criticism. Even when it's purely performance-related, it usually gets a shrug, and "it's good enough." Cool, you're wrecking the B+tree, maybe don't do that. It's as if I said, "I'm using an array to store these millions of items that I later need to de-duplicate," and when someone suggests maybe using a set, I dismiss it. | ||||||||
| ▲ | petalmind 7 hours ago | parent | prev | next [-] | |||||||
> Someone, somewhere writing down a list and that list being blessed with the imprimatur of Academic Approval (TM) One problem is that normal forms are underspecified even by the academy. E.g., Millist W. Vincent "A corrected 5NF definition for relational database design" (1997) (!) shows that the traditional definition of 5NF was deficient. 5NF was introduced in 1979 (I was one year old then). 2NF and 3NF should basically be merged into BCNF, if I understand correctly, and treated like a general case (as per Darwen). Also, the numeric sequence is not very useful because there are at least four non-numeric forms (https://andreipall.github.io/sql/database-normalization/). Also, personally I think that 6NF should be foundational, but that's a separate matter. | ||||||||
| ||||||||
| ▲ | wolttam 6 hours ago | parent | prev | next [-] | |||||||
Why shouldn’t we care about layer 2? You can do really fun and interesting things at the MAC layer. | ||||||||
| ||||||||
| ▲ | awesome_dude 2 hours ago | parent | prev [-] | |||||||
The levels do the most important thing in computer science, give discrete and meaningful levels to talk/argue about at the watercolour | ||||||||