locknitpicker 17 hours ago

> Normalization is possible but not practical in a lot of cases: nearly every “legacy” database I’ve seen has at least one table that just accumulates columns because that was the quickest way to ship something.

Strong disagree. I'll explain.

Your argument would support the idea of adding a few columns to a table to get to a short time to market. That's ok.

Your comment does not come close to justifying why you would keep the columns in. Not in the slightest.

Tables with many columns create all sorts of problems and inefficiencies. Over-fetching is a problem all on its own. Even the code gets brittle, where each and every single tweak risks being a major regression.

Creating a new table is not hard. Add a foreign key, add the columns, do a standard parallel write migration. Done. How on earth is this not practical?

fiddlerwoaroof 7 hours ago | parent | next [-]

I’m not justifying the design but splitting a table with several billion rows is not a trivial task, especially when ORMs and such are involved. Additionally, it’s easier to get work scheduled to ship a feature than it is to convince the relevant players to complete the swing.

locknitpicker 6 hours ago | parent [-]

> I’m not justifying the design but splitting a table with several billion rows is not a trivial task, especially when ORMs and such are involved.

I don't agree. Let me walk you through the process.

- create the new table
- follow a basic parallel writes strategy
-- update your database consumers to write to the new table without reading from it
-- run a batch job to populate the new table with data from the old table
-- update your database consumers to read from the new table while writing to both old and new tables

From this point onward, just pick a convenient moment to stop writing to the old table and call the migration done. Then do the post-migration cleanup tasks.
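A minimal sketch of the parallel-writes steps above, assuming a SQLite database and hypothetical `users`/`user_profiles` tables (the thread never names a specific schema or engine):

```python
import sqlite3

# Hypothetical starting point: a `users` table that accumulated profile columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, bio TEXT, avatar_url TEXT)")
conn.execute("INSERT INTO users (name, bio, avatar_url) VALUES ('ada', 'mathematician', 'a.png')")

# Step 1: create the new table with a foreign key back to the old one.
conn.execute("""
    CREATE TABLE user_profiles (
        user_id INTEGER PRIMARY KEY REFERENCES users(id),
        bio TEXT,
        avatar_url TEXT
    )
""")

# Step 2: consumers write to BOTH tables (no reads from the new one yet).
def save_profile(user_id, bio, avatar_url):
    conn.execute("UPDATE users SET bio = ?, avatar_url = ? WHERE id = ?",
                 (bio, avatar_url, user_id))
    conn.execute("INSERT OR REPLACE INTO user_profiles (user_id, bio, avatar_url) VALUES (?, ?, ?)",
                 (user_id, bio, avatar_url))

# Step 3: batch-backfill rows that existed before dual writes began.
conn.execute("""
    INSERT INTO user_profiles (user_id, bio, avatar_url)
    SELECT id, bio, avatar_url FROM users
    WHERE id NOT IN (SELECT user_id FROM user_profiles)
""")

# Step 4: reads switch to the new table; writes keep going to both until cutover.
save_profile(1, 'mathematician and programmer', 'a2.png')
row = conn.execute("SELECT bio FROM user_profiles WHERE user_id = 1").fetchone()
print(row[0])  # mathematician and programmer
```

After a convenient cutover point, `save_profile` would drop the write to `users` and the old columns could be removed.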

> Additionally, it’s easier to get work scheduled to ship a feature than it is to convince the relevant players to complete the swing.

The ease of piling up technical debt is not a justification for keeping broken systems and designs. It's only ok to make a mess to deliver things because you're expected to clean up after yourself afterwards.

grey-area 16 hours ago | parent | prev | next [-]

There are sometimes reasons this is harder in practice. For example, say the business or even third parties have direct access to this db, with hundreds of separate apps/services relying on it (also an anti-pattern, of course, but not uncommon); that makes changing the db significantly harder.

Mistakes made early on and not corrected can snowball and lead to this kind of mess, which is very hard to back out of.

magicalhippo 11 hours ago | parent | prev [-]

> How on earth is this not practical?

Fine, but you still need to read in those 100+ fields. So now you gotta contend with 20+ joins just to pull in one record. Not more practical than a single SELECT in my opinion.

sublinear 9 hours ago | parent [-]

You don't need to join what you don't actually need. You also need to be careful writing your queries, not just the schema. The most common ones should be wrapped in views or functions to avoid the problem of everyone rolling their own later.
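A sketch of the "wrap the common join in a view" idea, again assuming SQLite and made-up `orders`/`order_shipping` tables:

```python
import sqlite3

# Hypothetical split schema: the common "full record" read is defined once
# as a view, so consumers don't each hand-roll the join later.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT)")
conn.execute("CREATE TABLE order_shipping (order_id INTEGER REFERENCES orders(id), address TEXT)")
conn.execute("INSERT INTO orders (customer) VALUES ('ada')")
conn.execute("INSERT INTO order_shipping VALUES (1, '1 Analytical Way')")

conn.execute("""
    CREATE VIEW order_details AS
    SELECT o.id, o.customer, s.address
    FROM orders o
    LEFT JOIN order_shipping s ON s.order_id = o.id
""")

# Consumers query the view; the join lives in exactly one place.
row = conn.execute("SELECT customer, address FROM order_details WHERE id = 1").fetchone()
print(row)  # ('ada', '1 Analytical Way')
```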

Performance generally isn't an issue for an arbitrary number of joins as long as your indices are set up correctly.
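To illustrate the indexing point: with an index on the joining column, the inner side of a join becomes an index seek rather than a full scan. A small SQLite sketch with hypothetical `parent`/`child` tables (`EXPLAIN QUERY PLAN` shows which access path the planner picks):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parent (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE child (parent_id INTEGER REFERENCES parent(id), payload TEXT)")

# Index the foreign-key column so the join can seek instead of scanning `child`.
conn.execute("CREATE INDEX idx_child_parent ON child(parent_id)")

# CROSS JOIN pins the join order in SQLite, so `child` is the inner loop.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT * FROM parent CROSS JOIN child WHERE child.parent_id = parent.id"
).fetchall()
for step in plan:
    print(step)  # the `child` step should mention idx_child_parent
```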

If you really do need a bulk read like that I think you want json columns, or to just go all in with a nosql database. Even then, the above regarding indexing is still true.
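For the JSON-column variant, a sketch using SQLite's built-in JSON functions and a made-up `events` table: one SELECT returns the whole blob, and individual fields can still be extracted (and, per the point above, indexed via an expression index) when needed:

```python
import sqlite3

# Hypothetical: the "wide" part of the record lives in one JSON column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER PRIMARY KEY, attrs TEXT)")
conn.execute("""INSERT INTO events (attrs) VALUES ('{"kind": "click", "x": 10}')""")

# A single SELECT fetches the whole record; json_extract pulls fields on demand.
kind = conn.execute(
    "SELECT json_extract(attrs, '$.kind') FROM events WHERE id = 1"
).fetchone()[0]
print(kind)  # click
```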