sigwinch28 | 16 hours ago
Or it’s simply an indicator of a schema that has not been excessively normalised (why create an addresses_cities table just to ensure no duplicate cities are ever written to the addresses table?)
valiant55 | 14 hours ago
It depends on when you see it, but I agree that DISTINCT shouldn't be used in production. If I'm writing a one-off query and DISTINCT gets me over the finish line, sparing me a few minutes, then that's fine.
echelon | 12 hours ago
DISTINCT, along with the aggregate functions, is fantastic for offline analytics queries. I find a lot of use for them in reporting and other non-production code.
sgarland | 10 hours ago
Because a postal code can uniquely identify a city/region/state (hell, in Ireland, the entire address is encapsulated in the postal code), but the reverse is not true. At scale, repeated low-cardinality columns matter a great deal.
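For anyone who wants to see the trade-off concretely, here is a minimal sketch using Python's built-in sqlite3 module. The table names (`addresses_flat`, `cities`, `addresses`) and the sample rows are made up for illustration and are not taken from any real schema. It shows that the flat layout needs DISTINCT to list cities, while the normalised layout stores each city name once and references it by a small integer key.

```python
import sqlite3


def build_demo():
    """Create an in-memory database holding both layouts (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        -- Denormalised: the city text is repeated on every address row.
        CREATE TABLE addresses_flat (
            id     INTEGER PRIMARY KEY,
            street TEXT NOT NULL,
            city   TEXT NOT NULL
        );
        -- Normalised: each city name is stored exactly once.
        CREATE TABLE cities (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL UNIQUE
        );
        CREATE TABLE addresses (
            id      INTEGER PRIMARY KEY,
            street  TEXT NOT NULL,
            city_id INTEGER NOT NULL REFERENCES cities(id)
        );
        """
    )
    rows = [
        ("1 Main St", "Springfield"),
        ("2 Main St", "Springfield"),
        ("3 Oak Ave", "Springfield"),
        ("9 Elm Rd", "Shelbyville"),
    ]
    conn.executemany(
        "INSERT INTO addresses_flat (street, city) VALUES (?, ?)", rows
    )
    # The UNIQUE constraint plus OR IGNORE keeps one row per city name.
    conn.executemany(
        "INSERT OR IGNORE INTO cities (name) VALUES (?)",
        [(city,) for _, city in rows],
    )
    # Look up the city's id by name while inserting each address.
    conn.executemany(
        "INSERT INTO addresses (street, city_id) "
        "VALUES (?, (SELECT id FROM cities WHERE name = ?))",
        rows,
    )
    return conn


def cities_flat(conn):
    # DISTINCT is required here because the city repeats per address.
    cur = conn.execute("SELECT DISTINCT city FROM addresses_flat ORDER BY city")
    return [r[0] for r in cur]


def cities_normalised(conn):
    # No DISTINCT needed: the table is already one row per city.
    cur = conn.execute("SELECT name FROM cities ORDER BY name")
    return [r[0] for r in cur]
```

Both functions return the same list of cities; the difference is whether deduplication happens at query time or is guaranteed by the schema. Whether the extra table is worth it depends on row counts and how wide the repeated column is.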