drillsteps5 a day ago

I would separate dimensional or relational modeling from data governance. You can dump all the data you have in one S3 bucket, and AWS can take care of near-real-time ingestion. So no need for staging, ODS, data warehouse, data marts, hub and spoke, all that jazz. And no data modeling required, just ingest as is and dump it there. Great.

Now what is this dump good for? It's just a bunch of bytes that now need to be interpreted. There are different perspectives (sales vs manufacturing vs procurement vs finance etc). There are data quality issues that need to be identified and resolved. There's PII and other compliance stuff. You have to watch out when granting access to sensitive information (ever dealt with payroll data? It's fun). Your data dump isn't doing any of that by itself. And I think people tend to simply stop at the data dump stage, then give access to analysts and data scientists and tell them to go build reports and outbound data feeds.

With obvious results.

OoooooooO a day ago

That's how you get 3 different values for a core KPI in 3 dashboards.

Then you look under the hood of the dashboards, only to see that not a single one follows the official definition of the business.
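
To make that concrete, here is a rough sketch of how it tends to happen. Everything in it is made up for illustration: the `Order` fields, the three dashboard functions, and especially the "official" definition (revenue excluding refunds and internal test orders) are assumptions, not anyone's real business rules.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    amount: float    # gross order amount, single currency assumed
    refunded: bool   # order was later refunded
    internal: bool   # internal/test order, not a real sale


# Tiny hypothetical dataset standing in for the raw dump.
ORDERS = [
    Order(100.0, refunded=False, internal=False),
    Order(50.0, refunded=True, internal=False),
    Order(30.0, refunded=False, internal=True),
]


def revenue_dashboard_a(orders):
    # Analyst A sums everything in the dump.
    return sum(o.amount for o in orders)


def revenue_dashboard_b(orders):
    # Analyst B remembers to drop refunds, but not internal orders.
    return sum(o.amount for o in orders if not o.refunded)


def revenue_dashboard_c(orders):
    # Analyst C drops internal orders, but not refunds.
    return sum(o.amount for o in orders if not o.internal)


def official_revenue(orders):
    # Assumed official definition: exclude refunds AND internal orders.
    return sum(o.amount for o in orders if not o.refunded and not o.internal)


if __name__ == "__main__":
    for fn in (revenue_dashboard_a, revenue_dashboard_b,
               revenue_dashboard_c, official_revenue):
        print(fn.__name__, fn(ORDERS))
```

Same three rows, four different "revenue" numbers, and none of the dashboards matches the official one. The fix is not more code: it is having one agreed definition living in one shared place that every report calls.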