bobro 6 hours ago

This article assumes that there is a person with dedicated time to validate the data. Imagine you want this data and ask for it, but the government says, “sorry, we have this data, but we read an article that said we can only publish it if we spend a lot of time validating it. This data changes frequently and we don’t have a chunk of a full-time data analyst’s salary to spend on it, so we just aren’t going to publish anything. We’d rather put out nothing than embarrass ourselves, so you can’t even try to validate it yourself.”

chaps 6 hours ago

In fact, government agencies will argue that they have zero legal obligation to clean the data, let alone figure out anything about the data, and that they're just giving you the data as-is. This happened to me on a FOIA call where I was trying to get data from the county state's attorney. They insisted they could only run a specific report and that they had no obligation to run any query, meaning I can't even get access to the data I need.

Clean vs not clean data is the wrong fight.

hermitcrab 4 hours ago

>we don’t have a chunk of a full-time data analyst’s salary to spend on it

I found the errors in a few minutes with a $198 tool.