stared 6 hours ago
I dislike the premise. I mean, good data is wonderful. But if institutions are expected to release clean data or nothing, it is almost always the latter. What is important is to offer as much methodology and as many caveats as possible, even informally. There is a difference between stating "data covers 72% of companies registered in..." and letting readers assume the data is complete and authoritative when it is not. (Source: 10 years ago I worked a lot with official data. All data requires cleaning.)
Mordisquitos 5 hours ago
But surely we should expect some basic sanity checks on published data? This isn't some petrol stations being placed in the middle of a field due to minor typos or bad rounding, or some petrol stations' prices being listed as all 1.00 £/l out of laziness, or even a case of all unknown locations being listed as 0°0'0" N, 0°0'0" E by default. What the author reports appears to be mistakes which should be rather trivially detectable on input.
freehorse 5 hours ago
I don't think these issues are close to the issues the article talks about. The author does not talk about data coverage, data collection methodologies, or missing values, but about data that is actually wrong, i.e. location coordinates, prices, and numbers that make no sense, including swapped latitude/longitude and misplaced decimal points. On the other hand, I agree that bad (but usually fixable) data is better than no data.
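For illustration only, here is a rough sketch of the kind of input check that would catch swapped coordinates and shifted decimal points. The bounding box, the price window, and every name in it (`Station`, `check_station`, and so on) are my own assumptions, not anything taken from the article or the dataset.

```python
from dataclasses import dataclass

# Assumed rough bounding box for the UK (illustrative, not authoritative).
LAT_RANGE = (49.0, 61.0)
LON_RANGE = (-9.0, 2.5)
# Assumed plausible pump price window, in GBP per litre.
PRICE_RANGE = (0.80, 3.00)


@dataclass
class Station:
    # Hypothetical record shape; real feeds will differ.
    name: str
    lat: float
    lon: float
    price_gbp_per_litre: float


def in_range(value, bounds):
    low, high = bounds
    return low <= value <= high


def check_station(s):
    """Return a list of problems found; an empty list means nothing was flagged."""
    problems = []

    if not (in_range(s.lat, LAT_RANGE) and in_range(s.lon, LON_RANGE)):
        # If the pair only fits the box after swapping, say so explicitly.
        if in_range(s.lon, LAT_RANGE) and in_range(s.lat, LON_RANGE):
            problems.append("latitude and longitude look swapped")
        else:
            problems.append("coordinates outside expected region")

    price = s.price_gbp_per_litre
    if not in_range(price, PRICE_RANGE):
        # Check whether moving the decimal point one or two places fixes it.
        shifted = any(
            in_range(price * factor, PRICE_RANGE)
            for factor in (0.01, 0.1, 10, 100)
        )
        if shifted:
            problems.append("price looks off by a decimal shift")
        else:
            problems.append("price outside plausible range")

    return problems
```

Calling `check_station(Station("B", -0.12, 51.5, 1.45))` flags the swapped pair, and a price entered as 145.9 is flagged as a likely decimal shift. Nothing here repairs the data; it only marks rows for a human to look at.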
sd9 6 hours ago
Agreed, pretty much all data is flawed. I still want my hands on it.
readthenotes1 5 hours ago
I read the premises as "1. At least look at it. 2. Have a way to fix it." Those seem reasonable asks. Edit to add: the tragedy of the school in Minab is an example of how bad things can go, and it just hints at how much worse bad data can be.