eythian 2 days ago

I was not aware there was an RFC for CSV, but the concept of "simple comma-separated UTF-8 CSV" is, in my experience, not something that exists. In a previous job, a chunk of my work was taking CSV files that were given to us and writing tooling to process them into a structured form for import elsewhere (typically we'd do a few test runs and then a cut-over with the final data, so it had to be scripted).

During this I saw just about every variant of CSV and character encoding known to man, often inside the same file. Once I had a file that had UTF-8, MARC-8, Latin1, and (yes really) VT100 control codes. All in one file.
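For illustration, here is a minimal sketch of the kind of per-line cleanup this implies. It is not the tooling I actually used. It assumes lines are newline-separated, tries UTF-8 first, falls back to Latin-1, and strips only ANSI/VT100 CSI sequences. The names `decode_line` and `decode_mixed` are made up for this example, and MARC-8 is not handled at all (it would be silently mis-decoded as Latin-1).

```python
import re
from typing import List, Tuple

# Matches ANSI/VT100 CSI sequences such as "\x1b[31m". Assumption: these are
# the only control codes present; other escape forms are not covered.
CSI_RE = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def decode_line(raw: bytes) -> Tuple[str, str]:
    """Decode one line and return (text, encoding_used)."""
    try:
        text, encoding = raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # Latin-1 maps every byte to a code point, so this never raises,
        # but it can silently mis-decode other encodings (e.g. MARC-8).
        text, encoding = raw.decode("latin-1"), "latin-1"
    return CSI_RE.sub("", text), encoding


def decode_mixed(data: bytes) -> List[Tuple[str, str]]:
    """Decode a file whose lines may each use a different encoding."""
    return [decode_line(line) for line in data.split(b"\n")]
```

Recording which encoding each line fell back to is deliberate: on a test run, the lines flagged as Latin-1 are the ones worth eyeballing before the final cut-over.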

All in all, I'd prefer a format that could actually be validated for some sort of correctness. (That said, another time I got an XML export from some software that was invalid XML, so...)