Freak_NL 3 days ago

If you ever write an HTTP endpoint that returns tabular data, you could quite reasonably return RFC 4180-style CSV.

However, if your API ever interfaces with users in a corporate environment, parsing simple comma-separated UTF-8 CSV is suddenly quite beyond the reach of whoever is nibbling at your endpoint. So why not write a simple little reusable bit of code that takes any simple tabular data (strings, numbers, and dates, in one or more sheets made up of rows and columns) and lets the caller choose the output format? A zip archive of CSV files (one per sheet), JSON, ODS, or XLSX; pick your poison.

I did just that, and while it is perfectly doable, any low-level, low-resource, low-dependency approach will mean actually touching the XML in LibreOffice's ODS (fine), and Microsoft's OOXML (…).
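
The interface ends up being roughly this shape (a hypothetical sketch in Python; the names and format strings are made up, not my actual code):

    from dataclasses import dataclass
    from datetime import date, datetime
    from typing import Sequence, Union

    # A cell holds one of the simple value types mentioned above.
    Cell = Union[str, int, float, date, datetime]

    @dataclass
    class Sheet:
        name: str
        rows: Sequence[Sequence[Cell]]

    def write_sheets(sheets: Sequence[Sheet], fmt: str) -> bytes:
        """Serialize the sheets; fmt is one of 'csv-zip', 'json', 'ods', 'xlsx'."""
        raise NotImplementedError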

This is how you write a date in a cell in both.

ODS:

            <table:table-row table:style-name="ro1">
                <table:table-cell office:value-type="date" office:date-value="2021-04-10T12:34:56" calcext:value-type="date">
                    <text:p>10/4/2021, 12:34</text:p>
                </table:table-cell>
            </table:table-row>
            
OK, a bit verbose, but trivial to implement. Format the date however you like — you'll probably use two different formatters on the same datetime instant.
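
Those two formatters look something like this (a Python sketch; the office:date-value attribute wants ISO 8601, the text:p content is whatever display format your users expect):

    from datetime import datetime

    dt = datetime(2021, 4, 10, 12, 34, 56)

    # Machine-readable attribute (office:date-value): ISO 8601, no timezone.
    value_attr = dt.strftime("%Y-%m-%dT%H:%M:%S")           # 2021-04-10T12:34:56
    # Human-readable cell text (text:p): pick any display format you like.
    display = f"{dt.day}/{dt.month}/{dt.year}, {dt:%H:%M}"  # 10/4/2021, 12:34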

XLSX (OOXML):

            <row r="1" ht="12.8">
                <c r="A1" s="1" t="n">
                    <v>39448.5</v>
                </c>
            </row>
Obviously, as you can all plainly see, the date here is 2008-01-01T12:00:00…

And of course it makes perfect sense to hardcode the cell coordinate there. It's not like you would dynamically generate a bunch of cells (…).
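
For the curious, the conversion to that serial number goes something like this (a sketch; it assumes the usual 1899-12-30 epoch that modern dates effectively use thanks to the inherited Lotus 1-2-3 leap-year bug):

    from datetime import datetime

    EXCEL_EPOCH = datetime(1899, 12, 30)

    def excel_serial(dt: datetime) -> float:
        # Whole days since the epoch, plus the time of day as a fraction of a day.
        delta = dt - EXCEL_EPOCH
        return delta.days + delta.seconds / 86400

    print(excel_serial(datetime(2008, 1, 1, 12, 0, 0)))  # 39448.5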

boricj 3 days ago | parent | next [-]

> However, if your API ever interfaces with users in a corporate environment, parsing simple comma-separated UTF-8 CSV is suddenly quite beyond the reach of whoever is nibbling at your endpoint

Excel can directly ingest a CSV file served over a URL as a data source, with the Accept header manually set to text/csv.

I wrote a backend once that supported this feature so that management could pull whatever data they wanted off an internal application without pestering me. They could literally take the URL of a page and pull it as a CSV file as-is.

Anybody who knows a bit of Excel can pull that data themselves by following a set of simple instructions.
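
A minimal sketch of such an endpoint with nothing but the Python standard library (the port and the data are made up):

    import csv, io
    from http.server import BaseHTTPRequestHandler, HTTPServer

    ROWS = [["id", "name"], [1, "Alice"], [2, "Bob"]]  # placeholder data

    class CsvHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            buf = io.StringIO()
            csv.writer(buf).writerows(ROWS)  # RFC 4180 CRLF line endings by default
            body = buf.getvalue().encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/csv; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8000), CsvHandler).serve_forever()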

Freak_NL 3 days ago | parent | next [-]

> Anybody who knows a bit of Excel can pull that data themselves by following a set of simple instructions.

That is very much possible. It is also completely impossible when you live in a country where Microsoft decreed that the C in CSV stands for semicolon, as far as Excel is concerned (no, seriously). Welcome to the Netherlands!

Now whether or not Excel can open a CSV file depends on the user's locale, which will inevitably vary, and of course on whether they are using Excel at all.

So yes, you could offer just CSV, but not if your user is a spreadsheet jockey and you would like to stay on good terms with your support staff.
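
If you do have to humour those locales, the delimiter is at least a one-knob change in most CSV writers; a sketch with Python's csv module (the example data is made up, and whether a given Excel installation actually wants ';' still depends on the user's regional settings):

    import csv

    # Decimal commas in the values are exactly why those locales use semicolons.
    rows = [["datum", "bedrag"], ["2021-04-10", "12,50"]]

    with open("export.csv", "w", newline="", encoding="utf-8") as f:
        csv.writer(f, delimiter=";").writerows(rows)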

1718627440 2 days ago | parent [-]

I don't use MS Office, but does Excel seriously not simply show a dialog where the user can select this, and maybe even auto-detect it? That's what I'm used to from the "subpar clone" (/s) from the Document Foundation.

kiicia 2 days ago | parent [-]

It mostly did that until they changed it a few versions ago. Now it sometimes does, sometimes does not, and sometimes falls flat on its face, depending on the exact context of your action…

noAnswer 3 days ago | parent | prev [-]

For all my life, whenever I've opened a CSV file in Excel via File > Open, all of its content ended up in column A. I always have to go via Import Data and specify the file encoding and what the C in CSV stands for.

eythian 2 days ago | parent | prev | next [-]

I was not aware there was an RFC for CSV, but the concept of "simple comma-separated UTF-8 CSV" is, in my experience, not something that exists. In a previous job, a chunk of my work was taking CSV files that were given to us and writing tooling to process them into a structured form for import elsewhere (typically we'd do a few test runs, and finally do a cut-over with the final data, so it had to be scripted).

During this I saw just about every variant of CSV and character encoding known to man, often inside the same file. Once I had a file that had UTF-8, MARC-8, Latin1, and (yes really) VT100 control codes. All in one file.

All in all, I'd prefer something that actually could be validated for some sort of correctness (this said, another time I got an XML export from some software that was invalid XML, so...)

jeroenhd 2 days ago | parent | prev [-]

Storing dates as numbers in a spreadsheet has been a thing since the first spreadsheet program I know of. Microsoft picked "days since 1900". If you're on UNIX, you may prefer using "1199188800" instead.
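
The two epochs are a fixed offset apart, so (ignoring time zones) converting is a one-liner; a sketch, assuming the usual serial number for 1970-01-01:

    # 25569 is the Excel serial number of 1970-01-01, the Unix epoch.
    def unix_to_excel_serial(ts: float) -> float:
        return ts / 86400 + 25569

    print(unix_to_excel_serial(1199188800))  # 39448.5, i.e. 2008-01-01T12:00:00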

Other than that, the difference is pretty minor. ODS is very verbose and stores the content of the cell twice for some reason, but the XML trees are essentially the same.

The best way for corporate interaction is to export to whatever the hell Microsoft Excel accepts as an external data source, because .xlsx files can natively import remote data that way. Hope your customers' computers are all configured for en_US mode, though, because CSVs aren't as universal as people pretend they are.

Freak_NL 2 days ago | parent [-]

I'm somewhat fine with the Unix epoch timestamp, but this thing is not just seconds or microseconds since x, it is days since x, plus the rest of the time bits as a fraction of a day. One second before the end of January 1st 2008 (23:59:59) is… 39448.9999884259.

Oh, and while 39448.5 is fine, 39448.0 makes Excel throw an error and refuse the whole document. Midnight January 1st 2008 is just 39448. The parser cannot handle 39448.0.
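
So whatever writes the value has to take care never to emit a trailing ".0" for whole days; a sketch (I haven't checked which other representations Excel chokes on):

    def format_serial(value: float) -> str:
        # Whole days must be written without a fractional part: "39448", not "39448.0".
        if value == int(value):
            return str(int(value))
        return repr(value)

    print(format_serial(39448.0))  # 39448
    print(format_serial(39448.5))  # 39448.5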