freetinker 3 days ago

The comma makes it more human-readable. What separator would you suggest?

snthpy 3 days ago | parent | next [-]

ASCII actually has dedicated characters for this: 0x1C-0x1F (the file, group, record, and unit separators). The problem is that they are non-printing.

Unicode has rendered analogs, U+241C-U+241F, but each takes three bytes in UTF-8 instead of one, which can significantly increase file size in large USV files.

So my ideal would be to store the data as ASV and have editors render it as USV.

https://github.com/SixArm/usv
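
A rough sketch of that "store ASV, display USV" idea in Python. The function names are my own (hypothetical, not taken from the linked repo), and it assumes a plain one-to-one mapping between 0x1C-0x1F and U+241C-U+241F:

```python
# Hypothetical helpers: translate between ASCII separators (0x1C-0x1F)
# and the Unicode Control Pictures that depict them (U+241C-U+241F).
ASV_TO_USV = str.maketrans("\x1c\x1d\x1e\x1f", "\u241c\u241d\u241e\u241f")
USV_TO_ASV = str.maketrans("\u241c\u241d\u241e\u241f", "\x1c\x1d\x1e\x1f")

def asv_to_usv(text: str) -> str:
    """Make the separators visible, e.g. for display in an editor."""
    return text.translate(ASV_TO_USV)

def usv_to_asv(text: str) -> str:
    """Convert back to the compact one-byte separators for storage."""
    return text.translate(USV_TO_ASV)

asv = "a\x1fb\x1f\x1ec\x1fd\x1f\x1e"   # two records of two units each
usv = asv_to_usv(asv)
print(usv)
# Compare encoded sizes: ASV separators are 1 byte each, USV ones 3 bytes each.
print(len(asv.encode("utf-8")), len(usv.encode("utf-8")))  # 10 22
```

For this tiny sample the USV form is more than twice the size, and the gap grows with the number of separators.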

snthpy 3 days ago | parent [-]

The benefit is that ASV / USV files are trivial to parse with simple string splitting, since you don't have to worry about nesting and quoting.

Here's an example of what a USV file looks like:

Folio1␟␞ Sheet1␟␞ a␟b␟␞ c␟d␟␞ ␝ Sheet2␟␞ e␟f␟␞ g␟h␟␞ ␝␜ Folio2␟␞ Sheet3␟␞ a␟b␟␞ c␟d␟␞ ␝ Sheet4␟␞ e␟f␟␞ g␟h␟␞ ␝␜
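
To illustrate the "simple string splitting" point, here is a minimal Python sketch that parses the example above. It is not the reference parser. It assumes each separator acts as a terminator (so the empty piece after the last one is dropped) and that whitespace around units is insignificant. Both are my assumptions, and `parse_usv` / `split_terminated` are hypothetical names:

```python
# Visible USV separators: file, group, record, unit.
FS, GS, RS, US = "\u241c", "\u241d", "\u241e", "\u241f"

def split_terminated(text, sep):
    """Split on sep, treating it as a terminator rather than a delimiter."""
    parts = text.split(sep)
    # Drop the leftover piece after the final terminator (assumed to be
    # empty or whitespace); empty pieces in the middle are kept.
    if parts and not parts[-1].strip():
        parts.pop()
    return parts

def parse_usv(text):
    """Return nested lists: files -> groups -> records -> units."""
    return [
        [
            [
                # Assumption: whitespace around a unit is not significant.
                [unit.strip() for unit in split_terminated(record, US)]
                for record in split_terminated(group, RS)
            ]
            for group in split_terminated(file, GS)
        ]
        for file in split_terminated(text, FS)
    ]

sample = "Sheet1␟␞ a␟b␟␞ c␟d␟␞ ␝ Sheet2␟␞ e␟f␟␞ ␝␜"
for group in parse_usv(sample)[0]:
    print(group)
```

No quoting or escaping logic is needed, as long as the data never contains the separator characters themselves.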

joz1-k 3 days ago | parent | prev | next [-]

The comma is too prevalent in the data to be a suitable separator. A semicolon would be a better choice.

r721 3 days ago | parent | prev [-]

"|" looks pretty good (and is relatively rarely used).