aquafox 2 hours ago

I really don't understand why people think it's a good idea to use CSV. In English settings, the comma can be used as a thousands delimiter in large numbers, e.g. 1,000,000 for one million; in German, the comma is used as the decimal separator, e.g. 1,50€ for 1 euro and 50 cents. And of course, commas can be used in free-text fields. Given all that, it is just logical to use TSV instead!
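
For what it's worth, here is a minimal sketch of the failure mode described above, and of the quoting that CSV libraries usually apply to work around it. The rows are made-up sample data, and Python's standard csv module is just one possible implementation:

```python
import csv
import io

# Hypothetical sample rows containing commas inside the fields.
rows = [
    ["amount", "note"],
    ["1,000,000", "one million, roughly"],
    ["1,50€", "German decimal comma"],
]

# The csv module quotes any field that contains the delimiter.
buf = io.StringIO(newline="")
csv.writer(buf).writerows(rows)
text = buf.getvalue()

# Reading it back with the same module restores the original fields.
round_tripped = list(csv.reader(io.StringIO(text, newline="")))

# A naive split on commas ignores the quotes and breaks the second row apart.
naive = text.splitlines()[1].split(",")

print(text)
print(round_tripped == rows)
print(len(naive))
```

So the commas only bite when either the writer or the reader skips the quoting rules, which is exactly what hand-rolled split(",") code tends to do.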

teekert 22 minutes ago | parent | next [-]

I learned to program at 33 or so (in bioinformatics), my first real lesson a couple of days in: "Never ever use csv". I've never used pd.read_csv() without sep="\t". Idk where csv came from, and who thought it was a good idea. It must have been pre-spreadsheet because a tab will put you in the next cell so tabs can simply never be entered into any table by our biologist colleagues.

I guess it's also why all our fancy (as in tsv++?) file types (like GTF and BED) are tab-based (or space-based). Those fields often have commas in cells for nested lists etc.

I wish sep="\t" was default and one would have to do pd.read/to_tsv(sep=",") for csv. It would have saved me hours and hours of work and idk cross the 79 chars much less often ;)

prerok an hour ago | parent | prev | next [-]

Funny story: I once bought and started up Galactic Civilizations 3.

It looked horrible; the textures just wouldn't load no matter what I tried. Finally, on a forum, some other user, presumably also from Europe, noted that you have to use a decimal point as the decimal separator (my locale uses a comma). And that solved the problem.

Balinares 33 minutes ago | parent | prev | next [-]

It's one of those things where people think, it's there, and it works.

The whole business of software engineering exists in the gap between "it works today on this input" and "it will also work tomorrow and the day after and after we've scaled 10x and rewritten the serialization abstraction and..."

See also: "Glorp 5.7 Turbo one-shot this for me and it works!"

andix 15 minutes ago | parent | prev | next [-]

JSON, just use JSON. Or XML, if you don't like JSON.

toolslive 10 minutes ago | parent [-]

JSON brings its own set of problems. For example, look at the Python-generated JSON below.

    >>> json.dumps({ "X" : 1 << 66 })
    '{"X": 73786976294838206464}'

What's the parsing result in JavaScript? What's the parsing result in Java?

andix 8 minutes ago | parent [-]

What's the difference to CSV?

    number,73786976294838206464

toolslive a minute ago | parent [-]

For CSV, I don't know how this comes out. It depends on the library/programming language. It might be 73786976294838210000 or it might throw an exception, or whatever. I'm just saying JSON will not solve your problems either.
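
A rough sketch of how the same digits come out differently depending on the parser, using only Python's standard library. Here parse_int=float is an assumption used to imitate parsers that map every number to an IEEE 754 double; it is not what JavaScript or Java libraries literally do. Note that 1 << 66 is a power of two and therefore exactly representable as a double, so the sketch uses (1 << 66) + 1 to make the rounding visible:

```python
import csv
import io
import json

big = 1 << 66  # 73786976294838206464, the value from the example above

# Python's json module parses integer literals into arbitrary-precision ints,
# so the round trip is exact here.
parsed = json.loads(json.dumps({"X": big}))
exact_in_python = parsed["X"] == big

# Assumption: emulate a parser that stores every number as a double.
# One more than 2**66 cannot be represented and is rounded.
odd = big + 1
as_double = json.loads(json.dumps({"X": odd}), parse_int=float)["X"]
lost_precision = int(as_double) != odd

# CSV carries no type information: the csv module returns plain strings,
# and what happens next depends on the code that converts them.
row = next(csv.reader(io.StringIO("number,73786976294838206464\n")))
csv_value = row[1]

print(exact_in_python, lost_precision, type(csv_value).__name__)
```

In both formats the text on disk is unambiguous; the difference is which numeric type the consumer picks when it converts that text.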

nxpnsv an hour ago | parent | prev | next [-]

Yes, but tabs also can appear in text fields. If you are free to pick not csv, then perhaps consider feather or parquet?
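
To illustrate the point about tabs, a small sketch with invented rows, assuming Python's standard csv module with delimiter="\t". A field containing a tab or a line break only survives because the writer quotes it, which is the same mechanism CSV relies on for commas:

```python
import csv
import io

# Hypothetical rows: one field contains a tab, another a line break.
rows = [
    ["gene", "comment"],
    ["BRCA1", "pasted\tfrom a spreadsheet"],
    ["TP53", "line one\nline two"],
]

buf = io.StringIO(newline="")
csv.writer(buf, delimiter="\t").writerows(rows)
tsv_text = buf.getvalue()

# A quote-aware reader restores the fields, embedded tab and newline included.
restored = list(csv.reader(io.StringIO(tsv_text, newline=""), delimiter="\t"))

print(restored == rows)
```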

mulmen 21 minutes ago | parent | prev | next [-]

If you snapped your fingers and removed CSVs from the world, your lights would go out within the hour and you'd starve within the week. Trillions of dollars in business are done every day with separated-values files and Excel computations. The human relationships solve the data issues.
