rscho 10 months ago

This typing behaviour, now in combination with strict tables, is a boon for biostats. When you get shitty data to be cleaned, you've got 3 main choices: 1. use slow and untyped scripting languages, 2. use a strictly typed database, meaning you'll have to clean your data in advance, or 3. load it all as strings into SQLite, then clean the data until it fits into a strict table with check constraints. IMO, it's pretty clear 3 is best by far!
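
For anyone who hasn't tried option 3, here is a rough sketch of what it can look like with Python's built-in sqlite3 module. The table names, columns, sample rows and validity rules (raw_patients, patients, age between 0 and 130, positive weight) are all made up for illustration, not taken from a real dataset. STRICT needs SQLite 3.37 or newer, so the sketch falls back to a plain table with a typeof() check when the bundled library is older.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# STRICT tables need SQLite 3.37+; on older builds, fall back to a plain table
# (the typeof() CHECK below still rejects non-integer ages in that case).
strict = " STRICT" if sqlite3.sqlite_version_info >= (3, 37, 0) else ""

# 1. Staging table: every column is TEXT, so any messy input loads without errors.
conn.execute("CREATE TABLE raw_patients (patient_id TEXT, age TEXT, weight_kg TEXT)")
raw_rows = [  # hypothetical sample data
    ("1", "34", "70.5"),
    ("2", " 51 ", "82"),       # stray whitespace
    ("3", "unknown", "64.2"),  # non-numeric age
    ("4", "29", "-1"),         # impossible weight
]
conn.executemany("INSERT INTO raw_patients VALUES (?, ?, ?)", raw_rows)

# 2. Target table: typed columns plus CHECK constraints.
conn.execute(f"""
    CREATE TABLE patients (
        patient_id INTEGER PRIMARY KEY,
        age        INTEGER NOT NULL
                   CHECK (typeof(age) = 'integer' AND age BETWEEN 0 AND 130),
        weight_kg  REAL NOT NULL CHECK (weight_kg > 0)
    ){strict}
""")

# 3. Clean while copying: trim whitespace, keep only all-digit ages and
#    positive weights, and cast the survivors to their real types.
conn.execute("""
    INSERT INTO patients (patient_id, age, weight_kg)
    SELECT CAST(patient_id AS INTEGER),
           CAST(trim(age) AS INTEGER),
           CAST(weight_kg AS REAL)
    FROM raw_patients
    WHERE trim(age) <> ''
      AND trim(age) NOT GLOB '*[^0-9]*'
      AND CAST(weight_kg AS REAL) > 0
""")

clean = conn.execute(
    "SELECT patient_id, age, weight_kg FROM patients ORDER BY patient_id"
).fetchall()

# Rows that did not make it across stay in the staging table for inspection.
rejected = conn.execute("""
    SELECT patient_id FROM raw_patients
    WHERE CAST(patient_id AS INTEGER) NOT IN (SELECT patient_id FROM patients)
    ORDER BY patient_id
""").fetchall()

# The typed table refuses bad values that bypass the cleaning step.
try:
    conn.execute("INSERT INTO patients VALUES (99, 'abc', 70.0)")
    bad_insert_rejected = False
except sqlite3.IntegrityError:
    bad_insert_rejected = True

print(clean, rejected, bad_insert_rejected)
```

The staging table keeps the original strings around, so you can loosen or tighten the WHERE clause and rerun the copy until the rejected set is something you are willing to fix by hand.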

coliveira 10 months ago | parent | next [-]

I agree that 3 is great, but you can also do that in any database, just create your input tables as string only and then perform the necessary operations to move them into typed tables.

rscho 10 months ago | parent [-]

Yes, but with SQLite there is much less ceremony (no server, etc.) and, most importantly, it can be used without talking to my institution's sysadmin, which is what I'm looking for when manipulating one-off datasets.

mb7733 10 months ago | parent [-]

But that advantage has nothing to do with accepting data of the wrong type into a column (by default).

kstrauser 10 months ago | parent | prev [-]

Is anyone using untyped languages much today, other than shell scripts?

colejohnson66 10 months ago | parent | next [-]

CMake is entirely stringly-typed as well. As in many shells, arrays/lists are just space-separated strings.

mdaniel 10 months ago | parent [-]

Pedantically that's not true, they're ';' delimited https://cmake.org/cmake/help/v3.31/command/list.html#:~:text...

The confusion comes from the fact that set() automatically coerces space-delimited items into a ;-delimited list:

  set(ONE alpha;beta)
  set(TWO alpha beta)
  list(LENGTH ONE one_len)
  list(LENGTH TWO two_len)
  message(FATAL_ERROR "one <<${ONE}>> length ${one_len}\ntwo <<${TWO}>> length ${two_len}")

emits

  one <<alpha;beta>> length 2
  two <<alpha;beta>> length 2

nilamo 10 months ago | parent | prev [-]

Most people via JavaScript...