rscho 3 days ago

This typing behaviour, now in combination with strict tables, is a boon for biostats. When you get shitty data to be cleaned, you've got 3 main choices: 1. use slow and untyped scripting languages, 2. use a strictly typed database, meaning you'll have to clean your data in advance, or 3. load it all as strings into SQLite, then clean the data until it fits into a strict table with check constraints. IMO, it's pretty clear 3 is best by far!
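
A rough sketch of what option 3 could look like with Python's built-in sqlite3 module. The table names (raw_patients, patients), the columns, the value ranges and the cleaning rules are all made up for illustration. STRICT needs SQLite 3.37 or newer, so the sketch drops the keyword on older builds and relies on the CHECK constraints alone.

```python
import sqlite3

# STRICT tables need SQLite >= 3.37; on older builds fall back to a plain table.
STRICT = " STRICT" if sqlite3.sqlite_version_info >= (3, 37, 0) else ""


def load_and_clean(rows):
    """Stage everything as text, then copy only the rows that fit the typed table."""
    con = sqlite3.connect(":memory:")

    # Staging table: every column is TEXT, so nothing is rejected on load.
    con.execute("CREATE TABLE raw_patients (id TEXT, age TEXT, weight_kg TEXT)")
    con.executemany("INSERT INTO raw_patients VALUES (?, ?, ?)", rows)

    # Target table: typed columns plus CHECK constraints (hypothetical ranges).
    con.execute(
        "CREATE TABLE patients ("
        " id INTEGER PRIMARY KEY,"
        " age INTEGER NOT NULL CHECK (age BETWEEN 0 AND 120),"
        " weight_kg REAL NOT NULL CHECK (weight_kg > 0)"
        ")" + STRICT
    )

    # Cleaning step: trim whitespace, accept a decimal comma, and keep only
    # rows whose fields look numeric and fall inside the allowed ranges.
    con.execute(
        """
        INSERT INTO patients (id, age, weight_kg)
        SELECT CAST(trim(id) AS INTEGER),
               CAST(trim(age) AS INTEGER),
               CAST(replace(trim(weight_kg), ',', '.') AS REAL)
        FROM raw_patients
        WHERE trim(id) <> '' AND trim(id) NOT GLOB '*[^0-9]*'
          AND trim(age) <> '' AND trim(age) NOT GLOB '*[^0-9]*'
          AND replace(trim(weight_kg), ',', '.') GLOB '[0-9]*'
          AND CAST(trim(age) AS INTEGER) BETWEEN 0 AND 120
          AND CAST(replace(trim(weight_kg), ',', '.') AS REAL) > 0
        """
    )
    return con
```

In practice you would loop: look at what is still stuck in raw_patients, add another cleaning rule, and re-run until everything you care about has landed in the typed table.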

coliveira 3 days ago | parent | next [-]

I agree that 3 is great, but you can do that in any database: just create your input tables as string-only, then perform the necessary operations to move the data into typed tables.

rscho 3 days ago | parent [-]

Yes, but with SQLite there is much less ceremony (no server, etc.) and, most importantly, it can be used without talking to my institution's sysadmin, which is what I'm looking for when manipulating one-off datasets.

mb7733 3 days ago | parent [-]

But that advantage has nothing to do with accepting data of the wrong type into a column (by default).
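
For anyone who hasn't seen the default behaviour being discussed, here is a small sketch using Python's built-in sqlite3 module. The helper name stored_types and the table t are just for illustration. It inserts values into an INTEGER column of an ordinary (non-STRICT) table and reports the storage class SQLite actually used.

```python
import sqlite3


def stored_types(values):
    """Insert values into an INTEGER column of a non-STRICT table and
    return the storage class SQLite chose for each one, in insert order."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t (x INTEGER)")
    con.executemany("INSERT INTO t VALUES (?)", [(v,) for v in values])
    return [row[0] for row in con.execute("SELECT typeof(x) FROM t ORDER BY rowid")]


# The text '42' is converted by the column's integer affinity,
# while 'abc' is accepted and stored as text.
print(stored_types(["42", "abc"]))
```

Declaring the same table as STRICT is what turns the second insert into an error instead of a silently stored string.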

kstrauser 3 days ago | parent | prev [-]

Is anyone using untyped languages much today, other than shell scripts?

colejohnson66 3 days ago | parent | next [-]

CMake is entirely stringly-typed as well. As in many shells, arrays/lists are just space-separated strings.

mdaniel 2 days ago | parent [-]

Pedantically that's not true, they're ';' delimited https://cmake.org/cmake/help/v3.31/command/list.html#:~:text...

The confusion comes from the fact that set() automatically coerces space-delimited items into a ;-delimited list

  set(ONE alpha;beta)
  set(TWO alpha beta)
  list(LENGTH ONE one_len)
  list(LENGTH TWO two_len)
  message(FATAL_ERROR "one <<${ONE}>> length ${one_len}\ntwo <<${TWO}>> length ${two_len}")

emits

  one <<alpha;beta>> length 2

  two <<alpha;beta>> length 2

nilamo 3 days ago | parent | prev [-]

Most people via JavaScript...