sdairs 4 days ago

In your experience, how are folks doing (1)? The post is about a framework that brings things like type safety and schema-as-code to assets in data infra, in a way that feels familiar from what is already common with Postgres. I'm not aware of much else out there that does that?

getnormality 3 days ago

Python, R, and Julia all have at least one package that defines a tabular data type. That means we can pass tables to functions, use them in classes, write tests for them, etc.

In all of these packages, the base tabular object you get is a local in-memory table. For manipulating remote SQL database tables, the best full-featured object API is provided by R's dbplyr package, IMHO.
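To make the "remote table as an object" idea concrete, here is a minimal sketch in Python using only the standard-library sqlite3 module. It is not dbplyr's API or any real package's API: `LazyTable`, `late_flights`, and the `flights` table are hypothetical names made up for illustration. The point is only that the table handle is an ordinary value you can pass to functions and test, and that no SQL runs until you ask for rows.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class LazyTable:
    """Hypothetical lazy handle to a database table (not a real library API)."""
    conn: sqlite3.Connection
    name: str
    columns: tuple = ("*",)
    conditions: tuple = ()  # SQL fragments using ? placeholders
    params: tuple = ()      # values bound to those placeholders

    def filter(self, condition, *params):
        # Return a new handle; nothing is sent to the database yet.
        return LazyTable(self.conn, self.name, self.columns,
                         self.conditions + (condition,), self.params + params)

    def select(self, *columns):
        return LazyTable(self.conn, self.name, columns,
                         self.conditions, self.params)

    def to_sql(self):
        # Compile the accumulated operations into a single SELECT statement.
        sql = f"SELECT {', '.join(self.columns)} FROM {self.name}"
        if self.conditions:
            sql += " WHERE " + " AND ".join(f"({c})" for c in self.conditions)
        return sql

    def collect(self):
        # Only here does the query actually run against the database.
        return self.conn.execute(self.to_sql(), self.params).fetchall()


def late_flights(table, minutes):
    """An ordinary function that takes a table and returns a table."""
    return table.filter("dep_delay > ?", minutes)


# An in-memory SQLite database stands in for the remote one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flights (carrier TEXT, dep_delay INTEGER)")
conn.executemany("INSERT INTO flights VALUES (?, ?)",
                 [("AA", 5), ("UA", 42), ("AA", 61)])

query = late_flights(LazyTable(conn, "flights"), 30).select("carrier", "dep_delay")
print(query.to_sql())
rows = query.collect()
print(rows)
```

Because `late_flights` only builds up a description of the query, it can be unit tested by checking `to_sql()` without touching real data, which is roughly the workflow a full-featured package gives you with far more verbs and proper SQL translation.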

I think Apache Spark, Ibis, and some other big data packages can be configured to do this too, but IMHO their APIs are not nearly as helpful. For those who (understandably) don't want to use R and need an alternative to dbplyr, Ibis is probably the best one to look at.