largbae 5 hours ago

This could use a bit more "why".

It mentions shortcomings of Parquet that this overcomes, but which ones? Certainly not wide tool support...

Why should one leave Parquet or ORC for this structure?

altairprime 5 hours ago | parent | next [-]

The ‘why’ is referenced in the bibliography at the end of the readme; this repo is not meant to be consumed standalone. Start with the paper instead:

https://doi.org/10.1145/3749163

dietr1ch 3 hours ago | parent | prev | next [-]

I also had no idea what they were talking about, but there are good points about how hardware-oblivious and somewhat global Parquet is around metadata.

I found this post interesting:

- https://medium.com/@reliabledataengineering/f3-the-future-pr...

skrtskrt 4 hours ago | parent | prev | next [-]

Yeah, it seems like most of this could be handled by putting some more dev hours into Parquet.

dj_axl 4 hours ago | parent | prev [-]

The paper mentions Parquet, ORC, Nimble, Lance, TSFile, Bullion, and BtrBlocks.