| ▲ | skeeter2020 7 hours ago | |||||||||||||||||||
>> some performance comparisons vs sqlite. That's not really the purpose; it's really a language-independent format so that you don't need to change it for say, a dataframe or R. It's columnar because for analytics (where you do lots of aggregations and filtering) this is way more performant; the data is intentionally stored so the target columns are continuous. You probably already know, but the analytics equivalent of SQLite is DuckDB. Arrow can also eliminate the need to serialize/de-serialize data when sharing (ex: a high performance data pipeline) because different consumers / tools / operations can use the same memory representation as-is. | ||||||||||||||||||||
| ▲ | mandeepj 5 hours ago | parent | next [-] | |||||||||||||||||||
> Arrow can also eliminate the need to serialize/de-serialize data when sharing (ex: a high performance data pipeline) because different consumers / tools / operations can use the same memory representation as-is. Not sure if I misunderstood, what are the chances those different consumers / tools / operations are running in your memory space? | ||||||||||||||||||||
| ||||||||||||||||||||
| ▲ | actionfromafar 6 hours ago | parent | prev [-] | |||||||||||||||||||
Thanks! This is all probably me using the familiar sqlite hammer where I really shouldn't. | ||||||||||||||||||||