rubenvanwyk 2 hours ago

I've never understood why people say the Feather file format isn't meant for "long-term" storage and prefer Parquet for that. Access is much faster from Feather and compression is better with Parquet, but Feather is really good.

sheepscreek 34 minutes ago | parent [-]

Honestly I think Arrow makes Feather redundant. To answer your question, Parquet is optimized for storage on disk: it can store data with compression to take less space, and might include clever tricks or some form of indices to query data from the file. Feather, on the other hand, is optimized for loading into memory. It uses the same representation on disk as it does in memory, with very little in the way of compression (if any), so it is not optimized for disk at all. BUT you can memory-map a Feather file and randomly access any part of it in O(1) time (I believe, but do your own due diligence :)
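
To make the memory-mapping point a bit more concrete, here is a rough sketch using only the Python standard library. It is not the Feather format: the file is a toy stand-in (a single column of uncompressed, fixed-width int64 values), and names like `toy_column.bin` and `read_value` are made up for illustration. It only shows why "same layout on disk as in memory" lets you jump straight to any value by computing its byte offset.

```python
import mmap
import os
import struct
import tempfile

ROW_COUNT = 1_000
ITEM = struct.Struct("<q")  # one little-endian int64 per value (assumed layout)

# Write a toy "column": fixed-width values laid out back to back, uncompressed.
path = os.path.join(tempfile.mkdtemp(), "toy_column.bin")
with open(path, "wb") as f:
    for i in range(ROW_COUNT):
        f.write(ITEM.pack(i * i))


def read_value(mm, index):
    """Return the value at `index` by computing its byte offset directly."""
    return ITEM.unpack_from(mm, index * ITEM.size)[0]


# Map the file into memory; nothing is parsed or decompressed up front,
# and each lookup touches only the bytes it needs.
with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
    first = read_value(mm, 0)
    middle = read_value(mm, 500)
    last = read_value(mm, ROW_COUNT - 1)
```

The lookup cost does not depend on where the value sits in the file, which is the O(1) behaviour described above. Once a file is compressed, or the values are variable-width, this direct offset arithmetic no longer works on its own, so treat the sketch as the best case.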