rithdmc 5 hours ago

Thanks, I'll look into this in the future. I don't need the most performant script, but this could change.

ertgbnm 5 hours ago | parent [-]

It's less about performance and more about ecosystem lock-in. It's a bit like imperial vs metric units. Why would you ever choose to learn imperial if you had the option to only ever use metric to begin with?

dec0dedab0de 4 hours ago | parent | next [-]

That's exactly why I am reluctant to do anything with Polars. They are actively running a company and trying to sell a product. At any point they could be acquired and change the license for new releases. Sure, you could fork it or stay on an older version, but if what they offer isn't compelling enough for you, then why take the risk?

Pandas on the other hand has been open source for almost two decades, and is supported by many companies. They have a governance board, and an active community. The risk of it going off the rails into corporate nonsense is much lower.

xpe 2 hours ago | parent [-]

I would broaden the list of risks:

- Pandas is interwoven into downstream projects. So it will be here to stay for a long time. This is good for maintenance and stability. Advantage: Pandas.

- OTOH, the Pandas experience is awful; this was obvious to many from the outset, and yet it persisted. I haven't tracked the history. But my guess would be the competition from Polars was a key pressure for improvement. Edge: Polars.

- Lots of Python projects are moving to Rust-backed tooling: uv, Polars, etc. Front-end users get the convenience of Python and tool-developers get the confidence & capabilities of Rust. Edge: Polars.

- Pandas has a governance structure not tied to one company. Polars does not. (comment above said this) Advantage: Pandas.

But this could change. Polars users could be (and may already be?) pressing for company-independent governance.

rithdmc 5 hours ago | parent | prev | next [-]

Because these are silly personal scripts. I'm not going to make sensible architectural decisions on something I run every now and then on my laptop. That's optimising too early.

short_sells_poo 4 hours ago | parent | prev [-]

For short scripts and interactive research work, pandas is still much better than polars. Polars works well when you know what you want.

When you are still figuring out things step by step, pandas does a lot of heavy lifting for you so you don't have to think about it.

E.g. I don't have to think about timeseries alignment, pandas handles that for me implicitly because dataframes can be indexed by timestamps. Polars has timeseries support, but I need to write a paragraph of extra code to deal with it.
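
A minimal sketch of the implicit alignment I mean, with made-up dates and values (the names `a`, `b` and `total` are just for illustration). Two series cover overlapping but different days, and adding them lines the rows up by timestamp without any join:

```python
import pandas as pd

# Two hypothetical daily series whose date ranges only partly overlap.
a = pd.Series(
    [1.0, 2.0, 3.0],
    index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-03"]),
)
b = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.to_datetime(["2024-01-02", "2024-01-03", "2024-01-04"]),
)

# Arithmetic aligns on the index labels, not on row position.
# Dates present in only one series come out as NaN.
total = a + b
print(total)
```

The result spans all four dates, with NaN on the two days that only one series covers. As far as I can tell, Polars has no index, so the equivalent there is an explicit join on the timestamp column before doing the arithmetic.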