crystal_revenge 6 hours ago
You can assert whatever you want, but Polars is a great answer. The performance improvements are secondary to me compared to the dramatic improvement in interface. Today all serious DS work will ultimately become data engineering work anyway. The time when DS can just fiddle around in notebooks all day has passed.
this_user 4 hours ago
Pandas is widely adopted and deeply integrated into the Python ecosystem. Meanwhile, Polars remains a small niche, and it's one of those hype technologies that will likely be dead in 3 years once most of its users realise that it offers them no actual practical advantages over Pandas. If you are dealing with huge data sets, you are probably using Spark or something like Dask already where jobs can run in the cloud. If you need speed and efficiency on your local machine, you use NumPy outright. And if you really, really need speed, you rewrite it in C/C++. Polars is trying to solve an issue that just doesn't exist for the vast majority of users.
SiempreViernes 2 hours ago
> Today DS work will ultimately become data engineering work anyway.

Oh yeah? Well, in my ivory tower the work stops being serious once it becomes engineering. How do you like that elitism?!