| ▲ | postalcoder 5 hours ago |
| I've migrated off of pandas to polars for my workflows to reap what has been, in my experience, a 10-20x speedup on average. I can't imagine anything bringing me back short of a performance miracle. LLMs have made syntax almost a non-barrier. |
|
| ▲ | lvl155 4 hours ago | parent | next [-] |
| Went from pandas to polars to duckdb. As mentioned elsewhere, SQL is the most readable for me, and an LLM does most of the coding on my end (quant). So I need it at the most readable and rudimentary/step-wise level. OT, but I can’t imagine data science being a job category for too long. It’s got to be one of the first to go in AI age especially since the market is so saturated with mediocre talents. |
| |
| ▲ | iugtmkbdfil834 2 hours ago | parent [-] | | << It’s got to be one of the first to go in AI age especially since the market is so saturated with mediocre talents. This is interesting. I wanted to dig into it a little since I am not sure I am following the logic of that statement. Do you mean that AI would take over the field, because by default most people there are already not producing anything that a simple 'talk to data' LLM won't deliver? | | |
| ▲ | mynameisash an hour ago | parent [-] | | Not GP, but as a data engineer who has worked with data scientists for 20 years, I think the assessment is unfortunately true. I used to work on teams where DS would put a ton of time into building quality models, gating production with defensible metrics. Now, my DS counterparts are writing prompts and calling it a day. I'm not at all convinced that the results are better, but I guess if you don't spend time (=money) on the work, it's hard to argue with the ROI? |
|
|
|
| ▲ | mritchie712 5 hours ago | parent | prev | next [-] |
| also migrated, but to duckdb. It's funny to look back at the tricks that were needed to get gpt3 and 3.5 to write SQL (e.g. "you are a data analyst looking at a SQL database with table [tables]"). It's almost effortless now. |
|
| ▲ | howling 4 hours ago | parent | prev | next [-] |
| Same. I don't even use an LLM normally, as I found polars' syntax to be very intuitive. I just searched my ChatGPT history and the only times I used it were when I was dealing with list and struct columns, which don't exist in pandas. |
| |
| ▲ | postalcoder 4 hours ago | parent [-] | | iirc part of pandas’ popularity was that it modeled some of R’s ergonomics. What a time in history, when such things mattered! (To be clear, I’m not making fun of pandas. It was the bridge I crossed that moved me from living in Excel to living in code.) | | |
| ▲ | iugtmkbdfil834 2 hours ago | parent [-] | | I learned about pandas with R in my class way back when. At the time, it seemed like magic. In a sense, it still does, but things evolve. |
|
|
|
| ▲ | thibaut_barrere 4 hours ago | parent | prev | next [-] |
| Polars being so fast, and embeddable into other languages, has made it a no-brainer for me to adopt it. I have integrated Explorer https://github.com/elixir-explorer/explorer, which leverages it, into many Elixir apps, so I'm happy to have this. |
| |
|
| ▲ | gHA5 5 hours ago | parent | prev | next [-] |
| Do you not experience LLM generated code constantly trying to use Pandas' methods/syntax for Polars objects? |
| |
| ▲ | edschofield 4 hours ago | parent | next [-] | | Yes, ChatGPT 5.2 Pro absolutely still does this. Just ask it for a pivot table using Polars and it will probably spit out code with Pandas arguments that doesn’t work. | | | |
| ▲ | postalcoder 5 hours ago | parent | prev [-] | | There were some growing pains in the gpt-3.5 to gpt-4 era, but not nowadays (shoutout to the now-defunct Phind, which was a game changer back then). | | |
| ▲ | crimsoneer 4 hours ago | parent [-] | | The fact they pivoted away from their very compelling core offering (AI stack overflow) to compete with Lovable etc. in the "AI generated apps" giant fight continues to baffle me. Though I guess model updates ate their lunch. | | |
| ▲ | postalcoder 4 hours ago | parent [-] | | My guess is that their pivot came after distress, and was not the cause of it. It'd be great to have @rushingcreek write a post-mortem. I think it'd benefit a lot of people because I honestly don't have a Monday-morning playbook of what could have saved them. Like you said, perhaps the demise of Phind was inevitable, with large models displacing them kind of like how Spotify displaced music piracy. |
|
|
|
|
| ▲ | thegabriele 2 hours ago | parent | prev | next [-] |
| " 10-20x speedup on average. " Is this everyone's experience? |
| |
| ▲ | OGWhales 24 minutes ago | parent | next [-] | | It depends on the specifics, but I recently converted a couple of scripts that took minutes to run with Pandas and only seconds to run with Polars. I was pretty impressed. | |
| ▲ | mynameisash an hour ago | parent | prev [-] | | That was probably about what I got when I migrated some heavy number crunching code from Pandas to Polars a few years ago. Maybe even better than that. |
|
|
| ▲ | alex7o 5 hours ago | parent | prev | next [-] |
| Same. Polars also works in TypeScript, which I used at some point to move my data from backend to frontend. |
|
| ▲ | OutOfHere 4 hours ago | parent | prev [-] |
| The speedup you claim is going to be contingent on how you use Pandas, with which data types, and which version of Pandas. |