vegabook 2 hours ago

"revolutionary"? It just copied and pasted the decades-old R (previously "S") dataframe into Python, including all the paradigms (with worse ergonomics, since it's not baked into the language).

sampo 17 minutes ago | parent | next [-]

This is an interesting question.

Dataframes first appeared in S-PLUS in 1991-1992. Then R copied S, and from roughly 1995-1997 onwards R started to grow in popularity in statistics. As free and open source software, R started to take over the market among statisticians and other people who had been using other statistical software, mainly SAS, SPSS and Stata.

Given that S and R existed, why were they mostly not picked up by data analysts and programmers in 1995-2008, and only Python and Pandas made dataframes popular from 2008 onwards?

data-ottawa an hour ago | parent | prev | next [-]

No other modern language will compete with R on ergonomics because of how it allows functions to read the context they’re called in, and S expressions are incredibly flexible. The R manual is great.

To say pandas just copied it but worse is overly dismissive. The core of pandas has always been indexing/reindexing, split-apply-combine, and slicing views.

It’s a different approach from R’s data tables or frames.
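
As a rough illustration of those three ideas (reindexing, split-apply-combine, and label-based slicing), here is a minimal sketch. The column names, labels, and values are made up for the example, and it only shows the basic calls, not how pandas implements them internally:

```python
import pandas as pd

# Hypothetical toy data: four readings labelled "a" through "d".
df = pd.DataFrame(
    {
        "city": ["Oslo", "Oslo", "Rome", "Rome"],
        "temp": [3.0, 5.0, 14.0, 16.0],
    },
    index=["a", "b", "c", "d"],
)

# Reindexing: align the rows to a new set of labels.
# Labels not present in the original ("e") are filled with NaN.
reindexed = df.reindex(["d", "c", "b", "a", "e"])

# Split-apply-combine: split rows by city, apply a mean to each group,
# and combine the results into one Series indexed by city.
mean_temp = df.groupby("city")["temp"].mean()

# Label-based slicing: .loc slices include both endpoints.
middle = df.loc["b":"c"]

print(reindexed)
print(mean_temp)
print(middle)
```

The row labels carry through each operation, which is the part that feels different from treating a frame as a plain table of columns.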

xtracto an hour ago | parent | prev | next [-]

Exactly. I was programming in R in 2004 and Pandas didn't exist. I remember trying Pandas once and it felt unergonomic for data analysis, and it lacked R's vast collection of statistical analysis libraries.

BeetleB an hour ago | parent | prev [-]

It was revolutionary to Python. Without NumPy and Pandas, ML in Python would never have been a thing.

(Yes, yes - I know some people wish that were the case!)