Remix.run Logo
Lyngbakr 15 hours ago

I was a bit disappointed to discover that this was essentially an R vs. Python article, which is a data science trope. I've been in the field for 20+ years now and while I used to be firmly on team R, I now think that we don't really have a good language for data science. I had high hopes for Julia and even Clojure's data landscape looks interesting, but given the momentum of Python I don't see how it could be usurped at this point.

vkazanov 15 hours ago | parent | next [-]

It is EVERYWHERE. I recently had to interview a bunch of data scientists, and only one of them knew SQL. Surely, all of then worked with python. I bet none of them even heard of R.

garciasn 15 hours ago | parent | next [-]

SAS > R > Python.

The focus of SAS and R were primarily limited to data science-related fields; however, Python is a far more generic programming language, thus the number of folks exposed to it is wider and thus the hiring pool of those who come in exposed to Python is FAR LARGER than SAS/R ever were, even when SAS was actively taught/utilized in undergraduate/graduate programs.

As a hiring leader in the Data Science and Engineering space, I have extensive experience with all of these + SQL, among others. Hiring has become much easier to go cross-field/post-secondary experience and find capable folks who can hit the ground running.

username135 15 hours ago | parent [-]

you beat me to it. i understand why sas gets hate but I think that comes with simply not understanding how powerful it is.

garciasn 15 hours ago | parent [-]

It was a great language, but it was/is extremely cost-prohibitive plus it simply fell out of favor in academia, for many of the same reasons, and thus was supplanted by free alternatives.

Lyngbakr 15 hours ago | parent | prev [-]

Yikes. Were they experienced data scientists or straight out of school? I find it very odd (and a bit scary) that they didn't know SQL.

garciasn 15 hours ago | parent [-]

Experienced Data Scientists and/or those straight out of school are EXTREMELY lacking in valuable SQL experience and always have been. Take a DS with 25 years experience in SAS, many of them are great with DATAstep, but have far less experience using PROC SQL for querying the data in the most effective way--even if they were pulling the data down with pass-through via SAS/ACCESS.

Often they'd be doing very simplistic querying and then manipulating via DATAstep prior to running whatever modeling and/or reporting PROCs later, rather than pushing it upstream into a far faster native database SQL pull via pass-through.

Back in 2008/2009, I saved 30h+ runtime on a regular report by refactoring everything in SQL via pass-through as opposed to the data scientists' original code that simply pulled the data down from the external source and manipulated it in DATAstep. Moving from 30h to 3m (Oracle backend) freed up an entire FTE to do more than babysit a long-running job 3x a week to multiple times per day.

SiempreViernes 15 hours ago | parent | prev | next [-]

What would it even mean to be a "good language for data science"?

In the first place data science is more a label someone put on bag full of cats, rather than a vast field covered by similarly sized boxes.

username135 15 hours ago | parent | prev [-]

SAS has entered the chat