dofm 4 hours ago

R slop. Oof.

What an awful thing to imagine. It's already the programming language of choice for egregious abuses of good practice.

ActionHank 4 hours ago | parent | next [-]

I do wonder if there isn't enough computer science / software engineering that is being taught as part of data science.

People I've worked with that used R and managed data / did analysis didn't really seem too concerned with long term maintenance.

Secondary observation: these same people were the first to preach the AI coding gospel.

dofm 4 hours ago | parent | next [-]

One of the things that always reassures me about LLMs is that as well as being trained on languages with reasonably well-designed grammars, they will also have seen lots of examples of good practice in their training set.

Two things that make me wonder if they can possibly turn out good quality R.

Perhaps a true test of AGI will be when you ask it to write an application in R and it refuses for fear of what people might think.

mr_toad 4 hours ago | parent | prev | next [-]

> People I've worked with that used R and managed data / did analysis didn't really seem too concerned with long term maintenance.

Unless you’re the poor schmuck who is given the task of running the code written by the previous analyst, who has probably already left the company. Often it’s easier to just throw something together from scratch and then look for a new job, perpetuating the problem.

ngriffiths 3 hours ago | parent | prev | next [-]

At my job I switch between writing analysis code for research projects and writing code for apps. The difference in mindset is so dramatic. In the same way that good software has consistent names and interfaces that are ~useless when you just need the code to run once, research code has its own requirements that are ~useless in software. It's honestly a big challenge to switch back and forth. So I think it just reflects the main skillset of the people who use it (caring is not enough).

mjhay 4 hours ago | parent | prev [-]

Bingo. The typical data scientist has a master's or PhD in a non-CS quantitative field, and has had exactly zero CS or software eng classes. It’s a shame, because once you get over some of the idiosyncrasies, R is a really powerful and flexible functional language.

buellerbueller 4 hours ago | parent | prev [-]

Conversely, it is the programming language of choice for people who don't assume that their expertise in one domain (data science) translates into expertise in the whole of human knowledge (as we often see among techbros generally and here specifically).

As a working data scientist, I know I am not a computer scientist or a 10x engineer (hell, I am probably a 0.8x engineer), but that's not where my expertise is. My engineer co-workers are 0.01x data scientists, but you won't see me complaining that they don't know the Central Limit Theorem or how to build a causal inference engine.

malshe 2 hours ago | parent | next [-]

Your comment reminds me of a techbro blog I came across a few years ago. He was an influential "data scientist" on Twitter with a CS rather than a stats/econometrics background. In this post, he literally used linear regression on a categorical dependent variable. He just relabeled the categories 1, 2, 3, etc. Worse, when people pointed out the problem to him, he couldn't understand what was wrong with it and started pushing back.
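For anyone wondering what goes wrong with that approach, here is a minimal sketch. The toy data, the category names, and the helper functions are all made up for illustration and have nothing to do with the actual blog post. It fits ordinary least squares to the same data under two equally arbitrary integer codings of the categories and shows that the predicted category for the same input changes with the coding.

```python
def ols_fit(x, y):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def predict_category(x, labels, coding, x_new):
    """Regress the integer codes on x, then snap the prediction to the nearest code."""
    y = [float(coding[label]) for label in labels]
    intercept, slope = ols_fit(x, y)
    pred = intercept + slope * x_new
    return min(coding, key=lambda label: abs(coding[label] - pred))


# Hypothetical toy data: three unordered categories, one numeric feature.
x = [0.0, 0.0, 1.0, 1.0, 2.0, 2.0]
labels = ["A", "A", "B", "B", "C", "C"]

# Two integer codings; neither is more "correct" than the other,
# because the categories have no inherent order.
coding_1 = {"A": 1, "B": 2, "C": 3}
coding_2 = {"A": 1, "B": 3, "C": 2}

pred_1 = predict_category(x, labels, coding_1, 1.0)
pred_2 = predict_category(x, labels, coding_2, 1.0)
print(pred_1, pred_2)
```

The fitted line treats the codes as if they had meaningful order and spacing, so the answer depends on a labeling choice that carries no information. Models built for categorical outcomes, such as multinomial logistic regression, do not have this dependence.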

It's been a while so I don't remember any details. I don't go on Twitter/X as much as I used to in those days.

4 hours ago | parent | prev [-]
[deleted]