| ▲ | Qem 4 days ago |
| I wonder how close R came to also taking over the scientific computing/machine learning space, instead of Python's numpy/scipy ecosystem. |
|
| ▲ | teruakohatu 4 days ago | parent | next [-] |
| I love and use R, but it never became the dominant ML language, in part because it has three (or more) different object systems and many libraries sort of use their own style. This makes it seem a bit disjointed, in a way that other languages don’t. The R community should have anointed one object system and made tidyverse a core part of R. All that said, R is fantastic and the depth of libraries is extensive. Libs are often written by the original researchers who developed the method. At some academic institutions an R package is counted as a paper. |
| |
| ▲ | paddleon 4 days ago | parent | next [-] | | > The R community should have anointed one object system and made tidyverse a core part of R. Not a tidyverse fan. It doesn't scale well. Learn data.table, which has a much more R-like interface and is fast, fast, fast even for large data sizes. It is more powerful and more expressive than pandas and, again, faster. See https://cran.r-project.org/web/packages/data.table/vignettes... | | |
| ▲ | mscbuck 4 days ago | parent | next [-] | | And if you still prefer the language of tidyverse, use tidytable and you get the best of both worlds! | |
| ▲ | vovavili 4 days ago | parent | prev [-] | | I think these days it might be wiser to use r-polars, at the very least because it has more available documentation around it. |
| |
| ▲ | mvieira38 4 days ago | parent | prev | next [-] | | Agree 100% on tidyverse becoming part of the standard library. Some of the language's greatest libraries (like Hyndman's forecasting stuff) basically assume you're using tidyverse already | | |
| ▲ | kjkjadksj 4 days ago | parent [-] | | Screw that. I do everything that tidyverse does with the standard library already. No clue why the wheel had to be reinvented just to make a plot. |
| |
| ▲ | tylermw 4 days ago | parent | prev | next [-] | | The developing S7 object system (https://github.com/RConsortium/S7) is looking fairly promising in that it combines many of the nice properties of S3 and S4 (validation, multiple dispatch, sane constructors) while still being fairly simple and straightforward to use. | | |
| ▲ | northlondoner 4 days ago | parent [-] | | Excellent news. Quite promising, but R's power is that it is natively functional. Even binary operations are functions: `+`(x, y) works the same as x + y. |
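For readers more at home in Python, a rough analogue of the point above: Python exposes its operators as plain functions through the standard `operator` module, although, unlike in R, the infix form is not literally a call to that function. A minimal sketch, with illustrative values only:

```python
import operator
from functools import reduce

x, y = 2, 3

# The infix form and the function form compute the same result.
assert x + y == operator.add(x, y)

# Because the operator is an ordinary function, it can be passed around
# like any other value, for example folded over a list.
total = reduce(operator.add, [1, 2, 3, 4])
print(total)
```

In R the equivalence is more direct, since `x + y` is itself a call to the function named `+`.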
| |
| ▲ | clircle 4 days ago | parent | prev [-] | | I have a feeling that most data scientists using R have no need to touch any of the object systems; it is hard to believe that would be a deal breaker. | | |
| ▲ | teruakohatu 4 days ago | parent [-] | | > most data scientists It's hard to generalise for all data scientists everywhere, but that is not my experience. Data transformation (80% of the job) is very functional and so object systems don't matter much. But when you are training neural nets in Python you are probably using a framework of some type. Torch in R looks very object-oriented. The issue is not that object orientation is fundamentally needed for data science, but that when you install a random object-oriented R library you get a random R object system or pseudo-object system that needs to be reasoned about. It is a pity R didn't just ditch object systems or adopt a limited, simple system like Lua's table approach. |
|
|
|
| ▲ | analog31 4 days ago | parent | prev | next [-] |
| I think that early on, people started using Python because they liked the language, before they used it for numerical computation. Many people were satisfied with an installation of Matlab, C, or whatever, on their desks. But they started using Python as a scripting language, before asking: Wouldn't it be nice if we could use Python for everything? For instance in my own case, my first use of Python was outside of mainstream scientific computing. I needed something to install on lab computers, for data acquisition and automation. And it needed to be free because my employer was under a spending freeze after the 2008 financial meltdown. Oh, and I also wanted something for hobby projects, that would be equally at home on Windows or Linux. So I think the quality of the language came first. |
|
| ▲ | cactusfrog 4 days ago | parent | prev | next [-] |
| The issue with R is that there is too much DSL. This is great for one-off analysis but makes building a cohesive large code base really difficult. |
| |
| ▲ | UpsideDownRide 4 days ago | parent [-] | | Yeah that's def part of it. As fun as it is there is just too much of it and people jump for it too readily, tidyverse included. |
|
|
| ▲ | mhog_hn 4 days ago | parent | prev | next [-] |
| One general purpose web framework away |
| |
| ▲ | rjdj377dhabsn 4 days ago | parent [-] | | I disagree. R is just not a very nice language. It has some really great statistical and data science packages that were well ahead of the competition 10-15 years ago. The web frameworks were good enough for dashboards and what most people were using R for. But if you want to write fast and elegant non-vectorized code, R is really lacking. I left it for Julia for that reason. | | |
| ▲ | mvieira38 4 days ago | parent | next [-] | | How is Julia in terms of data science dev experience? Nothing ever felt as good as the R+tidyverse combo to me, at least in Python. | | |
| ▲ | rjdj377dhabsn 4 days ago | parent [-] | | Julia is pretty good at basic data science. Working with dataframes is comparable to R's data.table, with the benefit that I don't need to switch languages if I want to run a fast loop over some data as part of a calculation or use a custom data structure. I'm not a fan of pandas, so I'd say Julia and R beat Python at basic dataframe manipulation. Nothing beats kdb+/q at dataframes though imo. | | |
| ▲ | mvieira38 4 days ago | parent [-] | | Have you tried Polars in Python? When you get going it's pretty similar to tidyverse, except you're chaining methods instead of piping, and it's lazily evaluated + parallel because of the underlying Rust engine. IME it's tidyverse > polars > pandas > data.table in terms of ergonomics |
|
| |
| ▲ | mhogers 4 days ago | parent | prev [-] | | I agree somewhat with you - nonetheless a FastAPI + Alembic + SQLAlchemy alternative in R would make it possible to use it as a general purpose language |
|
|
|
| ▲ | shiandow 4 days ago | parent | prev | next [-] |
| In statistical physics they still use C a lot, as far as I know. |
| |
| ▲ | northlondoner 4 days ago | parent [-] | | Good observation. Indeed, the core of IsingLenzMC is written in C. R provides great C interfacing facilities. |
|
|
| ▲ | mamami 4 days ago | parent | prev | next [-] |
| It was never close. Its syntax is unintuitive and painful to learn as a science undergrad. If it hadn't been Python it would have been another language. |
| |
| ▲ | physicsguy 4 days ago | parent | next [-] | | Python's rapid adoption really came out of NumPy, SciPy, and Matplotlib copying the interfaces from MATLAB, which was very widely used beforehand but obviously came with a license cost. | |
| ▲ | UpsideDownRide 4 days ago | parent | prev [-] | | This is obviously a personal thing but tidyverse syntax is great and lends itself very well to clear and concise data operations. | | |
| ▲ | kjkjadksj 4 days ago | parent [-] | | I found base R even easier than tidyverse. Geom?? Puke. Just call the plot function you want from the standard library. Everything is just function(arg1=x, arg2=y). Easy. |
|
|
|
| ▲ | larrydag 4 days ago | parent | prev | next [-] |
| Very close. In fact, you could say that it is still competing with Python for users. There is still an active community of developers. |
|
| ▲ | 3abiton 4 days ago | parent | prev [-] |
| R is really not for production deployment. It lacks a lot of what made Python popular, and its target users were radically different. |
| |
| ▲ | shoo 4 days ago | parent | next [-] | | R was developed for and by statisticians, for better and worse. I used R a little bit 15-20 years ago. What I remember is that quite a few libraries and function interfaces seemed to be designed to be convenient for interactive use, but if you tried to use them in an automated script, e.g. some analysis you wanted to scale up and repeat 10,000 times while bootstrap sampling or hyperparameter sweeping or what have you, those same library and interface design choices involved bizarre edge cases where functions would sometimes do something completely different (perhaps changing the return type) when invoked with slightly different arguments. All these automation-hostile edge cases were annoying to discover and then work around. None of this was forced by R the language; it was purely a library design thing by the folks writing the libraries. In contrast, you simply wouldn't and didn't get such library design in mainstream general purpose programming languages (e.g. in C++ or Java some of this stuff wouldn't even type check). Similarly in Python, even though Python being dynamic was fertile ground for people to develop completely bonkers and unautomatable numeric and scientific libraries, the customs for how libraries should work were different. This is maybe just a reflection that R and R's libraries were being designed for interactive use by humans doing exploratory data analysis, model fitting etc, unlike other programming languages which are used to automate things or build software products that can be shipped. | | |
| ▲ | kjkjadksj 4 days ago | parent [-] | | I think what people miss about R is that if you go on with an object-oriented, for-loop way of writing code like a lot of Python devs tend to do, you are going to have a bad time. If you write functional code and make use of the various apply functions instead of loops, it’s going to be very performant. A lot of it is wrapping C. |
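The automation-hostile edge cases described in shoo's comment above, where a function returns a different type depending on its arguments, can be sketched in Python. Both helpers below are hypothetical and written only for illustration; they mimic the general pattern (R's `sapply`, for example, simplifies its result to a vector or matrix when it can and returns a list otherwise) rather than reproducing any particular library.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def simplify_apply(func, items):
    """Hypothetical interactive-friendly helper: unwraps a single result.

    Handy at a prompt, but the return type depends on the input length.
    """
    results = [func(item) for item in items]
    return results[0] if len(results) == 1 else results


def stable_apply(func: Callable[[T], U], items: Iterable[T]) -> List[U]:
    """Hypothetical automation-friendly helper: always returns a list."""
    return [func(item) for item in items]


def square(n: int) -> int:
    return n * n


print(simplify_apply(square, [2, 3]))  # a list of two results
print(simplify_apply(square, [2]))     # a bare int, not a list
print(stable_apply(square, [2]))       # still a list
```

A script that loops over batches of varying size has to special-case the first helper whenever a batch happens to contain one item, which is the kind of workaround the comment describes.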
| |
| ▲ | UpsideDownRide 4 days ago | parent | prev | next [-] | | It's general purpose, and there is really no issue with running it in production outside of the mindset and the lispy nature of it. Source - was working on R in production for the financial sector. | |
| ▲ | mscbuck 4 days ago | parent | prev | next [-] | | This is really a non-issue now. R's problem back in the day was that it was really specialized in analysis and interactivity, but a lot of the general purpose stuff that made Python popular is now easily achievable in R and well-developed and maintained. RestRserve and Plumber are both excellent tools for REST APIs. | |
| ▲ | dkga 4 days ago | parent | prev [-] | | Completely disagree. I work at a central bank, helping people make some of the most important economic decisions in my country and plenty of analyses are done purely with R. | | |
| ▲ | esafak 4 days ago | parent | next [-] | | Were they run in production as nightly jobs or something? | |
| ▲ | melenaboija 4 days ago | parent | prev [-] | | It is used in finance and banking to build statistical models for research, not for deployments in production in the technical sense, I hope. |
|
|