| ▲ | whyenot 15 hours ago |
| What makes Python a great language for data science, is that so many people are familiar with it, and that it is an easy language to read. If you use a more obscure language like Clojure, Common Lisp, Julia, etc., many people will not be familiar with the language and unable to read or review your code. Peer review is fundamental to the scientific endeavor. If you only optimize on what is the best language for the task, there are clearly better languages than Python. If you optimize on what is best for science then I think it is hard not to argue that Python (and R) are the best choices. In science, just getting things done is not enough. Other people need to be able to read and understand what you are doing. BTW AI is not helping and in fact is leading to a generation of scientists who know how to write prompts, but do not understand the code those prompts generate or have the ability to peer review it. |
|
| ▲ | iLemming 13 hours ago | parent | next [-] |
| I can't speak for Julia - never used it; never used Common Lisp for analyzing data (I don't think it's very "data-oriented" for the modern age and the shape of data), but Clojure is really not "obscure" - it only looks weird for the first fifteen minutes or so; once you start using it - it is one of the most straightforward and reasonable languages out there - it is in fact simpler than Python and Javascript. Immutable-by-default makes it far much easier to reason about the code. And OMG, it is so much more data-oriented - it's crazy that more people don't use it. Most never even heard about it. |
| |
| ▲ | 7thaccount 9 hours ago | parent | next [-] | | I tried to get into Clojure, but a lot of the JVM hosted languages require some Java experience. Same thing with Scala and Kotlin or F# on .NET. The early tooling was also pretty dependent on Vim or Emacs. Maybe it's all easier now with VSCode or something like that. | | |
| ▲ | geokon 4 hours ago | parent | next [-] | | It doesn't require any Java but the docs do at times sort of assume you understand the JVM to some extent - which was a bit frustrating when first learning the language. It'll use terms like "classpath" without explaining what that is. However nowadays with LLMs these are insignificant speedbumps. If you want to use Java you also don't really need to know Java beyond "you create instances of classes and call methods on them". I really don't want to learn a dinosaur like Java, but having access to the universe of Java libs has saved me many times. It's super fun and nice to use and poke around mature Java libs interactively with a REPL :) All that said I'd have no idea how to write even a helloworld in Java PS: Agreed on Emacs. I love Emacs.. but it's for turbo nerds. Having to learn Emacs and Clojure in parallel was a crazy barrier. (and no, Emacs is not as easy people make it out to be) | |
| ▲ | iLemming 8 hours ago | parent | prev [-] | | None of this even remotely true. I've gotten into Clojure without knowing jackshit about Java, almost ten years later, after tons of things successfully built and deployed, still don't know jackshit about Java. Mia, co-host of 'Clojure apropos' podcast was my colleague, we've worked together on multiple teams, she learned Clojure as her very first PL. Later she tried learning some Java and she was shocked how impossibly weird it looked compared to Clojure. Besides, you can use Clojure without any JVM - e.g., with nbb. I use it for things like browser automation with Playwright. The tooling story is also very solid - I use Emacs, but many of my friends and colleagues use IntelliJ, Vim, Sublime and VSCode, and some of them migrated to it from Atom. | | |
| ▲ | 7thaccount 6 hours ago | parent [-] | | It might not be a problem for you, but it has been for many. I did start by reading through 3 Clojure books. The repl and the basic stuff like using lists is all easy of course, but the tooling was pretty poor compared to what I was used to (I like lisp, but Emacs is a commitment). Also, a lot of tutorials at the time definitely assumed java familiarity, especially with debugging java stack traces. |
|
| |
| ▲ | MarsIronPI 10 hours ago | parent | prev [-] | | Common Lisp fan here, but not a data scientist. Why do you say to avoid CL for data analysis? Not trying to flame or anything, just curious about your experience with it. | | |
| ▲ | iLemming 8 hours ago | parent [-] | | I don't have great experience of using CL for analyzing data, because of "why?", if I already have another Lisp that is simply amazing for data. Clojure, unlike lists in traditional Lisps, based on composable, unified abstraction for its collections, they are lazy by default and literal readable data structures, they are far easier to introspect and not so "opaque" compared to anything - not just CL (even Python), they are superb for dealing with heterogeneous data. Clojure's cohesive data manipulation story is where Common Lisp's lists-and-symbols just can't match. | | |
| ▲ | dreamcompiler 5 hours ago | parent [-] | | Homework assignments notwithstanding, very few serious Common Lisp programs use lists and symbols as their primary data structures. This has been true since around 1985. Common Lisp has O[1] vectors, multidimensional arrays, hash-tables (what Clojure calls maps), structs, and objects. It has set operations too but it doesn't enforce membership uniqueness. It also has bignums, several sizes of floats, infinite-precision rationals, and complex numbers. Not to mention characters, strings, and logical operations on individual bits. The main difference from Clojure is that CL data structures are not immutable. But that's an orthogonal issue to the suggestion that CL doesn't contain a rich library of modern data structures. Common Lisp has never been limited to "List Processing." |
|
|
|
|
| ▲ | flexagoon 2 hours ago | parent | prev | next [-] |
| That's ok, I don't think anyone knows how to properly write Julia. After using it for a while and following the community (watching talks, checking the forum etc), I don't think it has a concept of code quality. You just throw random code at the wall until it starts working. Which makes sense, considering most of the users are scientists. |
|
| ▲ | 2 hours ago | parent | prev | next [-] |
| [deleted] |
|
| ▲ | hekkle 10 hours ago | parent | prev | next [-] |
| > What makes Python a great language for data science, is that so many people are familiar with it While I agree with you in principal this also leads to what I call the "VB Effect". Back in the day VB was taught at every school as part of the standard curriculum. This made every kid a 'computer wizz'. I have had to fix many a legacy codebase that was started by someone's nephew the whizz kid. |
|
| ▲ | aethor 8 hours ago | parent | prev [-] |
| Peer review is fundamental to scientific endeavor but... in ML fields, reviewers almost never check the code and Python package management is hardly reproducible. So clearly we are not there, Python or not. |