NoboruWataya 8 hours ago

I don't hear nearly as much about Julia as I used to. A few years ago the view was that it was about to replace Python as the language of choice for data science. Seems like that didn't happen?

simondanisch 8 hours ago | parent | next [-]

I think the hype has slowed down, but the growth statistics haven't. Personally, I think Julia is the only language where I can implement something like Makie without running into a maintenance nightmare, and with Julia, GPU programming is actually fun, high level, and composes well, which I miss in most other languages. So I don't really care whether it replaces Python or not. I do think that to replace Python, Julia will need to solve compilation latency, ship AOT binaries, and maybe interpret more of the glue code, which currently introduces quite a lot of compilation overhead without much gain in performance.

electroly 7 hours ago | parent | prev | next [-]

I don't know about everyone else, but slow Julia compilation continues to cause me ongoing suffering to this day. I don't think they're ever going to "fix" this. On a standard GitHub Actions Windows worker, installing the public Julia packages I use, precompiling, and compiling the sysimage takes over an hour. That's not an exaggeration. I had to juice the worker up to a custom 4x sized worker to get the wall clock time to something reasonable.

It took me days to get that build to work; doing this compilation once in CI so you don't have to do it on every machine is trickier than it sounds in Julia. The "obvious" way (install packages in Docker, run container on target machine) does not work because Julia wants to see exactly the same machine that it was precompiled on. It ends up precompiling again every time you run the container on other machines. I nearly shed a tear the first time I got Julia not to precompile everything again on a new machine.

R and Python are done in five minutes on the standard worker and it was easy; it's just the amount of time it takes to download and extract the prebuilt binaries. Do that inside a Docker container and it's portable as expected. I maintain Linux and Windows environments for the three languages and Julia causes me the most headaches, by far. I absolutely do not care about the tiny improvement in performance from compiling for my particular microarch; I would opt into prebuilt x86_64 generic binaries if Julia had them. I'm very happy to take R's and Python's prebuilt binaries.

vchuravy 3 hours ago | parent | next [-]

I am very interested in improving the user-experience around precompilation and performance, may I ask why you are creating a sysimage from scratch?

> I would opt into prebuilt x86_64 generic binaries if Julia had them

The environment variable JULIA_CPU_TARGET [1] is what you are looking for; it controls which micro-architectures Julia emits code for and supports multi-versioning.

As an example Julia is built with [2]: generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)
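For anyone unfamiliar with the format: the string is a semicolon-separated list of targets, each a CPU name optionally followed by comma-separated feature toggles and flags. Here is a minimal Python sketch of that structure. The helper name `parse_cpu_target` is made up for illustration; this is not how Julia itself parses the variable, and it does not interpret what the individual flags mean.

```python
def parse_cpu_target(spec):
    """Split a JULIA_CPU_TARGET-style string into (cpu_name, options) pairs.

    Hypothetical helper for illustration only: it shows the
    "target;target;..." / "name,option,option" layout and nothing more.
    """
    targets = []
    for entry in spec.split(";"):
        # The first comma-separated field is the CPU name; the rest are options.
        name, *options = [part.strip() for part in entry.split(",")]
        targets.append((name, options))
    return targets


# The example string quoted above.
spec = "generic;sandybridge,-xsaveopt,clone_all;haswell,-rdrnd,base(1)"
for name, options in parse_cpu_target(spec):
    print(name, options)
```

Read this way, the example describes three targets: a generic baseline plus two more specific ones, each with its own options.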

[1] https://docs.julialang.org/en/v1/manual/environment-variable...

[2] https://github.com/JuliaCI/julia-buildkite/blob/9c9f7d324c94...

electroly 3 hours ago | parent [-]

I have a monorepo full of Julia analysis scripts written by different people. I want to run them in a Docker container on ephemeral Linux EC2 instances and on user Windows workstations. I don't want to sit through precompilation of all dependencies whenever a new machine runs a particular version of the Julia project for the first time because it takes a truly remarkable amount of time. For the ephemeral Linux instances running Julia in Docker, that happens on every run. Precompiling at Docker build time doesn't help you; it precompiles everything again when you run that container on a different host computer. R and Python don't work like this; if you install everything during the Docker image build, they will not suddenly trigger a lengthy recompilation when run on a different host machine.

I am intimately familiar with JULIA_CPU_TARGET; it's part of configuring PackageCompiler and I had to spend a fair amount of time figuring it out. Mine is [0]. It's not related to what I was discussing there. I am looking for Julia to operate a package manager service like R's CRAN/Posit PPM or Python's PyPI/Conda that distributes compiled binaries for supported platforms. JuliaHub only distributes source code.

[0] generic;skylake-avx512,clone_all;cascadelake,clone_all;icelake-server,clone_all;sapphirerapids,clone_all;znver4,clone_all;znver2,clone_all

JanisErdmanis 4 hours ago | parent | prev [-]

> It took me days to get that build to work; doing this compilation once in CI so you don't have to do it on every machine is trickier than it sounds in Julia

You may be interested in looking into AppBundler. Apart from full application packaging, it also offers the ability to make Julia image bundles. In addition to a sysimage compilation option, it lets you bundle an application via compiled pkgimages, which requires less RAM and is much faster to compile.

badlibrarian 6 hours ago | parent | prev | next [-]

Versus Python, it seems to fork into the "thinkers" vs "doers" camp. Julia provides a level of abstraction that some people find comforting. I thought I could use it as a sort of open source Matlab for a lot of thinky, 1-based index code I had lying around. It didn't meet my needs. And "spend half an hour waiting for a Jupyter notebook to boot up" is real. Great for some but it's not compatible with the way I work.

Elsewhere someone used the term "janky" and perhaps it's the fact that there are so many incredibly smart people around it that makes it so janky. By way of example, somebody needed to check disk space and the architect told him to shell out to Python.

Remember when LLVM first came out and it got kudos for the quality of its error messages? Well if you miss the old-school 1980s GCC experience the nonsense that eventually comes out of the Julia compiler after an hour will relight that flame.

Want to use greek letters and other symbols that don't appear on your keyboard as variable names? You've found your people.

bobajeff 7 hours ago | parent | prev | next [-]

As someone who currently dabbles in both, that prediction seems a bit unrealistic. Julia is a fantastic language, but it has some trade-offs that need to be considered. Probably the most well known is `time to first x`. Julia, like Python, is used comfortably in notebooks, but loading libraries can take a minute, compared to Python where it happens right away. That may lead you to not reach for it when you want to quickly test something, especially plotting. You can mitigate this somewhat by loading all the libraries you'll ever need at startup (preferably long before you are ready to experiment), but that assumes you already know which libraries you'll need for what you want to try.

simondanisch 7 hours ago | parent [-]

What prediction? Maybe I need to rephrase what I said: my prediction is that if Julia ever wants to have a shot at replacing Python, it absolutely has to solve the time to first x problem! That's what I mean by shipping fully ahead-of-time compiled binaries and interpreting more glue code, both of which have the potential to solve the time to first x problem.

bobajeff 7 hours ago | parent [-]

The prediction I was referring to was the one in the parent comment. (The one I was commenting under)

simondanisch 7 hours ago | parent [-]

Ah sorry :D

Rijanhastwoears 3 hours ago | parent | prev | next [-]

Julia is great ... if you are willing to work with the Goldilocks zone it provides.

I think what happened is this: Julia got advertised as "Python syntax, C speed" but in practice it turns out to really be "Python syntax, 50% of C speed if you were willing to avoid some semi-well-documented gotchas, where avoiding said gotchas will take some non-trivial effort". Again, great if you are willing to work with it.

I am not saying that the Julia people are responsible for the "Python syntax, C speed" perception, just that this is what the prevalent perception became.

I have talked to people in computational biology who tried Julia, and they said something or the other similar to "It just wasn't performant enough for me to give up Python," and if you really dig in, what really happened was when new people tried Julia with old mental models, they walked away thinking, "Heh, more MIT hypeware."

simondanisch 2 hours ago | parent | next [-]

Well, I've been reaching 100% of C speed most of the time, and it feels like an easy effort... I guess it depends a bit on the problem and on how used you are to writing optimized, clean Julia code.

leephillips 3 hours ago | parent | prev [-]

Polyglot Jet Finding:

https://arxiv.org/abs/2309.17309

This paper in experimental high-energy physics is a good example of why Julia is popular for scientific calculations.

It shows that #julialang is over 100 times faster than Python and even faster than C++.

Rijanhastwoears 2 hours ago | parent [-]

So, my original comment really boils down to the idea that "public perception has nothing to do with objective stats". To which your response is ... citing a paper at me.

To reiterate, citing studies that show that smoking causes cancer in chain smokers does ... nothing. You are citing studies, but I am not the chain smoker; I am just the guy talking about chain smokers.

One more time, I wish we lived in a world where public perception was swayed by objective studies, but we don't.

Julia is fast, yes, but when a university sys-admin rolls their eyes at hearing its name, you have lost the battle for well and good.

ssivark 2 hours ago | parent | prev | next [-]

Ugh, this almost feels like flame-bait. This question invariably leads to a lot of bike-shedding around comments from people who feel strongly about some choices in the Julia language (1-based indexing and what not), and the fact that Julia is still not as polished as some other languages in certain aspects of developer experience.

"Data science" is an extremely broad term, so YMMV. That said, since you asked, Julia has absolutely replaced Python for me. I don't have anything new to add on the benefits of Julia; it's all been said before elsewhere. It's just a question of exactly what kind of stuff you want to do. Most of my recent work is math/algorithms flavored, and Python would be annoyingly verbose/inexpressive while also being substantially slower. Julia also tends to have many more high-quality packages of this kind that I can quickly use / build on.

IshKebab 7 hours ago | parent | prev [-]

IMO it just had too many rough edges. Very slow compilation, correctness issues (https://yuri.is/not-julia/), kinda janky tooling (not nearly as bad as pip tbf). Even basic language mistakes like implicit variable declaration and 1-based indexing (in 2012??).

Yes 1-based indexing is a mistake. It leads to significantly less elegant code - especially for generic code - and is no harder to understand than 1-based indexing for people capable of programming. Fight me.

TimorousBestie 16 minutes ago | parent | next [-]

Analogous to “time to first plot”, Julia metacommentary now has time to first “Why I no longer. . .” repost.

bouchard 6 hours ago | parent | prev | next [-]

> Yes 1-based indexing is a mistake. It leads to significantly less elegant code - especially for generic code - and is no harder to understand than 1-based indexing for people capable of programming.

Some would argue that 0-based indexing is significantly less elegant for numerical/scientific code, but that depends on whether they come from a MATLAB/Fortran or Python/C(++) background.

A decision was made to target the MATLAB/Fortran (and unhappy? Python/C++) crowd first, thus the choice of 1-based indexing and column-major order, but at the end of the day it's a matter of personal preference.

0-based indexing would have made it easier to reach a larger audience, however.

> and is no harder to understand than 1-based indexing for people capable of programming.

The same could be said the other way around ;-)

leephillips 5 hours ago | parent [-]

Aside from the fact that 1-based indexing is better for scientific code (see Fortran), I don’t think that it matters very often. I don’t think that any Julia program I’ve ever written would need to change if Julia adopted 0-based tomorrow. You don’t typically write C-style loops in Julia; you use array functions and operators, and if you need to iterate you write `for i in array ...`. If you really need the first or last element you write `a[begin]` or `a[end]`.

IshKebab 5 hours ago | parent [-]

> the fact that 1-based indexing is better for scientific code (see Fortran)

It really isn't. "Scientific code" isn't some separate thing.

The only way it can help is if you're trying to write code that matches equations in a paper that uses 1-based indexing. But that very minor advantage doesn't come close to outweighing the disadvantages. Lean doesn't make this silly mistake.

> If you really need the first or last element

What if you need the Nth block of M elements? The number of times I've written arr[(n-1)m+1:nm] in MATLAB... I do not know how anyone can prefer that nonsense to e.g. nm..<(n+1)m
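To make the arithmetic concrete, here is a small Python sketch of the two formulas being compared. The function names are made up for illustration, and Python's own 0-based slices are only used to check that both conventions pick out the same elements.

```python
def block_bounds_one_based(n, m):
    """Inclusive 1-based bounds of the n-th block of m elements (n = 1, 2, ...).

    Mirrors the MATLAB-style expression (n-1)m+1 : nm.
    """
    return (n - 1) * m + 1, n * m


def block_bounds_zero_based(n, m):
    """Half-open 0-based bounds of the n-th block of m elements (n = 0, 1, ...).

    Mirrors the expression nm ..< (n+1)m.
    """
    return n * m, (n + 1) * m


data = list(range(10, 22))  # 12 elements: 10, 11, ..., 21
m = 4

# Second block under the 1-based convention (n = 2).
first, last = block_bounds_one_based(2, m)
second_block_a = data[first - 1:last]  # shift to Python's 0-based slicing

# The same block under the 0-based convention (n = 1).
start, stop = block_bounds_zero_based(1, m)
second_block_b = data[start:stop]

assert second_block_a == second_block_b
print(second_block_a)
```

Both conventions select the same four elements; the difference is only in how many `+1`/`-1` corrections show up in the index expressions.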

Certhas 2 hours ago | parent [-]

What if I want the nth element up to the mth element? arr[n:m]. And if I want to split the array into two parts, one up to the mth element and the other from the (m+1)st element? arr[1:m] and arr[(m+1):end]. Julia matches how people speak about arrays, including C programmers in their comments. Arrays are (conceptually) not pointer arithmetic. Also, for your use case you would typically just use a 2D array and write a[n,:].

IshKebab an hour ago | parent [-]

> arr[n:m]

arr[n..=m]

> arr[1:m] and arr[(m+1):end]

arr[0..m], arr[m..]

Much nicer.
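For readers who want to see the two styles side by side, here is a rough Python sketch. The helper names are hypothetical; Python slices are 0-based and half-open, so the 1-based inclusive form needs an explicit shift.

```python
def inclusive_one_based(seq, n, m):
    """Elements n through m, counting from 1, both ends included.

    Corresponds to the 1-based inclusive form arr[n:m] quoted above.
    """
    return seq[n - 1:m]


def split_at(seq, m):
    """First m elements and everything after them.

    Corresponds to arr[1:m] / arr[(m+1):end] in 1-based inclusive notation,
    or arr[0..m] / arr[m..] in 0-based half-open notation.
    """
    return seq[:m], seq[m:]


letters = list("abcdef")
middle = inclusive_one_based(letters, 2, 4)  # second through fourth letter
head, tail = split_at(letters, 2)
print(middle, head, tail)
```

Note that with half-open ranges the split point `m` appears unchanged in both halves, while the inclusive form needs `m` and `m+1`.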

> Arrays are (conceptually) not pointer arithmetic.

Look at a ruler. Does it start at 1?

simondanisch 7 hours ago | parent | prev [-]

lol. There's not much to fight about, since how you want to write code is a very personal matter. It's evident that all the capable programmers in the Julia community have found satisfactory ways to get around it, so if you haven't yet, I don't see how that's a Julia problem ;) I can only say I haven't had a single problem with one-based indexing in 12 years of developing Julia code. I also haven't run into many correctness issues compared to other languages I've been using. I think Yuri has also been using lots of packages that weren't very mature. How on earth can you compare a 10-year-old library with lots of maintainers to packages created in one year by one person? That's at least what Yuri's critique boils down to for me.

Certhas 2 hours ago | parent [-]

I disagree. Julia has correctness issues because it chose maximum composability over specifying interfaces explicitly. And those issues are not just in immature packages but also in complex ones. Compared to other languages, Julia has no facilities to help structure large, complex code bases. And this also leads to bad error messages and bad documentation.

Recently we got the public keyword, but even the PR there says:

"NOTE: This PR is not a complete solution to the "public interfaces are typically not well specified in Julia" problem. We would need to implement much than this to get to that point. Work on that problem is ongoing in Base and packages and contributions are welcome."