yjftsjthsd-h 6 hours ago

> No bytecode compilation by default. pip compiles .py files to .pyc during installation. uv skips this step, shaving time off every install. You can opt in if you want it.

Are we losing out on performance of the actual installed thing, then? (I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?)

woodruffw 6 hours ago | parent | next [-]

No, because Python itself will generate bytecode for packages once you actually import them. uv just defers that to first-import time, but the cost is amortized in any setting where imports are performed over multiple executions.
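To see the deferral described above, here is a minimal sketch (not uv's or pip's code) that imports a throwaway module and checks whether CPython left a `.pyc` behind. The module name `demo_mod_hypothetical` and the helper name are made up for illustration, and the sketch assumes the directory is writable.

```python
import importlib
import importlib.util
import pathlib
import sys
import tempfile


def pyc_written_on_first_import() -> tuple[bool, bool]:
    """Return (cached_before_import, cached_after_import) for a throwaway module."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical module name, used only for this demonstration.
        src = pathlib.Path(tmp) / "demo_mod_hypothetical.py"
        src.write_text("VALUE = 42\n")

        # Where the import system would cache the bytecode for this source file.
        pyc = pathlib.Path(importlib.util.cache_from_source(str(src)))
        before = pyc.exists()

        sys.path.insert(0, tmp)
        old_flag = sys.dont_write_bytecode
        sys.dont_write_bytecode = False  # make sure caching is not disabled
        try:
            importlib.invalidate_caches()
            importlib.import_module("demo_mod_hypothetical")
            after = pyc.exists()
        finally:
            # Restore interpreter state so the demo has no side effects.
            sys.dont_write_bytecode = old_flag
            sys.path.remove(tmp)
            sys.modules.pop("demo_mod_hypothetical", None)
        return before, after
```

The second value only comes back true when the interpreter is allowed to write next to the source, which is the point salviati raises further down about system-wide installs.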

yjftsjthsd-h 6 hours ago | parent [-]

That sounds like yes? Instead of doing it once at install time, it's done once at first use. It's only once so it's not persistently slower, but that is a perf hit.

My first cynical instinct is to say that this is uv making itself look better by deferring the costs to the application, but it's probably a good trade-off if any significant percentage of the files being compiled might not be used ever so the overall cost is lower if you defer to run time.

VorpalWay 3 hours ago | parent | next [-]

I think they are making the bet that most modules won't be imported. For example if I install scipy, numpy, Pillow or such: what are the chances that I use a subset of the modules vs literally all of them?

I would bet on a subset for pretty much any non-trivial package (i.e. larger than one or two user facing modules). And for those trivial packages? Well they are usually small, so the cost is small as well. I'm sure there are exceptions: maybe a single gargantuan module that consists of autogenerated FFI bindings for some C library or such, but that is likely the minority.

woodruffw 5 hours ago | parent | prev | next [-]

> It's only once so it's not persistently slower, but that is a perf hit.

Sure, but you pay that hit either way. Real-world performance is always usage based: the assumption that uv makes is that people run (i.e. import) packages more often than they install them, so amortizing at the point of the import machinery is better for the mean user.

(This assumption is not universal, naturally!)

dddgghhbbfblk 5 hours ago | parent [-]

Ummm, your comment is backwards, right?

woodruffw 5 hours ago | parent [-]

Which part? The assumption is that when you `$TOOL install $PACKAGE`, you run (i.e. import) `$PACKAGE` more than you re-install it. So there's no point in slowing down (relatively less common) installation events when you can pay the cost once on import.

(The key part being that 'less common' doesn't mean a non-trivial amount of time.)

dddgghhbbfblk 3 hours ago | parent [-]

Why would you want to slow down the more common thing instead of the less common thing? I'm not following that at all. That's why I asked if that's backwards.

beacon294 5 hours ago | parent | prev | next [-]

Probably for any case where an actual human is doing it. On an image you obviously want to do it at bake time, so I feel default off with a flag would have been a better design decision for pip.

I just read the thread and use Python; I can't comment on the % speedup attributed to uv that comes from this optimization.

Epa095 5 hours ago | parent [-]

Images are a good example where doing it at install-time is probably the best yeah, since every run of the image starts 'fresh', losing the compilation which happened last time the image got started.

If it was an optional toggle it would probably become best practice to activate compilation in Dockerfiles.

tedivm 4 hours ago | parent | prev | next [-]

You can change it to compile the bytecode on install with a simple environment variable (which you should do when building docker containers if you want to sacrifice some disk space to decrease initial startup time for your app).

saidnooneever 5 hours ago | parent | prev [-]

You are right. Whether it's bad or not depends on how often this first start happens. For most use cases I'd guess (total guess, I have limited experience with Python projects professionally) it's not an issue.

hauntsaninja 4 hours ago | parent | prev | next [-]

Yes, uv skipping this step is a one-time, significant hit to startup time. E.g. if you're writing a Dockerfile I'd recommend setting `--compile-bytecode` / `UV_COMPILE_BYTECODE`

zahlman 3 hours ago | parent | prev | next [-]

> I'm not 100% clear on .pyc files TBH; I'm guessing they speed up start time?

They do.

> Are we losing out on performance of the actual installed thing, then?

When you consciously precompile Python source files, you can parallelize that process. When you `import` from a `.py` file, you only get that benefit if you somehow coincidentally were already set up for `multiprocessing` and happened to have your workers trying to `import` different files at the same time.
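As a rough illustration of the "conscious precompile" case, the standard library's `compileall` can walk a directory and, with `workers` set, spread the files across processes. This is only a sketch; the helper name `precompile_tree` is made up, and it says nothing about how pip or uv do it internally.

```python
import compileall
import importlib.util
import pathlib


def precompile_tree(root: str, workers: int = 1) -> list[pathlib.Path]:
    """Compile every .py file under root and return the expected .pyc paths.

    workers=1 compiles serially; workers=0 lets compileall pick a process
    count based on the number of CPUs.
    """
    ok = compileall.compile_dir(root, quiet=1, workers=workers)
    if not ok:
        raise RuntimeError(f"some files under {root} failed to compile")
    # Map each source file to the cache path the import system will look for.
    return [
        pathlib.Path(importlib.util.cache_from_source(str(path)))
        for path in sorted(pathlib.Path(root).rglob("*.py"))
    ]
```

When passing `workers=0` or more than one worker from a script, it is safest to call this under an `if __name__ == "__main__":` guard, since the worker processes may re-import the main module.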

thundergolfer 4 hours ago | parent | prev | next [-]

This optimization hits serverless Python the worst. At Modal we ensure users of uv are setting UV_COMPILE_BYTECODE to avoid the cold start penalty. For large projects .pyc compilation can take hundreds of milliseconds.
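If you want a ballpark number for your own project, something like the following sketch times in-memory compilation of every `.py` file under a directory. The helper name is made up, the figure includes file reads, and it leaves out the cost of writing the `.pyc` files, so treat it as an approximation of the cold-start penalty rather than a measurement of it.

```python
import pathlib
import time


def time_bytecode_compilation(root: str) -> tuple[int, float]:
    """Compile every .py file under root in memory; return (file_count, seconds)."""
    count = 0
    start = time.perf_counter()
    for path in sorted(pathlib.Path(root).rglob("*.py")):
        source = path.read_bytes()
        try:
            # Same source-to-code-object step the import system performs
            # when no valid .pyc is available.
            compile(source, str(path), "exec")
        except (SyntaxError, ValueError):
            # Skip files that are not valid for this interpreter version.
            continue
        count += 1
    return count, time.perf_counter() - start
```

Pointing it at a virtualenv's `site-packages` directory gives an upper bound, since a real run only compiles the modules it actually imports.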

salviati 5 hours ago | parent | prev | next [-]

Historically the practice of producing .pyc files on install started with system-wide installed packages, I believe, when the user running the program might lack privileges to write them. If the installer can write the .py files it can also write the .pyc, while the user running them might not be able to in that location.

plorkyeran 5 hours ago | parent | prev [-]

If you have a dependency graph large enough for this to be relevant, it almost certainly includes a large number of files which are never actually imported. At worst the hit to startup time will be equal to the install time saved, and in most cases it'll be a lot smaller.