nick238 3 hours ago

Is launching an ffmpeg process so heavyweight that there's a reason to avoid it? If anything, it feels like it would trivialize parallelism, which is probably a feature, not a bug, if you have a bunch of videos to go through.

zahlman 2 hours ago | parent | next [-]

TFA claims:

> This is shorter than the ffprobe code, and faster too – testing locally, this is about 3.5× faster than spawning an ffprobe process per file.

And the calls to the MediaInfo wrapper are not really harder to parallelize. `subprocess.check_output` is synchronous, so that code would have to be adapted to spawn in a loop and then collect the results in a queue or something. With the wrapper you basically end up doing the same thing, but with `multiprocessing` instead. And you can then just reuse a few worker processes for the entire job.
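For what it's worth, here is a minimal sketch of one way the subprocess version could be adapted, using a thread pool rather than a hand-rolled spawn loop and queue. `probe_many`, `build_command` and `stand_in_command` are hypothetical names, not anything from TFA, and the stand-in command just launches a child Python instead of a real ffprobe so the example runs anywhere.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def probe_many(paths, build_command, max_workers=4):
    """Run one child process per path, at most max_workers at a time.

    build_command is a hypothetical callback that turns a path into an
    argv list (for example, an ffprobe command line).
    """
    def probe(path):
        # check_output blocks only the calling thread, so the other
        # threads keep their own child processes running meanwhile.
        return subprocess.check_output(build_command(path), text=True).strip()

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map() yields results in the same order as paths.
        return dict(zip(paths, pool.map(probe, paths)))


def stand_in_command(path):
    # Stand-in for a real probe command: a child Python process that
    # prints the path in upper case.
    return [sys.executable, "-c", "import sys; print(sys.argv[1].upper())", path]
```

Calling `probe_many(["a.mp4", "b.mkv"], stand_in_command)` returns a dict mapping each path to its child's output. This still pays the process-spawn cost once per file, which is the overhead the in-process wrapper with a few reused `multiprocessing` workers avoids.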

01HNNWZ0MV43FF 2 hours ago | parent | prev [-]

Python must have libav bindings somewhere (PyAV is one), so you could certainly run that check in-process.

Off the top of my head, it's probably in the container metadata, so you'd just need libavformat and not even libavcodec. Pass it a path, open it, scan the list of streams and check the codec magic number?