adolph 4 hours ago

My go-to for fast and easy parallelization is xargs -P.

  find a-bunch-of-files | xargs -P 10 do-something-with-a-file

       -P max-procs
       --max-procs=max-procs
              Run up to max-procs processes at a time; the default is 1.
              If max-procs is 0, xargs will run as many processes as
              possible at a time.
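
A minimal sketch of the same idea, assuming an xargs that supports -P (GNU and BSD both do; it is not in POSIX). The `sh -c` worker and the numeric inputs are hypothetical stand-ins for do-something-with-a-file. Note the -n 1: without -n (or -I), xargs may pack every input into a single command line, which leaves -P nothing to parallelize.

```shell
# Four inputs, one argument per invocation (-n 1), at most two processes at once (-P 2).
# Completion order is not deterministic under -P, so sort before comparing or printing.
results=$(printf '%s\n' 1 2 3 4 | xargs -n 1 -P 2 sh -c 'echo "done $1"' _ | sort)
printf '%s\n' "$results"
```

Each worker gets its input as `$1`; the `_` fills `$0` for `sh -c`.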
akdev1l 3 hours ago | parent [-]

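Note that one should use find -print0 together with xargs -0 for safety; otherwise filenames containing spaces or newlines get split into separate arguments.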

adolph 2 hours ago | parent [-]

Thanks! I've been using the -I{} do-something-to-file "{}" approach, which is also handy when the input is one param among others. -0 is much faster.

Edit: Looks like when doing file-by-file, -I{} is still needed:

  # find tmp -type f | xargs -0 ls
  ls: cannot access 'tmp/b file.md'$'\n''tmp/a file.md'$'\n''tmp/c file.md'$'\n': No such file or directory
elteto 2 hours ago | parent | next [-]

You have to do `find ... -print0` so find also uses \0 as the separator.
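
For illustration, here is roughly what the corrected pipeline could look like, assuming find and xargs both support the NUL-delimiter flags (GNU and BSD do). The temp directory and the three filenames are made up to mirror the failing example above.

```shell
# Recreate the three files with spaces in their names in a scratch directory.
tmpdir=$(mktemp -d)
touch "$tmpdir/a file.md" "$tmpdir/b file.md" "$tmpdir/c file.md"

# find emits NUL-terminated names (-print0) and xargs splits on NUL (-0),
# so each path reaches ls as exactly one argument.
listing=$(find "$tmpdir" -type f -print0 | xargs -0 ls | sort)
printf '%s\n' "$listing"

# Count the lines ls produced; tr strips the padding some wc versions add.
count=$(printf '%s\n' "$listing" | wc -l | tr -d ' ')
rm -rf "$tmpdir"
```

With both flags in place, ls sees three separate paths instead of one newline-joined string.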

akdev1l an hour ago | parent | prev [-]

find -print0 will print the files with null bytes as separators

xargs -0 will use a null byte as separator for each argument

printf 'a\0b\0c\0' | xargs -0 -tI{} echo "file -> {}"