jchw 2 days ago

I knew that bash behaves quite differently in POSIX mode, but that isn't much of a problem in most cases, since nobody is forcing you to use a POSIX-compatible Bourne shell as your login shell or for scripting; it's just the shell that you can guarantee will exist if something is POSIX compliant, right? If I were addressing bash, I would've said `set -o posix` instead of POSIXLY_CORRECT. (I didn't even realize POSIXLY_CORRECT did anything to bash.)

The GNU bash documentation covers the differences pretty well:

https://www.gnu.org/software/bash/manual/html_node/Bash-POSI...
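
For what it's worth, here is a minimal sketch of how one might check this, assuming a `bash` binary is on the PATH. The helper name `bash_posix_mode` is made up for illustration; `shopt -qo posix` is the bash builtin that tests the same option `set -o posix` toggles.

```shell
# Hypothetical helper: start a fresh bash and report whether POSIX mode is on.
# Any arguments are passed to bash ahead of the -c script (e.g. -o posix).
bash_posix_mode() {
  bash "$@" -c 'if shopt -qo posix; then echo on; else echo off; fi'
}

# Plain bash, with the variable removed from the environment.
( unset POSIXLY_CORRECT; bash_posix_mode )            # prints: off

# POSIX mode requested explicitly on the command line.
( unset POSIXLY_CORRECT; bash_posix_mode -o posix )   # prints: on

# POSIX mode triggered only by the environment variable.
( POSIXLY_CORRECT=1; export POSIXLY_CORRECT; bash_posix_mode )   # prints: on
```

The subshells keep the variable changes from leaking into the calling shell.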

For GNU coreutils, however, the behavior differences seem rather minor, and I couldn't find exhaustive documentation. I may as well try to back this up with more than conjecture, since we're already this deep in the thread. Let's dig into GNU coreutils and see what POSIXLY_CORRECT appears to do as of current git HEAD:

- cp: Allow the destination to be a dangling symlink when POSIXLY_CORRECT is set.

- dd: Does not trap SIGINFO if it's equal to SIGUSR1 (default) and POSIXLY_CORRECT is set. I guess this means that POSIXLY_CORRECT makes the `pkill -USR1 dd` thing not work?

- df: Use 512-byte block size if POSIXLY_CORRECT is set, otherwise 1024.

- echo: POSIXLY_CORRECT disallows parsing options unless the first option is `-n`, and enables parsing "v9"-style interpretation of backslash escapes. Demonstration: `$(which echo) -e \\n`

- id: Will not print SELinux context even when --context is passed. Not sure why. This is the only thing I've seen that explicitly disables functionality.

- nohup: The exit code for internal failures is 127 instead of 125 when POSIXLY_CORRECT is set.

- pr: Changes default date format when POSIXLY_CORRECT is set.

- printf: POSIXLY_CORRECT disables a warning about ignored characters following a character constant. Demonstration: `$(which printf) %x "'xx"` - same output in both modes, but in POSIXLY_CORRECT you are not warned about the second x being ignored.

- pwd: Defaults to using -L ("logical" mode, uses $PWD value as long as it refers to the CWD) instead of -P.

- readlink: Defaults to --verbose if POSIXLY_CORRECT is set.

- sort: Allow options to be parsed after file operands if POSIXLY_CORRECT is not set.

- touch: Seems to disable some kind of warning when an invalid date is passed.

- uniq: Seems to be the same as sort.

- wc: Treats non-breaking space characters as word delimiters if POSIXLY_CORRECT is unset.

I believe this is an exhaustive list as of GNU coreutils f4dcc2a495c390296296ad262b5a71996d0f6a86.
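
If anyone wants to poke at these themselves, a rough sketch of a comparison helper follows. The name `compare_posix` is made up for illustration, and it goes through `env` so that the external binary runs rather than a shell builtin. What it prints depends on which coreutils implementation and version is installed, so no expected output is shown.

```shell
# Hypothetical helper: run the same external command with POSIXLY_CORRECT
# unset and then set, labelling each result. stderr is captured as well so
# that warnings (such as the printf one) are visible.
compare_posix() {
  printf 'default: %s\n' "$( unset POSIXLY_CORRECT; env "$@" 2>&1 )"
  printf 'posix:   %s\n' "$( POSIXLY_CORRECT=1; export POSIXLY_CORRECT; env "$@" 2>&1 )"
}

# The echo and printf demonstrations from the list above.
compare_posix echo -e 'a\nb'
compare_posix printf '%x\n' "'xx"
```

Other entries from the list can be tried the same way, e.g. `compare_posix df .` or `compare_posix pwd`.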

chasil 2 days ago | parent [-]

I still run some rhel5, and there were quite a few standard options that were not implemented by GNU.

Looking now is good, but looking in the past is also illuminating.

I generally trust busybox to give me both a uniform and compliant userland, certainly more than rhel5 coreutils.

jchw 2 days ago | parent [-]

I only chose the latest version because I figured it would have the most POSIXLY_CORRECT effects. Documentation seems to confirm this: the NEWS file documents added effects over time, but not removed ones, it seems.

I wouldn't necessarily be surprised if GNU coreutils from RHEL5 is old enough to be missing some options needed to comply with POSIX, or if it complied with older POSIX standards, but I think we're losing track here. GNU coreutils maintains essentially all of its functionality and options when in POSIXLY_CORRECT mode. There's really just a handful of differences, and they are mostly around edge cases that few people care about except insofar as they break scripts, especially since in most cases people will be using shell builtins instead, which don't follow the POSIX behavior anyway.

I am not really arguing for or against GNU coreutils. I currently use GNU coreutils and would prefer GNU-compatible coreutils on my systems purely as a matter of muscle memory, regardless of whether it is in POSIX compatibility mode or not. That said, I don't think GNU coreutils are necessarily anything special, and the utilities that macOS and BusyBox provide are almost always perfectly fine with me, with some minor exceptions. I'm sure the same will be true if I ever try a uutils-based system. The only point to be made here is that, at least as far as coreutils go, it doesn't really seem like POSIX compliance is a hindrance. If anything stopped macOS from using GNU coreutils, I suggest it's probably the desire to avoid having more GPL software in macOS (especially post-GPLv3), though there could be multiple factors at play.

chasil 2 days ago | parent [-]

And I'll get back to my original point.

When Debian demoted bash and migrated to the Almquist shell, there was great anguish from Ubuntu users of all tiers (Adobe was notable, IIRC).

That anguish was due to a decade that predated POSIX.2.

That was a substantial hindrance.

jchw 2 days ago | parent [-]

I absolutely remember Debian switching to Almquist shell, but that was about more than just POSIX compliance. IIRC a big deal at the time was that the cost of starting and using GNU bash all over the place was actually a measurable performance impact, and switching to dash improved on this. Also, bashisms became pervasive in scripts with /bin/sh hashbangs, which is definitely wrong no matter how you feel about POSIX.
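
To make the bashism point concrete, here is a small sketch. The `[[ ... ]]` test is just one common example of a bash-only construct, and the function name `starts_with` is made up for illustration. The POSIX version uses `case`, which dash and other `/bin/sh` implementations support.

```shell
# Bashism (not valid in a plain POSIX /bin/sh such as dash):
#   if [[ $name == foo* ]]; then ...; fi
#
# Portable equivalent: succeed if $1 begins with the prefix $2.
starts_with() {
  case $1 in
    "$2"*) return 0 ;;
    *)     return 1 ;;
  esac
}

if starts_with "foobar" "foo"; then
  echo "prefix matches"
fi
```

Quoting `"$2"` inside the pattern keeps any glob characters in the prefix literal.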

And anyway, this tangent doesn't feel terribly connected to this discussion thread since macOS never had this issue in the first place and this particular discussion thread was never really about UNIX shells...