skissane 2 days ago

> In some ways, Apple's adherence to UNIX specifications probably makes macOS less useful for you. For example, I wish that grep on macOS was closer to GNU grep. When I look up commands online, I often find answers based on the GNU implementations. Those often work on macOS, but sometimes don't (or have subtly different behavior) because macOS is adhering to the UNIX specification rather than to what those utilities do on the vast majority of systems out there.

UNIX certification is not the reason macOS utilities are missing options compared to GNU. UNIX standards say you have to have certain options that work a certain way; they don’t prohibit adding additional options as vendor extensions. The reason is that Apple’s investment in improving these tools is minimal because it is a low priority for them, and because people who get annoyed by this often just end up installing the GNU tools anyway (using Homebrew or MacPorts).

In fact, GNU/Linux systems have been certified as UNIX in the past by a couple of different Chinese vendors (Inspur K-UX, Huawei EulerOS), which shows that use of the GNU tools is no inherent obstacle to certification. The reason these vendors stopped, I suspect, is that the money it was making them was less than the certification costs and UNIX trademark license fee.

jchw 2 days ago | parent | next [-]

Pretty sure GNU coreutils really does intentionally deviate from POSIX compliance in a handful of places, otherwise POSIXLY_CORRECT wouldn't exist. That said you're probably right, though I also suspect dealing with GPL licensing is another major reason they don't bother with things like GNU coreutils. (Obviously they definitely wouldn't have done it after coreutils switched to GPLv3, but I'm sure even before then they would've greatly preferred permissively-licensed software.)

chasil 2 days ago | parent | next [-]

There is some subtlety that you are missing here.

Outside of coreutils, let's consider bash and ksh88.

The two have differing behavior in several areas (coprocesses, alias handling, final pipeline fork, etc.), but this divergence in behavior happened before POSIX.2 and the standardization of the POSIX shell, which is largely a subset of ksh88.

The gist is that activating a mode for POSIX compliance will generally remove functionality, because the standardization happened a decade after development began, and the standards themselves were excessively conservative in adherence to System V.

I've seen that useful GNU extensions are generally adopted by BSD, but much more slowly by POSIX.

That does not serve UNIX well. Someone should challenge the Austin Group for effective control of UNIX standardization.

jchw 2 days ago | parent [-]

AFAIK, enabling POSIXLY_CORRECT doesn't get rid of any functionality. It changes some very subtle behaviors, such as the way certain argument parsing edge cases would be handled.

Anyway, I think this is somewhat a non-issue: even if bash doesn't fully comply with POSIX standards by default, it should still be possible to be POSIX compliant by delivering a compliant shell in the right place. Though this does make me wonder if there's anything in POSIX that would require the user's default login shell to be POSIX-compliant, Bourne shell compatible. Probably not, right? After all, macOS had been using bash for ages with no issues complying.

chasil 2 days ago | parent [-]

Nope nope nope.

You can see this in a script by defining:

  alias p=printf

Then try to use it with bash. If bash is running as #!/bin/sh, then it will work, because bash is forced into POSIX mode.

However, if the script is running as #!/bin/bash, then you will be in the '80s behavior, and it will fail.

Try it.
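A minimal sketch of that experiment, assuming bash and mktemp are installed and that no real command named p is on the PATH. The helper name alias_demo is made up for illustration. It runs the same two-line script once in bash's default mode and once with --posix, which is the mode bash enters when it is invoked as sh:

```shell
#!/bin/sh
# alias_demo is a hypothetical helper, not part of bash.
# $1 is an optional bash flag such as --posix (pass "" for none).
alias_demo() {
    script=$(mktemp) || return 1
    # Two-line script: define the alias, then use it on the next line.
    printf '%s\n' 'alias p=printf' "p 'alias expanded\n'" > "$script"
    # $1 is left unquoted on purpose so an empty value adds no argument.
    bash $1 "$script" 2>/dev/null
    status=$?
    rm -f "$script"
    return "$status"
}

# Default mode: non-interactive bash does not expand aliases,
# so the script fails when it reaches "p".
alias_demo "" || echo "default mode: alias was not expanded"

# POSIX mode: alias expansion is always enabled, so the script prints its message.
alias_demo --posix
```

Adding `shopt -s expand_aliases` near the top of a #!/bin/bash script should give the same result as POSIX mode for this particular case.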

jchw 2 days ago | parent [-]

Bash isn't part of GNU coreutils.

chasil 2 days ago | parent [-]

I realize that, but I'm illustrating that POSIX.2 required a retrofit to bash, and probably required similar adjustments to the rest of userland, including coreutils.

jchw 2 days ago | parent [-]

I knew that bash behaves quite differently in POSIX mode, but that isn't much of a problem in most cases, since nobody is forcing you to use a POSIX-compatible Bourne shell as your login shell or for scripting. It's just the shell that you can guarantee will exist if something is POSIX compliant, right? If I were addressing bash, I would've said set -o posix instead of POSIXLY_CORRECT. (I didn't even realize POSIXLY_CORRECT did anything to bash.)
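For what it's worth, here is a quick way to check both switches. This is only a sketch: it assumes bash is on the PATH and that POSIXLY_CORRECT is not already set in the calling environment, and bash_posix_state is a made-up helper name:

```shell
#!/bin/sh
# bash_posix_state is a hypothetical helper: it starts a fresh bash with the
# given environment assignments and reports whether POSIX mode is active.
bash_posix_state() {
    env "$@" bash -c 'if shopt -qo posix; then echo on; else echo off; fi'
}

# No special environment: bash starts in its default (non-POSIX) mode.
echo "plain bash:        $(bash_posix_state)"

# POSIXLY_CORRECT in the environment at startup switches bash to POSIX mode.
echo "POSIXLY_CORRECT=1: $(bash_posix_state POSIXLY_CORRECT=1)"

# set -o posix does the same thing from inside a running shell.
echo "set -o posix:      $(bash -c 'set -o posix; shopt -qo posix && echo on')"
```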

The GNU bash documentation covers the differences pretty well:

https://www.gnu.org/software/bash/manual/html_node/Bash-POSI...

For GNU coreutils, however, the behavior differences seem rather minor, and I couldn't find exhaustive documentation. I may as well try to back this up with more than conjecture, since we're already this deep in the thread. Let's dig into GNU coreutils and see what POSIXLY_CORRECT appears to do as of current git HEAD:

- cp: Allow the destination to be a dangling symlink when POSIXLY_CORRECT is set.

- dd: Does not trap SIGINFO if it's equal to SIGUSR1 (default) and POSIXLY_CORRECT is set. I guess this means that POSIXLY_CORRECT makes the `pkill -USR1 dd` thing not work?

- df: Use 512-byte block size if POSIXLY_CORRECT is set, otherwise 1024.

- echo: POSIXLY_CORRECT disallows parsing options unless the first option is `-n`, and enables parsing "v9"-style interpretation of backslash escapes. Demonstration: `$(which echo) -e \\n`

- id: Will not print SELinux context even when --context is passed. Not sure why. This is the only thing I've seen that explicitly disables functionality.

- nohup: The exit code for internal failures is 127 instead of 125 when POSIXLY_CORRECT is set.

- pr: Changes default date format when POSIXLY_CORRECT is set.

- printf: POSIXLY_CORRECT disables a warning about ignored characters following a character constant. Demonstration: `$(which printf) %x "'xx"` - same output in both modes, but in POSIXLY_CORRECT you are not warned about the second x being ignored.

- pwd: Defaults to using -L ("logical" mode, uses $PWD value as long as it refers to the CWD) instead of -P when POSIXLY_CORRECT is set.

- readlink: Defaults to --verbose if POSIXLY_CORRECT is set.

- sort: Allow options to be parsed after file operands if POSIXLY_CORRECT is not set.

- touch: Seems to disable some kind of warning when an invalid date is passed.

- uniq: Seems to be the same as sort.

- wc: Treats non breaking space characters as word delimiters, if POSIXLY_CORRECT is unset.

I believe this is an exhaustive list as of GNU coreutils f4dcc2a495c390296296ad262b5a71996d0f6a86.
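The echo entry is probably the easiest one to see for yourself. A small sketch, assuming the echo found on the PATH is the GNU coreutils one (other implementations will behave differently); going through env bypasses the shell's builtin echo:

```shell
#!/bin/sh
# Call the external echo through env so the shell builtin is not used.
# Assumption: the echo on PATH is from GNU coreutils.

# Default mode: -e is parsed as an option, and \t is turned into a tab.
env echo -e 'a\tb'

# POSIXLY_CORRECT set: -e is printed as ordinary text (only a leading -n
# is still treated as an option), and backslash escapes are always interpreted.
POSIXLY_CORRECT=1 env echo -e 'a\tb'
```

With GNU coreutils, the first call should print a, a tab, and b, while the second should print the same thing prefixed with a literal "-e ".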

chasil 2 days ago | parent [-]

I still run some rhel5, and there were quite a few standard options that were not implemented by GNU.

Looking now is good, but looking in the past is also illuminating.

I generally trust busybox to give me both a uniform and compliant userland, certainly more than rhel5 coreutils.

jchw 2 days ago | parent [-]

I only chose the latest version because I figured it would have the most POSIXLY_CORRECT effects. Documentation seems to confirm this: the NEWS file documents added effects over time, but not removed ones, it seems.

I wouldn't necessarily be surprised if GNU coreutils from RHEL5 is old enough to be missing some options needed to comply with POSIX, or if it complied with older POSIX standards, but I think we're losing track here. GNU coreutils maintains essentially all of its functionality and options when in POSIXLY_CORRECT mode. There's really just a handful of differences, and they are mostly around edge cases that few people care about except insofar as they break their scripts, especially since in most cases people will be using shell builtins instead, which don't follow the POSIX behavior anyway.

I am not really arguing for or against GNU coreutils. I currently use GNU coreutils and would prefer GNU-compatible coreutils on my systems purely as a matter of muscle memory, regardless of whether it is in POSIX compatibility mode or not. That said, I don't think GNU coreutils are necessarily anything special, and the utilities that macOS and BusyBox provide are almost always perfectly fine with me, with some minor exceptions. I'm sure the same will be true if I ever try a uutils-based system. The only point to be made here is that at least as far as coreutils go, it doesn't really seem like POSIX compliance is a hindrance. If anything stopped macOS from using GNU coreutils, I suggest it's probably to avoid having more GPL software in macOS (especially post-GPLv3.) Though there could be multiple factors at play.

chasil 2 days ago | parent [-]

And I'll get back to my original point.

When Debian demoted bash and migrated to the Almquist shell, there was great anguish from Ubuntu users of all tiers (Adobe was notable, IIRC).

That anguish was due to a decade of shell behavior that predated POSIX.2.

That was a substantial hindrance.

jchw 2 days ago | parent [-]

I absolutely remember Debian switching to Almquist shell, but that was about more than just POSIX compliance. IIRC a big deal at the time was that the cost of starting and using GNU bash all over the place was actually a measurable performance impact, and switching to dash improved on this. Also, bashisms became pervasive in scripts with /bin/sh hashbangs, which is definitely wrong no matter how you feel about POSIX.
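To illustrate the kind of bashism I mean, here is a hedged sketch of one common case: a prefix test written with bash's [[ ... ]], next to a portable equivalent that should also run under dash. The function name starts_with is made up for the example:

```shell
#!/bin/sh
# Bashism that breaks when /bin/sh is dash:
#     [[ $name == foo* ]] && echo "match"
#
# Portable replacement using case, which every POSIX shell supports.
# starts_with is a hypothetical helper: succeeds if $1 begins with $2.
starts_with() {
    case $1 in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

name=foobar
if starts_with "$name" foo; then
    echo "match"
fi
```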

And anyway, this tangent doesn't feel terribly connected to this discussion thread since macOS never had this issue in the first place and this particular discussion thread was never really about UNIX shells...

skissane 2 days ago | parent | prev [-]

> Pretty sure GNU coreutils really does intentionally deviate from POSIX compliance in a handful of places, otherwise POSIXLY_CORRECT wouldn't exist.

To get UNIX certification, you can just patch it to make POSIXLY_CORRECT=1 the default.

Or even don’t patch the utilities, and just patch /etc/profile to set POSIXLY_CORRECT=1 globally.

UNIX certification requires that the system have a mode of operation available which passes the test suite; the existence of config settings which if changed from their defaults produce standards violations is not in itself a standards violation.

jchw 2 days ago | parent [-]

The point is that the default build of GNU coreutils in the default configuration is not POSIX compliant, not that it can't be made to be POSIX compliant. Obviously it can be done, otherwise that environment variable would not exist.

pornel 2 days ago | parent | prev [-]

Apple got spooked by GPL v3 anti-tivoization clauses and stopped updating GNU tools in 2007.

macOS still has a bunch of GNU tools, but they appear to be incompatible with GNU tools used everywhere else, because they're so outdated.

wkat4242 2 days ago | parent [-]

And Apple is doing a lot of Tivoization these days. They're not yet actually stopping apps that they haven't "notarized" but they're not making it easier. One of the many reasons I left the Mac platform, both privately and at work. The other reason was more and more reliance on the iCloud platform for new features (many of its services don't work on other OSes like Windows and Linux, and I use all of those too).

The problem with the old tools is that I don't have admin rights at work so it's not easy to install coreutils. Or even homebrew.

I can understand why they did it though. Too many tools these days advocate just piping some curl into a root shell which is pretty insane. Homebrew does this too.

flocked 2 days ago | parent [-]

Couldn't you simply use macOS without the iCloud features? Which features require iCloud to work?

wkat4242 2 days ago | parent [-]

You can but there's just not much point anymore.

I don't remember all the specifics, but every time there was a new macOS I could cross most of the new features off. Nope, this one requires iCloud or an Apple ID. Nope, this one only works with other Macs or iPhones. Stuff like that. The Mac didn't use to be a walled garden. You can still go outside of their ecosystem (unlike on iOS) but then there's not much point. You're putting a square peg in a round hole.

Now, Apple isn't the only one doing this. Microsoft is making it ever harder to use Windows without a Microsoft account. That's why I'm gravitating more and more to FOSS OSes. But there are new problems now: with Firefox on Linux I constantly get captcha'd. M365 (work) blocks random features or keeps signing me out. My bank complains my system is not 'trusted'. Uh, what about trusting your actual customers instead of a megacorp? I don't want my data locked in or monitored by a commercial party.