justinfrankel 11 hours ago

I have multiple macOS machines with 600-1000+ day uptimes that make TCP connections at least once a minute, and they are still expiring their TIME_WAIT connections as normal.

these kernel versions:

Darwin Kernel Version 20.6.0: Thu Jul 6 22:12:47 PDT 2023; root:xnu-7195.141.49.702.12~1/RELEASE_ARM64_T8101 arm64

Darwin Kernel Version 17.7.0: Wed Apr 24 21:17:24 PDT 2019; root:xnu-4570.71.45~1/RELEASE_X86_64 x86_64

so... wonder what that's about?

justinfrankel 11 hours ago

ah reading their analysis, there are errors that explain this. Particularly this:

  tcp_now   = 4,294,960,000  (frozen at pre-overflow value)
  timer     = 4,294,960,000 + 30,000 = 4,294,990,000
              (exceeds uint32 max → wraps to a small number)
timer wraps to a small number, they say

  TSTMP_GEQ(4294960000, 4294990000)
they forgot to wrap it there, it should be TSTMP_GEQ(4294960000, small_number)

  = (int)(4294960000 - 4294990000)
  = (int)(-30000)
  = -30000 >= 0 ?  → false!
wrong!

There may be a short time period where this bug occurs, and if you get enough TCP connections into TIME_WAIT in that period, they could stick around, maybe. But I think the original post is completely overreacting and was probably written by an LLM, lol.
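For reference, the quoted comparison can be replayed with the 32-bit wrap applied explicitly. This is only a sketch: it assumes TSTMP_GEQ(a, b) has the form `((int)((a) - (b)) >= 0)`, which matches the `(int)(a - b)` expansion quoted above, and the helper names `to_int32` and `tstmp_geq` are made up for illustration. Under that assumption the wrapped timer is 22704 and the signed difference still comes out to -30000, so the wrap by itself does not change the result while tcp_now stays at the frozen value.

```python
MASK32 = 0xFFFFFFFF  # emulate uint32_t arithmetic in Python


def to_int32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def tstmp_geq(a, b):
    """Assumed form of TSTMP_GEQ: ((int)((a) - (b)) >= 0)."""
    return to_int32(a - b) >= 0


tcp_now = 4_294_960_000              # frozen value from the quoted analysis
timer = (tcp_now + 30_000) & MASK32  # wraps past 2^32 to a small number
diff = to_int32(tcp_now - timer)     # signed difference the macro tests
expired = tstmp_geq(tcp_now, timer)

print(timer, diff, expired)
```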

Aloisius 9 hours ago

There does appear to be a bug, but it's not what the blog describes.

If tcp_now stops updating at <= 2^32 - 30000 milliseconds, then TSTMP_GEQ(tcp_now, timer) will always fail since timer is tcp_now + 30000 which won't wrap.

This does look possible, since calculate_tcp_clock(), which updates tcp_now, only runs when there's TCP traffic. So if at 49 days uptime you halted all TCP traffic and waited about a day, tcp_now would be stuck at the value it had before you halted TCP traffic.

In cases where tcp_now gets stuck at > 2^32 - 30000, it looks like TCP sockets in TIME_WAIT will end up being closed immediately instead of waiting 30 seconds, which isn't great either.
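The difference between a running and a stuck tcp_now can be sketched with a small simulation. It assumes the same `((int)((a) - (b)) >= 0)` form for TSTMP_GEQ and a timer of tcp_now + 30000; the function `time_wait_expires` and its parameters are hypothetical and model only the macro comparison, not any other XNU code path (so it says nothing about the immediate-close case).

```python
MASK32 = 0xFFFFFFFF  # emulate uint32_t arithmetic in Python


def to_int32(x):
    """Reinterpret the low 32 bits of x as a signed 32-bit integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def tstmp_geq(a, b):
    """Assumed form of TSTMP_GEQ: ((int)((a) - (b)) >= 0)."""
    return to_int32(a - b) >= 0


def time_wait_expires(start, frozen, window_ms=60_000, step_ms=1_000):
    """Return the elapsed ms at which the timer first reads as expired,
    or None if it never does within the window (hypothetical helper)."""
    timer = (start + 30_000) & MASK32
    for elapsed in range(0, window_ms + 1, step_ms):
        # A frozen clock keeps reporting the start value forever.
        tcp_now = start if frozen else (start + elapsed) & MASK32
        if tstmp_geq(tcp_now, timer):
            return elapsed
    return None


near_wrap = 4_294_960_000  # less than 30 s before the 2^32 rollover
running = time_wait_expires(near_wrap, frozen=False)
stuck = time_wait_expires(near_wrap, frozen=True)
print(running, stuck)
```

With a clock that keeps advancing, the timer expires after 30 seconds even though both values cross the rollover; with a stuck clock the comparison never becomes true within the window.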

fingerlocks an hour ago

Are you sure?

tcp_now cannot actually reach 2^32, because that value exceeds the bit width of the data type.

Therefore, tcp_now + 30000 will wrap when tcp_now is equal to 2^32 - 30000. Your inequality sign should be a strict <, otherwise the result does not follow.

justinfrankel 8 hours ago

yep that makes sense

comex 11 hours ago

The bug was introduced only last year in macOS 26:

https://github.com/apple-oss-distributions/xnu/blame/f6217f8...

plorkyeran 10 hours ago

> Apple Community #250867747: macOS Catalina — "New TCP connections can not establish." New connections enter SYN_SENT then immediately close. Existing connections unaffected. Only a reboot fixes it.

This is a weird thing to cite if it's a macOS 26 bug. I quite regularly go over 50 days of uptime without issues so it makes sense for it to be a new bug, and maybe they had different bugs in the past with similar symptoms.

Aloisius 11 hours ago

Interesting. The article mentions forum complaints from people running Catalina, so that must be something else.

js2 10 hours ago

As someone who has also operated fleets of Macs for years now, I can say there is no possible way this bug predates macOS 26. If the bug description is correct, it must be a new one.

groby_b 10 hours ago

The article is written using AI, so unless you verified the complaints, the safe default assumption is that they don't exist.

Aloisius 10 hours ago

The thread definitely exists, but it could be a completely unrelated issue.

https://discussions.apple.com/thread/250867747