| ▲ | veltas 4 hours ago |
| From the ANSI C standard: 3.16 undefined behavior: Behavior, upon use of a nonportable or erroneous program construct, of erroneous data, or of indeterminately valued objects, for which this International Standard imposes no requirements. Permissible undefined behavior ranges from ignoring the situation completely with unpredictable results, to behaving during translation or program execution in a documented manner characteristic of the environment (with or without the issuance of a diagnostic message).
Is it just me or did compiler writers apply an overly legalistic interpretation to the "no requirements" part in this paragraph? The intent here is extremely clear: undefined behavior means you're doing something not intended or specified by the language, but the consequence of this should be somewhat bounded or as expected for the target machine. This is closer to our old school understanding of UB. By 'bounded', this obviously ignores the security consequences of e.g. buffer overflows, but just because UB can be exploited doesn't mean it's appropriate for e.g. the compiler to exploit it too; that clearly violates the intent of this paragraph. |
|
| ▲ | thomashabets2 an hour ago | parent | next [-] |
| Author here. I touched on this in the "it's not about optimizations" section. It's not that the compiler is out to get you. It's that you told it to do something it cannot express. It's like if you slipped in a word in French, and not being programmed for French, it misheard the word as a false friend in English. The compiler had no way to represent the French word in its parse tree. So no, it's not overly legalistic. Like, if the compiler knows that this hardware can do unaligned memory access, but not atomic unaligned access, should it check for alignment in std::atomic<int> ptr but not in int ptr? Probably not, right? |
| |
| ▲ | veltas 13 minutes ago | parent [-] | | It's not that your article specifically discusses this aspect, but I think it's an important part of the conversation that's being overlooked by commentators, that we've twisted the original intent of UB and made unnecessary work for ourselves. There's been too much scaremongering about UB that's gone beyond the real concerns. If you only fear UB and don't understand it then you are worse off for trying to write safe C or C++. |
|
|
| ▲ | dataflow 4 hours ago | parent | prev | next [-] |
| > but that the consequence of this should be somewhat bounded or as expected for the target machine. Aren't "unpredictable results" and "no requirements" contrary to the idea that the behavior would be "somewhat bounded"? |
| |
| ▲ | veltas 4 hours ago | parent [-] | | Notice though "ignoring the situation" through "documented manner characteristic of the environment". Even though you truly can read this in an uncharitable way, you could also try to understand the intent of this paragraph, and I think reading it for its intent is always the best way to interpret a language standard when the wording is ambiguous or soft, especially if you're writing a compiler. I don't think you could sincerely argue that this definition intends to allow the compiler to totally rewrite your code because of one guaranteed UB detected on line 5, just that it would be good to print a diagnostic if it can be detected, and if not to do what's "characteristic of the environment". Does that make sense? | | |
| ▲ | gpderetta 4 hours ago | parent | next [-] | | Ex falso quodlibet. Bounding UB would be a nice idea, or at least prohibiting time-traveling UB (and there is an effort in that direction). But properly specifying it is actually hard. | | |
| ▲ | account42 an hour ago | parent [-] | | Prohibiting "time-travelling" UB would be horrible as that's a very important mechanism for dead code elimination. | | |
| ▲ | dzaima 13 minutes ago | parent [-] | | Even if you forbid "time travel", you can still technically optimize many things as if time travel happened anyway - e.g. want to time-travel back to before some memory store? just pretend that the store happened, but then afterwards the previous value was stored back (and no other threads happen to see the intermediate value)! Only things you need to worry about then are things with actual observable side-effects - volatile, printf and similar - and C23 does note that all observable behavior should happen even if UB follows, and compilers can't generally optimize function calls anyway (e.g. on systems on which you can define custom printf callbacks, you could put an exit(0) in such, and thus make it incorrect to optimize out a printf ever). |
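A minimal sketch of the distinction drawn in the comment above, using a hypothetical function `log_then_divide` that does not appear in the thread. The `printf` call is an observable side effect that comes before a division which is undefined when `b` is 0. Under the reading described in the comment, the output should still appear in that case, while what happens after it is unconstrained. This illustrates the argument only; it makes no claim about what any particular compiler does.

```c
#include <stdio.h>

/* Hypothetical function for illustration; not from the thread. */
int log_then_divide(int a, int b)
{
    /* Observable side effect: an I/O call the compiler generally
       cannot see through or remove. */
    printf("dividing %d by %d\n", a, b);

    /* Undefined behavior when b == 0 (and for INT_MIN / -1). */
    return a / b;
}
```

With well-defined inputs such as `log_then_divide(10, 2)`, the function prints its message and returns 5.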
|
| |
| ▲ | cracki 4 hours ago | parent | prev [-] | | Reading for intent is pragmatic. Reading adversarially is what people do who are looking for ways that something can be abused, from an offensive or defensive position. Personally I am tired of the entire topic. | | |
| ▲ | veltas 3 hours ago | parent [-] | | What's bad is when your compiler writers and most of the people involved in standardisation are reading it adversarially. | | |
| ▲ | account42 an hour ago | parent [-] | | It's bad when compiler writers want to optimize correct code as much as possible, which is something their actual customers keep asking for? | | |
| ▲ | veltas 6 minutes ago | parent [-] | | When would optimizing correct code be harmed by not abusing UB (beyond its original intent, e.g. array access should be without overhead of checking for overflow)? |
|
|
|
|
|
|
| ▲ | 1718627440 34 minutes ago | parent | prev | next [-] |
| The behaviour is bounded by the capability of your machine. It is unlikely that your desktop computer launches a nuclear missile, unless you worked for it to be able to do that. |
|
| ▲ | lelanthran 3 hours ago | parent | prev [-] |
| > Is it just me or did compiler writers apply overly legalistic interpretation to the "no requirements" part in this paragraph? I've (fruitlessly) had this discussion on HN before - super-aggressive optimisations for diminishing rewards are the norm in modern compilers. In old C compilers, dereferencing NULL was reliable - the code that dereferenced NULL would always be emitted. Now, dereferencing NULL is not reliable, because the compiler may remove that and the program may fail in ways not anticipated (i.e., no access is attempted to memory location 0). The compiler authors are on the standards committee, and they tend to push for more cases of UB being added rather than removing what UB there is right now (for example, by replacing it with Implementation Defined Behaviour). |
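A minimal sketch of the NULL-dereference pattern described above, using two hypothetical functions that do not appear in the thread. In `read_then_check` the dereference comes first, so an optimizing compiler is allowed to assume `p` is non-null and may drop the later check. In `check_then_read` every path is well defined. Whether a given compiler actually removes the check depends on its version and flags.

```c
#include <stddef.h>

/* Hypothetical example: the dereference precedes the null check, so the
   compiler may treat the check as unreachable and remove it. */
int read_then_check(const int *p)
{
    int value = *p;      /* undefined behavior if p is NULL */
    if (p == NULL) {     /* may be optimized away */
        return -1;
    }
    return value;
}

/* Checking before dereferencing keeps every path well defined. */
int check_then_read(const int *p)
{
    if (p == NULL) {
        return -1;
    }
    return *p;
}
```

Passing a valid pointer to either function returns the pointed-to value. Only `check_then_read(NULL)` has a defined result, which is -1.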