This is a common misconception (or poor way of phrasing it, sorry). Compiler implementers don't go looking for instances of undefined behavior in a program with the goal of optimizing it in some way. There is little value in optimizing invalid code. The opposite is the case.
But we must write code that relies on the same rules and requirements that programs are held to (and vice versa). When either party breaks those rules, either accidentally or deliberately, bad things happen.
What sometimes happens is that code written years or decades ago, which relies on the absence of an explicit guarantee in the language, suddenly stops working because a compiler change depends on the assumption that code doesn't rely on that absence. That can happen as a result of improving optimizations, which is often, but not necessarily always, motivated by improving the efficiency of programs. Better analysis can also help find bugs in code or avoid issuing warnings for safe code.
The fact that the Standard does not impose requirements upon how a piece of code behaves implies that the code is not strictly conforming, but the notion that it is "invalid" runs directly contrary to the intentions of the C89 and C99 Standards Committees, as documented in the published C99 Rationale. That document recognizes Undefined Behavior as, among other things, "identifying avenues of conforming language extension". Code that relies upon such extensions may be non-portable, but the authors of the Standard have expressly said that they did not wish to demean useful programs that happen to be non-portable.
There are rules and requirements documented in the spec, and there are de-facto rules and requirements that programs expect. Not only that, but when compilers do exploit these rules, the code they generate is often obviously incorrect, and could have been flagged at compile time.
Right now, it seems like compiler vendors are playing a game of chicken with their users.
I think the issue is that many of these "obviously incorrect" things are not obvious at the level that the optimizations are taking place. Perhaps it would be worth considering adding higher-level passes in the compiler that can detect these kinds of surprising changes and warn about them.
Well, no, the issue is that the compiler writers refuse to acknowledge that these obviously incorrect things are incorrect in the first place, and tend to blame users for tripping over compiler bugs. If it were just that they didn't know how to fix said bugs, that would be a qualitatively different and much less severe problem.
> not obvious at the level that the optimizations are taking place
Hmm...then it's up to the optimisers to up their game.
Optimisation is supposed to be behaviour-preserving. Arguing that almost all real-world programs invoke UB and therefore don't have well-defined behaviour (by the standard as currently interpreted) is a bit of a cop-out.
> This is a common misconception (or poor way of phrasing it, sorry). Compiler implementers don't go looking for instances of undefined behavior in a program with the goal of optimizing it in some way. There is little value in optimizing invalid code. The opposite is the case.
Compilers do deliberately look to optimize loops with signed counters by exploiting UB to assume that they will never wrap.
Compiler implementers are happy when they don't have to care about some edge case, because then the code is simpler. Thus, it is only for unsigned counters that there is extra logic to compile the wrapping behavior correctly.
That is my interpretation of "The opposite is the case". Writing a compiler is easier with lots of undefined behavior.
But that's backwards, the compiler writers are writing special cases to erase checks in the signed case. Doing the 'dumb' thing and mindlessly going through the written check is simpler which is why that's what compilers did for decades as de facto standard on x86.
The dumb thing is a non-optimizing compiler. GCC and LLVM contain many optimization phases. It is probably some normal optimization which is only "wrong" in the context of loop conditions.
Well yes, they assume they never wrap because that is not allowed by the language, by definition.
UB is the result of broken preconditions at the language level.