Can you find a real example of “time travel” caused by undefined behaviour? [closed]

I am curious. Does anyone know a non-hypothetical C or C++ counterexample to the myth that “the impact of the undefined behaviour is limited to code which runs after the line with undefined behaviour”? In other words I want to see an example of undefined behaviour “travelling back in time” and causing an externally visible effect. By non-hypothetical I mean an example of a real, production-grade compiler (preferably a recent-ish version too) violating this assumption.

I want to see an example where something is printed to stdout (or stderr) that would not be printed if the myth were true, or where something that would be printed is not, or where something is printed but the text is wrong. Why the insistence on printing to stdout? I have seen some examples that use writing to a volatile variable as an “externally observable” side effect. I am not very convinced by those examples because I do not see how I would be able to actually observe those writes in the common case of a regular process in user-space (i.e. not the OS kernel, a device driver or embedded code).

Edit: Yakk – Adam Nevraumont pointed out that writing to stdout is done via library calls that the compiler cannot reason about. It occurred to me that there exists another form of interprocess communication that I think the compiler might be able to reason about: atomic writes to memory shared with another process. Unfortunately, it sounds like interprocess communication via shared memory is not codified in the normative parts of the C++ standard.

The printing has to be done by code that is executed before the line responsible for undefined behaviour. I am not interested in examples of wrong output being printed after the line with undefined behaviour has already been executed. Note: since it does not make sense to talk about execution of a program whose behaviour is undefined, please insert “if the behaviour were not undefined” where appropriate when interpreting this paragraph.

Edit: I found a better way to word this: I am asking for an example that would demonstrate a difference between the real undefined behaviour “all bets are off as soon as the program gets an input for which the behaviour of the program is undefined” and a fictional version of undefined behaviour “the program executes normally until it hits the line that causes undefined behaviour and after that all bets are off”. I want to see an example of the code that would have executed normally if the myth were true doing something strange instead. Code that goes “after” the bad line misbehaving in some way can be explained by nasal daemons under both the correct and the incorrect mental model of undefined behaviour and is thus not a counterexample.

Here is the closest I was able to find:

void bar (void);

int a;

void foo3 (unsigned y, unsigned z)
{
    bar();
    a = y%z;
}

According to the blog linked above some (unknown) version of clang compiles this to:

foo3:
  pushl   %ebp
  movl    %esp, %ebp
  pushl   %esi
  subl    $4, %esp
  movl    8(%ebp), %eax
  xorl    %edx, %edx
  divl    12(%ebp)         <-- might crash the program
  movl    %edx, %esi
  call    bar              <-- might be side effecting
  movl    %esi, a
  addl    $4, %esp
  popl    %esi
  popl    %ebp
  ret

I was able to replicate a similar assembly output in clang 3.3 on godbolt, but not on more recent versions of clang.

I believe this example to be a compiler bug. Clang appears to assume that bar() is guaranteed to return, but C has exit() and longjmp(), so the code after the call to bar() may be dead code. The behaviour of a program is undefined only if the execution state that does a bad thing is reachable.
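To make the dead-code argument concrete, here is a minimal C++ sketch of my own. The helper `run_foo3_with_zero_divisor` is a made-up name, and throwing an exception stands in for leaving via exit() or longjmp(). The division by zero is never reached, so the behaviour of the whole program stays defined:

```cpp
#include <cassert>
#include <stdexcept>

int a = 0;

// Hypothetical stand-in for bar(): it always leaves by throwing,
// much as a C function could leave via exit() or longjmp().
void bar() {
    throw std::runtime_error("bar() does not return normally");
}

void foo3(unsigned y, unsigned z) {
    bar();
    a = y % z;  // never reached here, so z == 0 is harmless
}

// Returns true if foo3(1, 0) left through bar()'s exception
// without ever evaluating 1 % 0.
bool run_foo3_with_zero_divisor() {
    try {
        foo3(1, 0);
    } catch (const std::runtime_error&) {
        assert(a == 0);  // the store after bar() never happened
        return true;
    }
    return false;
}
```

A compiler that moved the division above the call to bar() would turn this well-defined program into one that can trap, which is the reason I read the hoisting as a bug rather than as a legitimate use of undefined behaviour.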

Rules:

  1. Include a godbolt link or an alternative means of specifying the CPU architecture, OS, compiler version, standard library version and other technical details required to enable others to reproduce your results
  2. If your example does not use stdout or stderr despite the above request, please explain how I can observe the “externally observable” effect your example uses to demonstrate “time travel” without using a virtual machine, running your code on bare metal, attaching a debugger or using a disassembler.
  3. Flush the output buffers. I do not find missing output caused by data being stuck in a buffer when the program crashes convincing
  4. No hypothetical examples. I am not asking about what the C standard allows the compiler to do or what it could plausibly do. I would like to see a real example of a compiler actually doing this
  5. No compiler bugs
  6. Only C or C++

23

Here we go:

#include <iostream>

__attribute__((noinline))
void foo(int* data) {
  bool bsilly = false;
  if (data) bsilly=true;
  if (bsilly) {
    std::cerr << "This should not print\n";
  } else {
    *data = 7;
  }
}

int main() {
  foo(nullptr);
}

The output we get is:

This should not print

with no segfault. We passed in nullptr, which should lead to bsilly remaining false and UB from *nullptr = 7. Instead, the compiler generates code that prints to cerr and returns despite data being nullptr.

The use of __attribute__((noinline)) is to make it act like a function in a different translation unit without whole program optimization. A dynamic library should work similarly.

The “undefined behavior” in dereferencing a null pointer “caused” text to print, even though the dereference never runs. And if you comment out the *data=7; line, it no longer prints "This should not print\n" when passed nullptr.

I mean, technically this isn’t time travel, as *data=7 happens elsewhen in an alternative universe, not later.

In a “lie told to children”, my guess is that the compiler deduces as follows:

If data is null, then we run UB. Thus either data is nullptr (and we can do anything), or data is not nullptr. Conclusion: assume data is not nullptr, thus bsilly=true occurs, which means we print to std::cerr.
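A rough way to picture that deduction is to compare the source logic with what is left once the null case has been assumed away. This is a hand-written sketch, not actual compiler output, and `Path`, `foo_as_written`, `foo_as_folded` and `versions_agree` are names invented here. The two versions agree on every input whose behaviour is defined:

```cpp
#include <cassert>

enum class Path { Print, Store };

// The logic of foo() as written, reporting which branch was taken
// instead of printing.
Path foo_as_written(int* data) {
    bool bsilly = false;
    if (data) bsilly = true;
    if (bsilly) {
        return Path::Print;
    }
    *data = 7;  // only reachable with data == nullptr, i.e. UB
    return Path::Store;
}

// What remains after assuming "data is not nullptr".
Path foo_as_folded(int*) {
    return Path::Print;
}

// Compares both versions for a valid pointer.
bool versions_agree(int* data) {
    assert(data != nullptr);  // only defined inputs are compared
    return foo_as_written(data) == foo_as_folded(data);
}
```

For a null pointer the folded version still reports `Path::Print`, which matches the output observed above.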

The compiler used to demonstrate this was the clang-trunk x86/64 on godbolt; it is the development branch after 18.1 (leading to 19.0 possibly) based on the list of compilers.

I wrote it with the “elsewhen” technique to avoid the compiler not “knowing” what the IO code does; if IO code can run arbitrary operations, then reasoning backwards from code after the IO call to the state before it becomes impossible. So I made the “undefined behavior” branch not run the IO code and kept it simple enough to be reasoned about locally, thus causing “undefined behavior time travel”.

You can modify the code so that it segfaults after printing if you prefer. There are a number of ways, but you still need to pull off tricks, as a naked *data=7; won’t trigger the branch analysis needed for the time travel.

33

This wouldn’t be a discussion about time traveling without time machines:

#include <stdio.h>
#include <stdlib.h>

typedef int (*FluxCapacitor)();

static FluxCapacitor TimeMachine;

static int DeLorean() {
    int year = 1955;
    return printf("Marty in %d.\n", year);
}

void TemporalParadox() {
    TimeMachine = DeLorean; // This is ok, no one calls this function.
    abort();                // But let's be sure.
}

int main() {
    TimeMachine();
    return 0;
}

Compiling and running the code on:

  • Clang x86-64 trunk with -O3;
  • icc x86-64 10.0 with -O3;
  • icx x86-64 latest with -O3;
  • zig cc trunk with -O3;

all generate the same output:

Marty in 1955.

How? DeLorean() is never called, and TemporalParadox() is dead code.

Let’s examine the assembly code from one of the links above:

TemporalParadox:                        # @TemporalParadox
        push    rax
        call    abort@PLT
main:                                   # @main
        push    rax
        lea     rdi, [rip + .L.str]
        mov     esi, 1955
        xor     eax, eax
        call    printf@PLT
        xor     eax, eax
        pop     rcx
        ret
.L.str:
        .asciz  "Marty in %d.\n"

In other words, the body of main() got statically replaced by a function that is unreachable.

Again, how?

An optimizing compiler is allowed to shuffle things around, as long as it does not violate any defined behaviour. But the code above already has UB (this is a given in any answer to this question), so I chose an example where the optimizing compiler shuffles around the UB itself: in effect a line travels back to before the start of the program and executes ahead of the source line that contains the UB. Here is how.

The compiler sees there is UB somewhere, so anything can happen, including the compiler optimizing the UB away.

If there is to be no undefined behavior in the compiled program, then TimeMachine must be assumed to be initialized by the time it is called. The only thing that initializes TimeMachine is the line TimeMachine = DeLorean in TemporalParadox(), so in a well-formed program this line must have been executed beforehand.

As TimeMachine() is called at the beginning of the program, the effects of line TimeMachine = DeLorean must be in place before this line, therefore before the beginning of the program.

TimeMachine is not assigned anything else, and there is no direct call to TemporalParadox(), so the compiler can then assume that TimeMachine is initialized exactly once and, at the same time, that abort() is not called.

Because TimeMachine is only ever assigned DeLorean, DeLorean() can be inlined everywhere TimeMachine() is called. And by the magic of a nonexistent time machine, Marty got back to 1955.
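For contrast, here is a sketch of a variant with fully defined behaviour. It is my own hypothetical rewrite: `Travel` is an invented helper, `DeLorean` returns the year instead of printing it, and `abort()` is dropped so that `TemporalParadox()` can actually be called. Because the pointer is checked before the call, the compiler can no longer assume that TimeMachine was already assigned:

```cpp
#include <cassert>

typedef int (*FluxCapacitor)();

static FluxCapacitor TimeMachine;  // static storage: starts out null

static int DeLorean() {
    return 1955;
}

void TemporalParadox() {
    TimeMachine = DeLorean;
}

// Calls through the pointer only when it has been set.
int Travel() {
    if (TimeMachine == nullptr) {
        return -1;  // defined result for the "never initialised" case
    }
    int year = TimeMachine();
    assert(year == 1955);  // DeLorean is the only possible target
    return year;
}
```

Calling `Travel()` before `TemporalParadox()` yields -1, and calling it afterwards yields 1955. It is the unconditional call through a null pointer in the original that gives the optimizer room to skip the "not yet assigned" case.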

This is obviously an adaptation of Mature Pessimizations’ “Undefined behavior can literally erase your hard disk”, and there are some points of note:

  • The impact of the undefined behaviour is not limited to code which runs after the line with undefined behaviour: there is no code running after the UB line code;

  • To generate the result, the compiler is effectively running dead code, before the explicit UB line code;

  • Further, the static inline is only possible if the initialization occurs before any program line even runs, so in a way, the UB line is generating effects before any program line, UB or not UB.

TL;DR: The mere presence of UB can time travel dead code into existence and generate impossible output on stdout with several recent compilers.

3

This might be too obvious, or not quite within the rules:

#include <stdio.h>

static void foo(void);

int main(void) {
    printf("Everything ok so far\n");
    fflush(stdout);
    foo();
}

static void foo(void) {
    volatile char huge[999999999] = {0};
}

https://godbolt.org/z/arTW5KxEn

You might think that nothing in the function foo could possibly prevent the message from being printed. But clang inlines it, and then tries to do all stack allocation in the prologue of main, which overflows the stack and crashes the program before the message is printed.

This might not technically count as UB. I had a recollection that the standard had something like “if implementation-defined limits are exceeded then the behavior is undefined” but I don’t actually see it anywhere.

1

If an implementation allows distinguishing between the traps associated with different actions over which the Standard waives jurisdiction, operations that are expected to be performed without side effects may be reordered across each other, causing the “wrong” kind of trap to fire.

int  *volatile v;
void test(int x)
{
    for (int i=0; i<1000; i++)
    {
        *v = 1;
        int q = 1000/x;
        *v = q;
    }
}
void (*volatile vtest)(int)=test;
int main()
{
    vtest(0);

}

This is demonstrable in clang and gcc with godbolt; changing the divide expression to something that wasn’t loop-invariant like i/x would result in the first access to *v being performed before the division (causing a SIGSEGV), but with the code as written above, the division is performed first (yielding a SIGFPE).
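The reordering can be pictured with the following hand-written sketch. It is not compiler output. `test_as_written`, `test_as_hoisted` and `target` are names made up for illustration, and `v` points at a real int here so that the code can be run. For any non-zero x both versions leave the same value behind, which is what lets a compiler hoist the loop-invariant division above the first store:

```cpp
#include <cassert>

int target = 0;
int* volatile v = &target;  // assumption: a valid target, unlike the original

void test_as_written(int x) {
    assert(x != 0);  // keep the demonstration within defined behaviour
    for (int i = 0; i < 1000; i++) {
        *v = 1;
        int q = 1000 / x;
        *v = q;
    }
}

void test_as_hoisted(int x) {
    assert(x != 0);
    int q = 1000 / x;  // loop-invariant, so computed once up front
    for (int i = 0; i < 1000; i++) {
        *v = 1;
        *v = q;
    }
}
```

With a null `v` and x == 0, as in the original, the first version would reach the faulting store first, while the second would reach the faulting division first.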

11

I am asking for an example that would demonstrate a difference between the real undefined behaviour “all bets are off as soon as the program gets an input for which the behaviour of the program is undefined” and a fictional version of undefined behaviour “the program executes normally until it hits the line that causes undefined behaviour and after that all bets are off”.

This phrasing is somewhat easier than the initial form of the question. Consider this:

#include <utility>  // std::unreachable (C++23)

int value;
static void setvalue() { value = 1; }
[[gnu::noinline]] static void ub() {
    setvalue();
    std::unreachable();
}
int main() {
    ub();
    return value;
}

UB is triggered by execution passing through std::unreachable(). In a fictional version of UB where execution prior to that is unchanged, we would have value == 1. The difference is observed through the exit code afterwards, but the execution and logic that change come before it. godbolt

And this example is illuminating for the general case. The cases of UB that are caused by evaluating some expression (division by 0, a call to std::unreachable, dereferencing an invalid pointer, etc.) can be assumed not to happen. So you can imagine each of these evaluations being replaced with a call to std::unreachable(). That is a correct transformation.

Then you can imagine any code that unconditionally leads to that execution is dead code and can be removed.
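As a sketch of that mental model, the snippet below writes the same division twice, once plainly and once with the bad case spelled out. It uses `__builtin_unreachable()`, a GCC and Clang builtin, in place of std::unreachable() so that it does not need C++23. The function names are made up for illustration:

```cpp
#include <cassert>

// A division as one would normally write it.
int scale_as_written(int x) {
    return 1000 / x;
}

// The same function with the "bad" evaluation spelled out.
int scale_as_modelled(int x) {
    if (x == 0) {
        __builtin_unreachable();  // stands for the UB of dividing by zero
    }
    return 1000 / x;
}

// Compares the two versions on an input with defined behaviour.
bool models_agree(int x) {
    assert(x != 0);
    return scale_as_written(x) == scale_as_modelled(x);
}
```

On every input with defined behaviour the two functions are indistinguishable, so the compiler may treat the first as if it were the second.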

And that is why some of the examples of time travel are unsatisfying. Opaque function calls to functions in some other library could never return. In such a case, the compiler cannot prove that the bad evaluation will definitely happen. This can include standard library I/O and syscalls. (This is less true with LTO, which can see across library boundaries.)

So… it can be difficult to cause a different observation prior to the execution of the bad expression, but the actual logic executed prior to that can easily and obviously change and that can be observed in some less common ways before the execution (e.g. shared memory, smarter compilers, LTO) and in any way after (notably without the logic after being altered or incorrect).

4
