I’m trying to understand the crash below and cannot work out whether the thread_local has anything to do with it.
Here is how the crash occurs, how I worked around it, and my incomplete understanding, as I did not expect it to crash in the first place (with a proton::container stack object).
I start a thread (let’s call it ProtonThread, inheriting from thread and proton::messaging_handler), and in that thread’s main method I call proton::container::run(*this); in a scope.
This triggers https://github.com/apache/qpid-proton/blob/88850e2da63e37749536d64daf23c659c896fd2a/cpp/src/uuid.cpp#L47 to generate the id, which means a thread_local is created in this thread.
Once proton::container::run() finishes, the scope ends and the stack object is destroyed (at the “;”, but I also tried with a named variable).
The “engine” should not be affected by this and should be destroyed at thread exit. I join the thread and see a clean exit, so it looks fine to me.
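To make that expectation concrete, here is a minimal sketch using only the standard library, with no Proton involved. Engine, use_engine, run_once and run_twice_demo are hypothetical names made up for illustration; Engine only stands in for the thread_local engine in uuid.cpp and does not reproduce its code. It shows what I expect from thread_local in general: one instance per thread, constructed on first use and destroyed when that thread exits.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical stand-in for the thread_local engine; not Proton code.
std::atomic<int> constructed{0};
std::atomic<int> destroyed{0};

struct Engine {
    Engine() { ++constructed; }
    ~Engine() { ++destroyed; }
    int next() { return ++state; }
    int state = 0;
};

// The engine is constructed on first use in each thread and
// destroyed when that thread exits.
int use_engine() {
    thread_local Engine engine;
    return engine.next();
}

// Start a fresh thread, use the engine once, and join.
int run_once() {
    int result = 0;
    std::thread worker([&result] { result = use_engine(); });
    worker.join();
    return result;
}

// Run the thread twice, as in the scenario described above.
void run_twice_demo() {
    assert(run_once() == 1);  // first thread gets its own engine
    assert(run_once() == 1);  // second thread gets a fresh engine
    assert(constructed == 2);
    assert(destroyed == 2);   // both engines were destroyed by the time join() returned
}
```

In this sketch the second run behaves exactly like the first, which is what I expected from the Proton case as well.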
Then I start the thread again. This time it crashes when I try to run the container.
One workaround I found is to not use a stack object in the thread main function but a member of that class (ProtonThread). This way the proton::container member is created by the thread that constructs ProtonThread, which makes the “engine” a thread_local on that thread.
This guarantees that the “engine” stays alive, as it is thread_local to another thread, not to ProtonThread.
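The difference between the two variants can be sketched as follows, again with the standard library only. FakeContainer, StackVariant, MemberVariant and compare_variants are hypothetical names for illustration; FakeContainer merely records which thread constructed it and is not proton::container.

```cpp
#include <cassert>
#include <thread>

// Hypothetical stand-in for proton::container: it only remembers
// which thread ran its constructor.
struct FakeContainer {
    std::thread::id constructed_on = std::this_thread::get_id();
};

// Crashing pattern: the container is a stack object in the thread function.
struct StackVariant {
    std::thread::id container_thread;
    void run() {
        std::thread worker([this] {
            FakeContainer c;  // constructed on the worker thread
            container_thread = c.constructed_on;
        });
        worker.join();
    }
};

// Workaround: the container is a member, so it is constructed by
// whichever thread creates the MemberVariant object.
struct MemberVariant {
    FakeContainer container;
    std::thread::id worker_thread;
    void run() {
        std::thread worker([this] { worker_thread = std::this_thread::get_id(); });
        worker.join();
    }
};

void compare_variants() {
    const std::thread::id creator = std::this_thread::get_id();

    StackVariant s;
    s.run();
    assert(s.container_thread != creator);  // built on the worker thread

    MemberVariant m;
    m.run();
    assert(m.container.constructed_on == creator);  // built on the creating thread
    assert(m.worker_thread != creator);
}
```

So with the member variant, anything the constructor creates as a thread_local belongs to the creating thread rather than to the worker thread.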
My expectation was that the thread_local engine should work fine even when it is created on ProtonThread.
I would appreciate any hint on why this might not be the case.
Program terminated with signal SIGABRT, Aborted.
#0 0x00007f13ede8b94c in __pthread_kill_implementation () from /lib64/libc.so.6
[Current thread is 1 (Thread 0x7f13e6ffb640 (LWP 792836))]
Missing separate debuginfos, use: dnf debuginfo-install cyrus-sasl-gssapi-2.1.27-21.el9.x86_64 cyrus-sasl-lib-2.1.27-21.el9.x86_64 cyrus-sasl-plain-2.1.27-21.el9.x86_64 gdbm-libs-1.19-4.el9.x86_64 glibc-2.34-100.el9_4.4.x86_64 jsoncpp-1.9.5-1.el9.x86_64 keyutils-libs-1.6.3-1.el9.x86_64 krb5-libs-1.21.1-2.el9_4.x86_64 libcom_err-1.46.5-5.el9.x86_64 libgcc-11.4.1-3.el9.x86_64 libnsl2-2.0.0-1.el9.x86_64 libselinux-3.6-1.el9.x86_64 libstdc++-11.4.1-3.el9.x86_64 libtirpc-1.3.3-8.el9_4.x86_64 libxcrypt-4.4.18-3.el9.x86_64 pcre2-10.40-5.el9.x86_64 qpid-proton-c-0.37.0-2.el9.x86_64 qpid-proton-cpp-0.37.0-2.el9.x86_64 zlib-1.2.11-40.el9.x86_64
(gdb) where
#0 0x00007f13ede8b94c in __pthread_kill_implementation () from /lib64/libc.so.6
#1 0x00007f13ede3e646 in raise () from /lib64/libc.so.6
#2 0x00007f13ede287f3 in abort () from /lib64/libc.so.6
#3 0x00007f13ede29130 in __libc_message.cold () from /lib64/libc.so.6
#4 0x00007f13ede959f7 in malloc_printerr () from /lib64/libc.so.6
#5 0x00007f13ede9671c in malloc_consolidate () from /lib64/libc.so.6
#6 0x00007f13ede982f8 in _int_malloc () from /lib64/libc.so.6
#7 0x00007f13ede99809 in malloc () from /lib64/libc.so.6
#8 0x00007f13ee73a75e in malloc (size=<optimized out>) at ../include/rtld-malloc.h:56
#9 allocate_dtv_entry (size=<optimized out>, alignment=8) at ../elf/dl-tls.c:730
#10 allocate_and_init (map=0x1f575e0) at ../elf/dl-tls.c:759
#11 tls_get_addr_tail (ti=0x7f13ed3fde58, dtv=0x7f13e00018f0, the_map=0x1f575e0) at ../elf/dl-tls.c:970
#12 0x00007f13ee73e76c in __tls_get_addr () at ../sysdeps/x86_64/tls_get_addr.S:55
#13 0x00007f13ed3dd187 in proton::uuid::random() () from /lib64/libqpid-proton-cpp.so.12
#14 0x00007f13ed3c6bfe in proton::container::container(proton::messaging_handler&) () from /lib64/libqpid-proton-cpp.so.12