Geant4 Version: 10.07.p04
Operating System: RHEL8
Compiler/Version: 12.2.0
CMake Version: 3.24.3
We’re seeing a rare segfault in some jobs, and I can’t seem to get any useful information about it. We build our executable against ROOT (6.28), which gives us a traceback for free. Each worker thread generates its own
*** Break *** segmentation violation
message, and the tracebacks from all the worker threads are:
#4 <signal handler called>
#5 0x000015535f9b94fd in __call_tls_dtors () from /lib64/libc.so.6
#6 0x000015535fd461d8 in start_thread () from /lib64/libpthread.so.0
#7 0x000015535f9a1953 in clone () from /lib64/libc.so.6
(frames 0-3 are ROOT’s traceback generator). I presume “tls_dtors” refers to the destructors of thread-local singletons, but the traceback doesn’t show any frames for the Geant4 singletons, nor for any of the ones we have in our simulation framework.
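To make concrete what I mean by thread-local destructors, here is a minimal sketch, independent of Geant4 and ROOT. `WorkerCache`, `cache()` and `runWorkers()` are hypothetical names invented for illustration. Each worker touches a `thread_local` object with a non-trivial destructor, and that destructor runs while the thread is exiting, which on glibc is the phase where `__call_tls_dtors` appears.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for a thread-local singleton (not a Geant4 class).
struct WorkerCache {
    static std::atomic<int> destroyed;   // counts destructor calls across threads
    ~WorkerCache() { ++destroyed; }      // invoked during thread exit
    void touch() {}                      // forces construction on first use
};
std::atomic<int> WorkerCache::destroyed{0};

// Each thread gets its own instance the first time it calls cache().
WorkerCache& cache() {
    thread_local WorkerCache instance;
    return instance;
}

// Start nThreads workers, let each touch its thread-local instance,
// join them all, and return how many destructors have run.
int runWorkers(int nThreads) {
    WorkerCache::destroyed = 0;
    std::vector<std::thread> workers;
    for (int i = 0; i < nThreads; ++i) {
        workers.emplace_back([] { cache().touch(); });
    }
    for (auto& w : workers) {
        w.join();   // thread-local destructors have completed once join() returns
    }
    return WorkerCache::destroyed.load();
}
```

In this toy case a breakpoint in `~WorkerCache()` shows the destructor frame above the thread-exit frames, which is the kind of frame I would have expected to see in our tracebacks.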
I am not able to reproduce this error – it seems to happen on just a few jobs when we use unique random seeding (via UUID), but I can’t get it to happen when I pick my own seeds (/random/setSeeds with number strings).
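One thing that might help with reproduction is to record the numeric seeds actually used in each job, so a failing job can be replayed by hand. The sketch below is only an illustration: `seedsFromUuid()` and `setSeedsCommand()` are hypothetical helpers, and the FNV-1a hash is an assumed way of turning a UUID string into two seeds, not necessarily what our framework does.

```cpp
#include <cstdint>
#include <string>
#include <utility>

// Hypothetical helper: derive two non-negative seeds from a UUID string
// using a 64-bit FNV-1a hash (an assumed scheme, chosen because it is
// deterministic across platforms, unlike std::hash).
std::pair<long, long> seedsFromUuid(const std::string& uuid) {
    std::uint64_t h = 14695981039346656037ULL;   // FNV-1a offset basis
    for (unsigned char c : uuid) {
        h ^= c;
        h *= 1099511628211ULL;                   // FNV-1a prime
    }
    // Split into two halves and mask to 31 bits so both seeds are non-negative.
    long s1 = static_cast<long>((h >> 32) & 0x7fffffffULL);
    long s2 = static_cast<long>(h & 0x7fffffffULL);
    return {s1, s2};
}

// Hypothetical helper: format the macro line that would replay these seeds.
std::string setSeedsCommand(long s1, long s2) {
    return "/random/setSeeds " + std::to_string(s1) + " " + std::to_string(s2);
}
```

Writing the resulting command string to the job log at startup would let me rerun exactly the seeds of a job that crashed.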
Has anyone else seen this failure? If so, were you able to track it down to something specific?