Geant4 Version: 10.07.p04, 11.4.1
Operating System: RHEL8
Compiler/Version: GCC 12.2.0
CMake Version: 3.24.3
We have a custom G4CMPParticleChangeForPhonon class, which follows the model in G4ParticleChangeForTransport. The class has a G4TouchableHandle object as a data member. During tracking, this Handle is initialized to the touchable associated with the track, may be re-filled via a Propose() function, and is then reset to empty (= 0) at the end of UpdateStepForPostStep(). This works properly for tracking, with no memory leaks or other issues.
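For readers who want the Handle lifecycle spelled out, here is a minimal stand-alone sketch of what is described above. It is not our actual code and does not use the Geant4 headers: std::shared_ptr stands in for G4TouchableHandle, and the names Touchable, PhononChangeSketch, Initialize and ProposeTouchable are hypothetical.

```cpp
#include <cassert>
#include <memory>

// Stand-in for G4VTouchable (hypothetical; not the Geant4 class).
struct Touchable { int copyNo = 0; };

// Stand-in for G4TouchableHandle: any reference-counted handle will do here.
using TouchableHandle = std::shared_ptr<Touchable>;

// Hypothetical sketch of a particle-change class that owns a handle.
class PhononChangeSketch {
public:
  // Start of step: take the touchable currently associated with the track.
  void Initialize(const TouchableHandle& trackTouchable) {
    fTouchable = trackTouchable;
  }

  // A process may propose a different touchable during the step.
  void ProposeTouchable(const TouchableHandle& newTouchable) {
    fTouchable = newTouchable;
  }

  // End of step: pass the touchable on, then drop our own reference.
  TouchableHandle UpdateStepForPostStep() {
    TouchableHandle result = fTouchable;
    fTouchable.reset();  // analogous to setting the Geant4 handle "= 0"
    return result;
  }

  bool HoldsTouchable() const { return static_cast<bool>(fTouchable); }

private:
  TouchableHandle fTouchable;  // expected to be empty between steps
};
```

In this sketch the object holds no reference between steps, so its destructor has nothing to release. That is the state we expect our own class to be in when it is destroyed.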
However, at the end of the job, after all the worker threads have already been deleted, we get an occasional, hard-to-reproduce segmentation fault that points at the Handle’s Release() function. Running with -fsanitize=thread gives a consistent report about accessing data after it has been deleted:
==================
WARNING: ThreadSanitizer: heap-use-after-free (pid=195009)
Read of size 4 at 0x7b61b888a460 by main thread (mutexes: write M18605):
#0 G4CountedObject<G4VTouchable>::Release() /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.7.4-foss-2022b-debug/include/Geant4/G4ReferenceCountedHandle.hh:176 (libG4cmp.so+0xcd10e)
#1 G4ReferenceCountedHandle<G4VTouchable>::~G4ReferenceCountedHandle() /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.7.4-foss-2022b-debug/include/Geant4/G4ReferenceCountedHandle.hh:215 (libG4cmp.so+0xcd232)
#2 G4CMPParticleChangeForPhonon::~G4CMPParticleChangeForPhonon() include/G4CMPParticleChangeForPhonon.hh:28 (libG4cmp.so+0xcd232)
#3 G4CMPPhononBoundaryProcess::~G4CMPPhononBoundaryProcess() src/G4CMPPhononBoundaryProcess.cc:101 (libG4cmp.so+0xd488c)
#4 G4CMPPhononBoundaryProcess::~G4CMPPhononBoundaryProcess() src/G4CMPPhononBoundaryProcess.cc:101 (libG4cmp.so+0xd4948)
#5 G4ProcessTable::~G4ProcessTable() <null> (libG4processes.so+0x100c8c2)
Previous write of size 8 at 0x7b61b888a460 by thread T2:
[failed to restore the stack]
Mutex M18605 (0x7fe2f3914a68) created at:
#0 pthread_mutex_lock ../../../../libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:4324 (libtsan.so.2+0x5a471)
#1 G4ThreadLocalSingleton<G4ProcessTable>::Register(G4ProcessTable*) const <null> (libG4processes.so+0x100f7c4)
#2 SuperSim_Main::SuperSim_Main() /scratch/user/kelsey/software/supersim/CDMSapps/SuperSim_Main.cc:64 (libCDMSapps.so+0x8a72)
#3 main /scratch/user/kelsey/software/supersim/CDMSapps/CDMS_G4DMC.cc:28 (CDMS_G4DMC+0x416a76)
Thread T2 (tid=195057, finished) created by main thread at:
#0 pthread_create ../../../../libsanitizer/tsan/tsan_interceptors_posix.cpp:1001 (libtsan.so.2+0x62b86)
#1 std::thread::_M_start_thread(std::unique_ptr<std::thread::_State, std::default_delete<std::thread::_State> >, void (*)()) /tmp/baum/easybuild/GCCcore/12.2.0/system-system/gcc-12.2.0/stage3_obj/x86_64-pc-linux-gnu/libstdc++-v3/include/x86_64-pc-linux-gnu/bits/gthr-default.h:663 (libstdc++.so.6+0xe0ddb)
#2 G4RunManager::BeamOn(int, char const*, int) <null> (libG4run.so+0x483a3)
#3 SuperSim_Main::Run(int, char**) /scratch/user/kelsey/software/supersim/CDMSapps/SuperSim_Main.cc:134 (libCDMSapps.so+0x8993)
#4 main /scratch/user/kelsey/software/supersim/CDMSapps/CDMS_G4DMC.cc:39 (CDMS_G4DMC+0x416b29)
SUMMARY: ThreadSanitizer: heap-use-after-free /scratch/group/mitchcomp/eb/x86_64/sw/Geant4/10.7.4-foss-2022b-debug/include/Geant4/G4ReferenceCountedHandle.hh:176 in G4CountedObject<G4VTouchable>::Release()
==================
The main thing to notice is that the WARNING comes from the main thread, not from one of the worker threads. As I understand things, every worker thread gets its own process instances, so they don’t collide, and the processes created on the main (master) thread are never actually invoked. So the instance of G4CMPPhononBoundaryProcess on the master thread should still be in its initial state.
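To make that expectation concrete, here is a minimal stand-alone sketch of the ownership model I have in mind. It assumes nothing about how Geant4 actually builds or clones processes on worker threads. ProcessSketch and RunWorkersAndCountMasterSteps are hypothetical names used only for illustration.

```cpp
#include <thread>
#include <vector>

// Hypothetical stand-in for a process; it only counts its invocations.
struct ProcessSketch {
  int stepsHandled = 0;
  void PostStepDoIt() { ++stepsHandled; }
};

// Builds a "master" instance on the calling thread, lets every worker
// create and invoke its own instance, and returns how many times the
// master instance was invoked.
int RunWorkersAndCountMasterSteps(int nWorkers, int stepsPerWorker) {
  ProcessSketch master;  // created on the main thread, never invoked

  std::vector<std::thread> workers;
  for (int i = 0; i < nWorkers; ++i) {
    workers.emplace_back([stepsPerWorker] {
      ProcessSketch local;  // worker-owned instance, destroyed with the worker
      for (int s = 0; s < stepsPerWorker; ++s) local.PostStepDoIt();
    });
  }
  for (auto& w : workers) w.join();

  return master.stepsHandled;  // untouched if nothing is shared
}
```

Under this model, no worker can write into anything the master instance owns. That is why a write from T2 into the master’s Handle is surprising.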
Next, notice that the “previous write” is reported to have come from thread T2, by way of a mutex. Since T2 has already been deleted, the traceback for that write is gone. But why would T2 have been writing into a TouchableHandle on the master thread? Shouldn’t each worker thread have its own local objects?
The mutex itself is reported to have come from a thread-local singleton of G4ProcessTable, presumably by thread T2? But why should a G4ThreadLocalSingleton need a mutex to write into itself or a thread-local object it owns?
Other than G4ParticleChangeForTransport, which is what we followed when writing our own PC, are there any other examples we can look at to understand what we’re doing wrong, and what we should be doing differently?