Opticals get stuck forever and cause program to hang

I’m working on a scintillator detector model and I’ve been having an intermittent fault where the executable hangs near the end of a run. I’ve pinpointed why it hangs but I’m not sure what’s leading to the situation.

Let me preface this by noting that I have run /geometry/test/run with /geometry/test/resolution set to 1000000 test points per volume and /geometry/test/tolerance set to 0 mm, and it reports no overlaps.

I’ve been examining the backtraces of each thread in lldb and identified that it’s usually just one thread that intermittently gets stuck propagating an optical forever. I only launch 100 primaries (opticals), and after a few successful runs the program hangs. This happens in batch and interactive modes.

Here are a few backtraces of the problem thread, from oldest to newest (all one run):

Backtrace 1:
* thread #6
    frame #0: 0x000000010493c9c9 libG4processes.dylib`G4TouchableHistory::operator new(unsigned long) + 105
    frame #1: 0x0000000105457e2c libG4processes.dylib`G4Navigator::LocateGlobalPointAndUpdateTouchableHandle(CLHEP::Hep3Vector const&, CLHEP::Hep3Vector const&, G4ReferenceCountedHandle<G4VTouchable>&, bool) + 60
    frame #2: 0x0000000105457b0a libG4processes.dylib`G4Transportation::PostStepDoIt(G4Track const&, G4Step const&) + 74
    frame #3: 0x00000001010dd5c4 libG4tracking.dylib`G4SteppingManager::InvokePSDIP(unsigned long) + 68
    frame #4: 0x00000001010dd4ab libG4tracking.dylib`G4SteppingManager::InvokePostStepDoItProcs() + 139
    frame #5: 0x00000001010da97c libG4tracking.dylib`G4SteppingManager::Stepping() + 556
    frame #6: 0x00000001010ee8d0 libG4tracking.dylib`G4TrackingManager::ProcessOneTrack(G4Track*) + 416
    frame #7: 0x000000010104a6cd libG4event.dylib`G4EventManager::DoProcessing(G4Event*) + 2173
    frame #8: 0x0000000100fcbdee libG4run.dylib`G4WorkerRunManager::ProcessOneEvent(int) + 46
    frame #9: 0x0000000100fcbd7e libG4run.dylib`G4WorkerRunManager::DoEventLoop(int, char const*, int) + 238
    frame #10: 0x0000000100fc1502 libG4run.dylib`G4RunManager::BeamOn(int, char const*, int) + 98
    frame #11: 0x0000000100fcf3f2 libG4run.dylib`G4WorkerRunManager::DoWork() + 946
    frame #12: 0x0000000100fe743f libG4run.dylib`G4MTRunManagerKernel::StartThread(G4WorkerThread*) + 559
    frame #13: 0x0000000100fee58c libG4run.dylib`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(G4WorkerThread*), G4WorkerThread*> >(void*) + 44
    frame #14: 0x00007ff800c8f4e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #15: 0x00007ff800c8af6b libsystem_pthread.dylib`thread_start + 15
Backtrace 2:
* thread #6
  * frame #0: 0x0000000100a987e1 libG4track.dylib`G4Track::CalculateVelocityForOpticalPhoton() const + 545
    frame #1: 0x0000000100a90d24 libG4track.dylib`G4ParticleChange::Initialize(G4Track const&) + 164
    frame #2: 0x0000000105428f70 libG4processes.dylib`G4OpBoundaryProcess::PostStepDoIt(G4Track const&, G4Step const&) + 48
    frame #3: 0x00000001010dd5c4 libG4tracking.dylib`G4SteppingManager::InvokePSDIP(unsigned long) + 68
    frame #4: 0x00000001010dd4ab libG4tracking.dylib`G4SteppingManager::InvokePostStepDoItProcs() + 139
    frame #5: 0x00000001010da97c libG4tracking.dylib`G4SteppingManager::Stepping() + 556
    frame #6: 0x00000001010ee8d0 libG4tracking.dylib`G4TrackingManager::ProcessOneTrack(G4Track*) + 416
    frame #7: 0x000000010104a6cd libG4event.dylib`G4EventManager::DoProcessing(G4Event*) + 2173
    frame #8: 0x0000000100fcbdee libG4run.dylib`G4WorkerRunManager::ProcessOneEvent(int) + 46
    frame #9: 0x0000000100fcbd7e libG4run.dylib`G4WorkerRunManager::DoEventLoop(int, char const*, int) + 238
    frame #10: 0x0000000100fc1502 libG4run.dylib`G4RunManager::BeamOn(int, char const*, int) + 98
    frame #11: 0x0000000100fcf3f2 libG4run.dylib`G4WorkerRunManager::DoWork() + 946
    frame #12: 0x0000000100fe743f libG4run.dylib`G4MTRunManagerKernel::StartThread(G4WorkerThread*) + 559
    frame #13: 0x0000000100fee58c libG4run.dylib`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(G4WorkerThread*), G4WorkerThread*> >(void*) + 44
    frame #14: 0x00007ff800c8f4e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #15: 0x00007ff800c8af6b libsystem_pthread.dylib`thread_start + 15
Backtrace 3:
* thread #6
  * frame #0: 0x00000001049164e6 libG4processes.dylib`G4VProcess::SubtractNumberOfInteractionLengthLeft(double) + 38
    frame #1: 0x00000001054254e1 libG4processes.dylib`G4VDiscreteProcess::PostStepGetPhysicalInteractionLength(G4Track const&, double, G4ForceCondition*) + 65
    frame #2: 0x00000001010dbe4e libG4tracking.dylib`G4SteppingManager::DefinePhysicalStepLength() + 238
    frame #3: 0x00000001010da8c5 libG4tracking.dylib`G4SteppingManager::Stepping() + 373
    frame #4: 0x00000001010ee8d0 libG4tracking.dylib`G4TrackingManager::ProcessOneTrack(G4Track*) + 416
    frame #5: 0x000000010104a6cd libG4event.dylib`G4EventManager::DoProcessing(G4Event*) + 2173
    frame #6: 0x0000000100fcbdee libG4run.dylib`G4WorkerRunManager::ProcessOneEvent(int) + 46
    frame #7: 0x0000000100fcbd7e libG4run.dylib`G4WorkerRunManager::DoEventLoop(int, char const*, int) + 238
    frame #8: 0x0000000100fc1502 libG4run.dylib`G4RunManager::BeamOn(int, char const*, int) + 98
    frame #9: 0x0000000100fcf3f2 libG4run.dylib`G4WorkerRunManager::DoWork() + 946
    frame #10: 0x0000000100fe743f libG4run.dylib`G4MTRunManagerKernel::StartThread(G4WorkerThread*) + 559
    frame #11: 0x0000000100fee58c libG4run.dylib`void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (*)(G4WorkerThread*), G4WorkerThread*> >(void*) + 44
    frame #12: 0x00007ff800c8f4e1 libsystem_pthread.dylib`_pthread_start + 125
    frame #13: 0x00007ff800c8af6b libsystem_pthread.dylib`thread_start + 15

Clearly the optical is still propagating but it’s gotten stuck. I have had similar issues in the past when opticals are launched from within a material with a G4OpticalSkinSurface, but that’s not what’s happening here (they’re launched inside of a crystalline material with a few optical boundaries but no skin surface).

Any ideas as to what is happening? Maybe I could just kill the optical if its track length gets too long, but I’m not sure how to propagate cuts like that to all of the volumes in the detector geometry. That also seems like a stopgap for a larger problem going on here.

Thanks in advance!
William

After taking inspiration from this post about a multi-threaded application hanging, I realized that I had a std::map within my SteppingAction class to convert between optical boundary status and a string (for debugging). That map wasn’t marked as const, and I was accessing it like this: map[key]. The operator[] is not a const member function, because it inserts a default-constructed value when the key is missing, and it is not safe to call from several threads at once on the same map. This led to a race condition and likely got a photon stuck within a volume that has a skin surface, so it ended up propagating forever.

I fixed it by changing the std::map to a static const std::map and accessing it via map.at() instead of map[].
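
For anyone who finds this later, here is a minimal sketch of the pattern I mean, not my actual SteppingAction code. The enum BoundaryStatus is a hypothetical stand-in for G4OpBoundaryProcessStatus (the real enum has many more values), and kStatusNames and StatusName are names made up for the example.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for G4OpBoundaryProcessStatus (assumption for this sketch).
enum class BoundaryStatus { Undefined, FresnelRefraction, TotalInternalReflection };

// Read-only lookup table shared by all threads. Because the map is const,
// kStatusNames[key] does not compile, so nothing can insert into it by accident.
static const std::map<BoundaryStatus, std::string> kStatusNames = {
    {BoundaryStatus::Undefined, "Undefined"},
    {BoundaryStatus::FresnelRefraction, "FresnelRefraction"},
    {BoundaryStatus::TotalInternalReflection, "TotalInternalReflection"},
};

// at() is a const member function: it never modifies the map and throws
// std::out_of_range when the key is missing, which is mapped to "Unknown" here.
std::string StatusName(BoundaryStatus status) {
    try {
        return kStatusNames.at(status);
    } catch (const std::out_of_range&) {
        return "Unknown";
    }
}
```

Calling StatusName(BoundaryStatus::FresnelRefraction) from the stepping action returns "FresnelRefraction". As far as I understand it, concurrent lookups like this are fine because every access is a read; the trouble with map[key] is that it can write to the map.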

Cheers!
