Multithreading streaming vs batch execution model

austinschneider · September 11, 2024, 4:59pm

Geant4 Version: 11.2.2
Operating System: SUSE Linux Enterprise Server 15 SP4
Compiler/Version: g++ (GCC) 12.1.0 20220506 (HPE)
CMake Version: 3.25.1

I am using Geant4 in the context of a larger processing framework where the Geant4 module is just one of many processing stages and events are continuously streamed from one processing module to the next. This means that events are being generated external to Geant4, but need to be passed to Geant4 for simulation before the results are put back into the processing stream.

For single-threaded usage this works fine with Geant4’s processing model, because we can start a G4Run “manually” without calling BeamOn and then, while streaming events, call SetNumberOfEventToBeProcessed and ProcessOneEvent to update the run and process new events.

However, in multi-threaded usage it seems a little harder to get away from processing events in a batch of predetermined size (corresponding to a G4Run). As far as I can tell, trying to implement “streaming” processing for multi-threaded running might involve some changes to the worker run managers and/or the classes that instantiate those worker run managers.

I am mainly looking for some advice on how best to approach doing some kind of streaming processing in the multi-threading mode before I naively dive into it.

For context, in multi-threaded processing I currently store a pointer to a “ParticleList” class within the PrimaryGeneratorAction that is queried for the next event each time a thread attempts to call the PrimaryGeneratorAction. The “ParticleList” class has only one instance shared across the threads and uses a standard library mutex and lock_guard to handle access. With this implementation we accumulate a batch of events from the processing stream and then hand that batch to G4 to process as a single run. However, batching events like this seems to have non-negligible overhead in starting/stopping the run, and increasing the batch size to encompass the full list of events is both a little clunky and could use too much memory if the generated events are not lightweight.
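To make the shared-list arrangement concrete, here is a minimal sketch of a mutex-guarded event list of the kind described above. It is plain standard C++ and uses no Geant4 API; the `PrimaryEvent` struct and the `Push`/`Next`/`Size` member names are hypothetical, chosen only for illustration, and the real “ParticleList” may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical stand-in for one externally generated event.
struct PrimaryEvent {
    int id = 0;
    std::vector<double> payload;  // placeholder for particle data
};

// One instance is shared by all worker threads; every access takes the mutex.
class ParticleList {
public:
    // Called by the streaming framework to add an event to the batch.
    void Push(PrimaryEvent ev) {
        std::lock_guard<std::mutex> lock(mutex_);
        events_.push_back(std::move(ev));
    }

    // Called from each thread's primary generator.
    // Returns false when the batch is exhausted; 'out' is then left untouched.
    bool Next(PrimaryEvent& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (events_.empty()) return false;
        out = std::move(events_.front());
        events_.pop_front();
        return true;
    }

    // Number of events still waiting in the batch.
    std::size_t Size() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return events_.size();
    }

private:
    mutable std::mutex mutex_;
    std::deque<PrimaryEvent> events_;
};
```

With this shape, the batch size handed to the run would presumably be taken from `Size()` before the run starts, which is exactly where the fixed-size-run limitation comes from.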

A simple solution (keeping the current implementation) would be a clean, thread-safe way to increase the number of events in a run on the fly, combined with a clean, thread-safe way to either (a) prevent the workers from shutting down while they wait for the next event or for the run’s events_to_be_processed to be increased, or (b) shut down the workers even if they are expecting another event.
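Outside of Geant4, the “wait for the next event, or shut down cleanly” behaviour is usually built from a condition variable plus a closed flag. The sketch below shows only that generic pattern in standard C++; `StreamQueue` and its `Push`/`Pop`/`Close` members are hypothetical names, and nothing here addresses how the Geant4 worker run managers decide that a run is finished, which is the part that would still need to change.

```cpp
#include <cassert>
#include <condition_variable>
#include <deque>
#include <mutex>
#include <utility>

// Generic blocking queue: consumers sleep until an item arrives or the
// producer declares the stream finished.
template <typename T>
class StreamQueue {
public:
    // Producer side. Returns false if the queue has already been closed.
    bool Push(T item) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (closed_) return false;
            items_.push_back(std::move(item));
        }
        ready_.notify_one();  // wake one waiting consumer
        return true;
    }

    // Consumer side. Blocks while the queue is empty and still open.
    // Returns false only once the queue is closed AND fully drained.
    bool Pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        ready_.wait(lock, [this] { return closed_ || !items_.empty(); });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop_front();
        return true;
    }

    // Signals end of stream and wakes every waiting consumer.
    void Close() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            closed_ = true;
        }
        ready_.notify_all();
    }

private:
    std::mutex mutex_;
    std::condition_variable ready_;
    std::deque<T> items_;
    bool closed_ = false;
};
```

A consumer thread would loop on `while (queue.Pop(ev)) { /* process ev */ }`, and the producer would call `Close()` once the stream ends. Items already queued are still delivered after `Close()`, so no event is dropped on shutdown.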

mkelsey · September 12, 2024, 4:08am

You are probably right that handling this in MT might involve new worker-level run managers. If you haven’t seen it already, the detailed MT “architecture” documentation might be helpful in identifying what to modify (Parallelism in Geant4: multi-threading capabilities — Book For Toolkit Developers 11.2 documentation, Geant4MTAdvandedTopicsForApplicationDevelopers < Geant4 < TWiki, Geant4MTTipsAndTricks < Geant4 < TWiki).