Geant4 Version: 11.4.1
Operating System: AlmaLinux 10 (RHEL)
Compiler/Version: GCC 14.3.1
CMake Version: 3.30.5
Hey G4 folks,
I have been quite interested in implementing sub-event parallelism in my application space for quite a while, so now that it’s available in Geant4, I’m eager to try it out. The extended/runAndEvent/RE03 example provided a good starting place, but I haven’t had success in expanding beyond that. I know sub-event parallelism is a work in progress, so this may be a byproduct of its phased introduction.
The issue occurs when you try to register more than one sub-event type for a given run. The second sub-event type will be appropriately registered to the master sub-event run manager, but no workers will ever be created to handle its sub-event type. This is because the worker sub-event run managers are created by G4UserSubEvtThreadInitialization, which doesn’t pass any arguments to the G4WorkerSubEvtRunManager constructor in CreateWorkerRunManager, and so all of the workers are created with sub-event type 0 (corresponding to track classification 100, or fSubEvent_0), the default in the G4WorkerSubEvtRunManager ctor. So if you set a particle’s (or track status’s) default classification to the second sub-event type you registered, the run will crash as soon as that particle/track status is encountered in the simulation.
It seems like there needs to be a way to pass sub-event type information to G4UserSubEvtThreadInitialization so that worker threads can be assigned to those types. The most generic method would be to just split workers evenly among sub-event types, which could be achieved by just passing the sub-event type map (or even just its size).
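To make the even-split idea concrete, here is a minimal sketch that is independent of Geant4. The function name `assignWorkerTypes` and its signature are hypothetical, not part of any Geant4 class. It only shows a round-robin mapping from worker index to sub-event type, assuming the number of registered types is known when the workers are created.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical helper (not part of Geant4): give each of nWorkers worker
// threads a sub-event type index in [0, nTypes) in round-robin order, so the
// per-type worker counts differ by at most one.
std::vector<int> assignWorkerTypes(std::size_t nWorkers, std::size_t nTypes)
{
  if (nTypes == 0) {
    throw std::invalid_argument("need at least one sub-event type");
  }
  std::vector<int> typeOfWorker(nWorkers);
  for (std::size_t w = 0; w < nWorkers; ++w) {
    typeOfWorker[w] = static_cast<int>(w % nTypes);
  }
  return typeOfWorker;
}
```

With five workers and two types, workers 0, 2 and 4 would get type 0 and workers 1 and 3 would get type 1.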
Have others encountered this bug?
Following up here: implementing the suggested change had only a limited impact. There appears to be a disconnect between the master sub-event run manager and its workers: registering default classifications for specific particle types appears to affect only the master run manager, and the sub-event type map doesn’t appear to be passed down to the worker run managers. This results in an Event0051 exception on the first worker thread to encounter a particle with a sub-event default classification (e.g., fSubEvent_0), since those classifications have only been registered with the master run manager. Adding these registrations to the worker threads as well resolves the exception, but strangely the registrations don’t seem to have “taken”: particle types registered to fSubEvent_0, for example, will still come out of G4StackManager::PushOneTrack with fUrgent classification.
One final note: it seems that some logic in G4EventManager::DoProcessing is short-circuiting the sub-event loop. The final if statement checks if the current thread is a sub-event worker and then ostensibly checks to see if the current sub-event has completed or not. But instead of actually doing that, the code currently just sets the current sub-event as completed regardless of its actual status. This means that the sub-events are skipped entirely. There is commented-out text explaining that the user should add incomplete events to the processingEvents vector and completed events to the completedEvents vector, but implementing this still does nothing, because processingEvents doesn’t seem to be accessed anywhere in the code.
This development suggests to me that sub-event parallelism does not work in Geant4 11.4.1, but I would love to be wrong about that!
I presume you implement the CreateWorkerRunManager() method in your UserSubEvtThreadInitialization class and set it to G4SubEvtRunManager. That is the correct way to go.
Please note that you can classify the sub-event type only in the master thread. In a worker thread, there is no way to “re-classify” a track and send it back to the master or to another worker. You will surely get an Event0055 exception if you classify a sub-event type for a track in a worker thread.
I do not see any of the issues you described on my side. My code works with three sub-event types. Of course, I’m not re-classifying tracks in a worker thread to another sub-event type. The only issue I’m currently aware of is with the command-based scorer (which I’m addressing now).
I would appreciate it if you could provide us with sample code that reproduces your issues (without re-classifying the sub-event type in a worker thread).
Makoto,
Thanks for your response. It seems probable that my implementation is bad if you’re able to use multiple sub-event types without issue. I’ve reproduced my basic methodology in the pseudocode below. If I only register sub-event type 0 (as in the RE03 example), the run executes but my scorer never fills since G4Run::RecordEvent() doesn’t seem to get called (which is where my scorer fills my ntuples). My confusion partially stems from the fact that worker run actions aren’t allowed when using sub-event parallelism; does that also imply that only one scorer object (for the master thread) should be created?
If I register both sub-event types 0 and 1 (as below), the run fails with a SubEvtRM1210 exception because there are no workers assigned to sub-event type 1. It was this issue that prompted my investigation of UserSubEvtThreadInitialization, but based on your experience it seems like my implementation was to blame, not UserSubEvtThreadInitialization.
void ActionInitialization::BuildForMaster() const
{
    SetUserAction(new RunAction(…));
    auto* rm = G4RunManager::GetRunManager();
    if (rm->GetRunManagerType() == G4RunManager::subEventMasterRM)
    {
        SetUserAction(new PrimaryGeneratorAction(…));
        rm->RegisterSubEventType(0, 100);
        rm->RegisterSubEventType(1, 100);
        rm->SetDefaultClassification(G4Electron::Definition(), fSubEvent_0);
        rm->SetDefaultClassification(G4Gamma::Definition(), fSubEvent_1);
    }
}

void ActionInitialization::Build() const
{
    auto* rm = G4RunManager::GetRunManager();
    if (rm->GetRunManagerType() != G4RunManager::subEventWorkerRM)
    {
        SetUserAction(new PrimaryGeneratorAction(…));
        SetUserAction(new RunAction(…));
    }
    else
    {
        createScorer();
    }
    SetUserAction(new TrackingAction(…));
    SetUserAction(new EventAction(…));
    SetUserAction(new SteppingAction(…));
    SetUserAction(new StackingAction(…));
}
I can go into greater detail about our scoring methodology or other aspects of our simulations if you feel it would be helpful.
I get the same results between the tasking mode and the sub-event mode with two sub-event types.
I cannot tell what went wrong on your side without looking into your action classes. But a few things you may need to check.
In the sub-event parallel mode, the concept of “run” does not exist in worker threads. Worker threads receive a series of sub-events and process them. “Run” exists only in the master thread.
In the same context, the beginning and end of “event” make sense only in the master thread. If you need to access all hits of the event, it should be done in the master thread.
TrackingAction and SteppingAction make sense in worker threads, but you should set them in both the master and worker threads. Note that the primary event is generated in the master thread, and tracks that are not sent to worker threads are processed in the master thread.
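The division of labour described in these points can be illustrated with a toy model. This is only a sketch under stated assumptions: `Track`, `Hit`, `processSubEvent` and `processEvent` are hypothetical stand-ins rather than Geant4 classes or functions, the rule "electrons go to sub-events" is made up for the example, and the worker function is called sequentially instead of on separate threads.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-ins for illustration only; these are not Geant4 classes.
struct Track { std::string particle; double energy; };
struct Hit { double edep; };

// Worker side: receives one sub-event (a batch of tracks copied from the
// master) and returns the hits it produced. It never sees where an event or
// a run begins or ends.
std::vector<Hit> processSubEvent(const std::vector<Track>& subEvent)
{
  std::vector<Hit> hits;
  for (const auto& t : subEvent) hits.push_back(Hit{t.energy});
  return hits;
}

// Master side: classifies tracks, hands full batches to the worker function,
// tracks everything else itself, and sums the merged hits at end of event.
// Called sequentially here; a real application would dispatch to worker threads.
double processEvent(const std::vector<Track>& primaries, std::size_t batchSize)
{
  std::vector<Hit> allHits;
  std::vector<Track> batch;

  // Send the current batch to the "worker" and merge the returned hits.
  auto flush = [&]() {
    for (const auto& h : processSubEvent(batch)) allHits.push_back(h);
    batch.clear();
  };

  for (const auto& t : primaries) {
    if (t.particle == "e-") {            // assumed rule: electrons form sub-events
      batch.push_back(t);
      if (batch.size() >= batchSize) flush();
    } else {
      allHits.push_back(Hit{t.energy});  // tracked in the master thread
    }
  }
  if (!batch.empty()) flush();           // send the last, partially filled batch

  double total = 0.0;                    // end of event: all hits are visible here
  for (const auto& h : allHits) total += h.edep;
  return total;
}
```

The point of the sketch is that only `processEvent`, the master side, ever holds the complete set of hits for an event, which is why end-of-event work such as filling an ntuple belongs there.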
Makoto,
Thank you, your hints got me moving in the right direction – I also realize now that example RE03 provides a similar conceptual explanation in the comments. Moving my EventAction and adding SteppingAction and TrackingAction registration to BuildForMaster and moving my scorer’s Fill method to EventAction::EndOfEventAction got my scoring working properly again. I also added StackingAction registration to the master thread, though I don’t know that that step was necessary.
For the sake of posterity: my earlier posts suggesting that sub-event level parallelism might not be possible in 11.4.1 were unfounded; it was my mistake. It’s rather impressive how much more efficiently simulations can run with this feature. It would be nice to be able to re-classify tracks created on worker threads (or maybe that’s in the works?), but even without that the computing time gains are enormous.
Thanks once again for all your help Makoto!
Passing tracks between worker threads is not practical, as worker tasks run asynchronously and we would need to introduce complex mutex barriers that would destroy performance. Thus, tracks must be transferred through the master. One additional burden is that passing a G4Track between master and worker requires a deep copy, as the G4Track object is instantiated by a thread-local G4Allocator and thus cannot be handed over to another thread. A touchable cannot be passed for the same reason.
Currently, tracks are sent (i.e. copied) in one direction only, from master to worker. Hits/scores are sent from worker to master. Trajectories are optionally sent from worker to master. Sending tracks from worker to master may be considered in the future, if the need arises.
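The reason a track has to be copied rather than handed over can be pictured with a small ownership model. This is a sketch only: `Track`, `Pool` and `transferByCopy` are hypothetical names, and each `Pool` merely stands in for a per-thread allocator. It does not reproduce how G4Allocator works internally.

```cpp
#include <memory>
#include <vector>

// Toy stand-ins for illustration only; these are not Geant4 classes.
struct Track { int id; double energy; };

// A pool owns every track created through it. One pool per thread plays the
// role of a thread-local allocator: a track must be released by the pool
// (thread) that created it, so a pointer cannot simply change hands.
struct Pool {
  std::vector<std::unique_ptr<Track>> owned;

  Track* create(int id, double energy) {
    owned.push_back(std::unique_ptr<Track>(new Track{id, energy}));
    return owned.back().get();
  }
};

// Hypothetical transfer step: copy the track's state into a new object that
// belongs to the destination pool. The source track stays with its own pool.
Track* transferByCopy(const Track& source, Pool& destination)
{
  return destination.create(source.id, source.energy);
}
```

After a transfer, the master and the worker each hold their own object with the same state, and each object is released by the pool that created it.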