Does anybody still use IgProf? Is it MT-compatible?

TL;DR: is anyone in the Geant4 community still using the IgProf profiler? If so, are you able to use it successfully to profile multithreaded G4 jobs?

Our experiment’s simulation is running under Geant4 10.7.p04. We are running simulations including a detector response simulation of phonons and charge carriers via the G4CMP library, so individual events typically take anywhere from 15 minutes to an hour (depending on voltage bias). So we run multithreaded, and it works very well.

Except that once we get above about 20 threads, there’s a lot of overhead that I don’t understand. The System CPU, and hence the elapsed wall-clock time for the job, grows linearly with the number of threads above 20. By the time we get to 40 threads, events can take six to eight hours each! That’s obviously untenable.
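For anyone who wants to quantify this kind of scaling, here is a minimal Python sketch (Linux/Unix only) that runs a command and reports wall-clock, user CPU and system CPU separately. It is an illustration, not what we actually run: the function name `cpu_split` is made up, and the command to profile is whatever you pass on the command line.

```python
import resource
import subprocess
import sys
import time


def cpu_split(cmd, env=None):
    """Run cmd to completion; return wall, user and system CPU seconds.

    Uses the change in RUSAGE_CHILDREN, so the figures cover the child
    process (all of its threads) once it has been waited for.
    """
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    subprocess.run(cmd, env=env, check=True)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    return {
        "wall": wall,
        "user": after.ru_utime - before.ru_utime,
        "system": after.ru_stime - before.ru_stime,
    }


if __name__ == "__main__":
    # The command to time is taken from the command line (hypothetical usage:
    # python cpu_split.py ./mySimulation run.mac).
    if len(sys.argv) > 1:
        result = cpu_split(sys.argv[1:])
        print("wall %.1f s, user %.1f s, system %.1f s"
              % (result["wall"], result["user"], result["system"]))
```

Repeating the same short job at several thread counts (for example by changing `/run/numberOfThreads` in the macro, assuming the application takes its thread count from there) and comparing the `system` column should show where the growth starts.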

I presume the problem is somewhere in our simulation framework, not in G4. I’d like to use IgProf to find out where the time is going, but the only output I get from it covers the master thread, up to /run/initialize. There is nothing from any of the worker threads.

As an aside, I can’t seem to use Valgrind for this: it complains almost immediately about an “illegal instruction” and aborts. I only run into this with G4 applications on our compute cluster. But that’s why I’m trying to use IgProf.

I think this is still used in the Testing and QA Working Group for regular benchmarking, so perhaps worth dropping them a line (I don’t see members on the Forum…).

I also have this error in Valgrind, but I think it’s related to AVX-512 instructions, which aren’t supported by Valgrind.
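A quick way to check whether a given node advertises AVX-512 at all is to look at the flags in `/proc/cpuinfo`. The Python sketch below does that on Linux; the helper name `avx512_flags` is made up for illustration. Note the assumption: a CPU that supports AVX-512 only matters here if the binaries were actually compiled to use it (for example with a native-architecture build), which this check cannot tell you.

```python
def avx512_flags(cpuinfo_text):
    """Return the sorted AVX-512 feature flags found in /proc/cpuinfo text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Flags are space-separated; AVX-512 ones start with "avx512".
            flags.update(f for f in value.split() if f.startswith("avx512"))
    return sorted(flags)


if __name__ == "__main__":
    try:
        with open("/proc/cpuinfo") as handle:
            found = avx512_flags(handle.read())
    except OSError:
        # Not Linux, or /proc is not available.
        found = []
    print("AVX-512 flags:", " ".join(found) if found else "none found")
```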

Thank you! That might explain why I see this on our cluster with a natively built G4 and Valgrind, but I don’t see it in a Singularity container built elsewhere.
