Run gets killed with more than 10^7 events

I am able to run my simulations with 10^7 events without problems, using /run/beamOn 10000000 in the macro file named vis.mac. But when I go above 10^7, say to 10^8, the run takes much longer and never completes. Instead, after almost 20 hr, it shows ‘killed’. In my opinion there shouldn’t be any problem generating up to 2,147,483,647 events in a single run. What could be causing this problem?
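For reference, 2,147,483,647 is the largest value of a 32-bit signed integer. The sketch below only illustrates that limit; it assumes the event count is held in a 32-bit signed integer, and the names kMaxEventsPerRun and fitsInSingleRun are made up for this example and are not part of Geant4.

```cpp
#include <cstdint>
#include <limits>

// Largest event count that fits in a 32-bit signed integer.
// Assumption: the run's event counter has this width on the platform in use.
constexpr std::int64_t kMaxEventsPerRun =
    std::numeric_limits<std::int32_t>::max();

// Hypothetical helper: true if `requested` events fit in a single run.
constexpr bool fitsInSingleRun(std::int64_t requested) {
    return requested > 0 && requested <= kMaxEventsPerRun;
}
```

Both 10^8 and 10^9 pass this check, so the integer limit by itself would not explain a run being killed at those sizes.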

How much time do your 10^7 events jobs take to complete? A 10^8 job will take 10 times longer, of course.

How much time are you allocating for your batch jobs? Does your estimated time for the 10^8 job exceed that limit?

For 10^7 events it takes around 16 hr, but for 10^8 it takes longer and doesn’t complete.
I don’t know about allocating time for batch jobs; I didn’t set any time limit.
What could be the reason it takes so much time? I am using Geant4 version 11.1.

I tried similar code on another machine, which has Geant4 version 10. That machine takes 5-6 min for 10^7 events and 1 hr 15 min for 10^8 events, but when I go higher, to 10^9 events, the run gets killed.
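The timings quoted in this thread can be put through a simple linear extrapolation to see how long the larger runs should take. This is only a rough sketch: it assumes the cost per event stays constant, and estimateTime is a hypothetical helper, not a Geant4 function.

```cpp
// Linear extrapolation: if `knownEvents` took `knownTime` (any unit),
// estimate the time for `targetEvents` in the same unit.
// Assumption: the time per event is constant over the whole run.
constexpr double estimateTime(double knownTime,
                              double knownEvents,
                              double targetEvents) {
    return knownTime * (targetEvents / knownEvents);
}

// Figures taken from the posts above (hours):
//   estimateTime(16.0, 1e7, 1e8)  -> 160 hr for 10^8 on the version 11.1 machine
//   estimateTime(1.25, 1e8, 1e9)  -> 12.5 hr for 10^9 on the version 10 machine
```

On these numbers, a 10^8 run on the slower machine would need about 160 hr (nearly a week), so a job killed after 20 hr was still far from finished.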

What OS and machine specs (CPU, RAM, etc.) are you running each application on, and how many threads are being used on each? What batch system (e.g. Slurm) are you using? 16 hr vs 5 min is a huge difference, so there’s either something in the code, or these are vastly different systems in capability.

Both machines have similar specs.
The one that takes 5 min with Geant4 version 10 has the specs below:

The other one, which has Geant4 version 11.1, has the same specs.

Both were installed with multithreading turned off, and I am not using any batch system to generate the events.

Another interesting fact: the code I was running with Geant4 version 10 compiles without errors, but with version 11.1 it fails with “fatal error: g4root.hh: No such file or directory” at 36 | #include "g4root.hh". To get around this I replaced that line with #include "G4AnalysisManager.hh", and it worked. I made no changes other than this line.

The other one, which has Geant4 version 11.1, also has the same specs, shown below:

I just modified the B1 example, and I am not using any batch file. My worry is why the event generation gets ‘killed’ after 48 hr or so. How can I solve this problem?

Without knowing the exact signal or cause of the program being killed, it’s impossible to know what’s going on. There’s nothing in Geant4 that imposes a runtime limit, so either something in the application code is causing a problem such as a memory leak or a (rare) segmentation violation, or something in the OS settings is automatically killing processes after a set time.
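One way to test the memory-leak hypothesis is to log the process's peak memory use at regular intervals and see whether it keeps growing with the event count. The following is a minimal sketch, assuming Linux (where ru_maxrss from the POSIX getrusage call is reported in kilobytes). The functions peakRssKiB and reportMemory are hypothetical helpers written for this illustration; they would need to be called once per event from the application's own end-of-event code.

```cpp
#include <cassert>
#include <cstdio>
#include <sys/resource.h>  // POSIX getrusage

// Peak resident set size of this process so far.
// Assumption: Linux, where ru_maxrss is given in kilobytes.
long peakRssKiB() {
    struct rusage usage{};
    const int rc = getrusage(RUSAGE_SELF, &usage);
    assert(rc == 0);  // fails only for invalid arguments
    (void)rc;         // avoid an unused-variable warning when NDEBUG is set
    return usage.ru_maxrss;
}

// Hypothetical helper: call once per event. Every `interval` events it
// prints the peak memory use and returns true; otherwise it returns false.
bool reportMemory(long eventId, long interval) {
    if (interval <= 0 || eventId % interval != 0) {
        return false;
    }
    std::printf("event %ld: peak RSS = %ld KiB\n", eventId, peakRssKiB());
    std::fflush(stdout);  // flush so the last line survives a sudden kill
    return true;
}
```

If the reported value climbs steadily until the run dies, a leak in the application code becomes the likely cause. On Linux, a bare "Killed" message often means the process received SIGKILL, for instance from the kernel's out-of-memory killer, and the kernel log may record that, though whether this applies here is not known from the thread.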


I don’t think it is relevant to the crashing problem, but the speed difference could possibly be explained by a difference in the CMAKE flags used when you built the GEANT4 code. Is your version 11 instance of GEANT4 built with the CMAKE_BUILD_TYPE set to Debug? This can make it run much slower. Release or RelWithDebInfo will run faster.

Is your version 11 instance generating a large visualization file, for example g4_01.wrl? Is it possible you are filling up your hard drive with a very large file? Writing to disk for every event would also slow down the execution significantly. If your version 10 instance was built with visualization disabled and your version 11 instance included it you could be getting different default behaviors.

The run is not killed at any particular time. I was able to generate up to 12*10^7 events successfully, which took nearly 11 hr. After that run I generated 42111565 events using /run/beamOn 42111565, but this crashed or was killed after 2 hr.

Earlier, this machine took 5 min to complete 10^7 events, but now it takes 1 hr or more.

Both versions were installed with CMAKE_BUILD_TYPE set to Release.

Version 11 is not generating any visualization file such as .wrl.