Hello,
I have a beam line simulation with four quadrupoles and three dipoles, with the magnetic field of each magnet read from a separate file. The simulation runs for a few events, and the positions are as expected from the magnetic field definition, but the run gets killed after about 12 events. The crash is not always at the same point. For the magnetic field I adapted TabulatedField3D.cc from the advanced purging magnet example, and the field calculations seem to be working fine, but I am not sure why the run is killed every time. In the detector construction I have defined the fields like this:
field_Q29= new TabulatedElementField3D(…);
field_Q30= new TabulatedElementField3D("…);
//This is thread-local
fEquation_Q29 = new G4Mag_UsualEqRhs (field_Q29);
fFieldMgr_Q29 = G4TransportationManager::GetTransportationManager()->GetFieldManager();
fStepper_Q29 = new G4ClassicalRK4 (fEquation_Q29);
fFieldMgr_Q29 = new G4FieldManager(field_Q29);
fChordFinder_Q29 = new G4ChordFinder(field_Q29,1*mm,fStepper_Q29);
fFieldMgr_Q29->SetChordFinder(fChordFinder_Q29);
fEquation_Q30 = new G4Mag_UsualEqRhs (field_Q30);
fFieldMgr_Q30 = G4TransportationManager::GetTransportationManager()->GetFieldManager();
fStepper_Q30 = new G4ClassicalRK4 (fEquation_Q30);
fFieldMgr_Q30 = new G4FieldManager(field_Q30);
fChordFinder_Q30 = new G4ChordFinder(field_Q30,1*mm,fStepper_Q30);
fFieldMgr_Q30->SetChordFinder(fChordFinder_Q30);
I have also checked the geometry for overlaps and that is OK. I do have a complicated geometry with boolean solids, but the run works for a few (10-12) events. Please let me know how I can fix this, as I am not sure why exactly it crashes…
Thank you very much in advance. Any help is appreciated. I am kind of stuck at this point.
Just to add some more details: my magnetic field class is written such that the field is translated according to the logical volume that the field is attached to. I am also deleting my field and field manager in the destructor of my detector construction. Please let me know if I should provide any more details. My simulation runs fine for about 13 events, then it crashes. It's most probably due to some memory leak, but I am really not sure where the problem is. Any help is very much appreciated. Thank you in advance,
The code you wrote seems fine - I do not see a problem in the magnetic field definition. Its code is only called once at the start of a run for each thread anyway, so it does not seem likely to cause a memory leak. If you wanted to be 100% sure, you could reduce the number of field elements by a factor of 2-4 and see whether the number of events that complete increases.
But that would not be definitive, since the path of a track could then be very different.
I suggest looking at other parts of your program for a likely cause, and using a different way to monitor the amount of memory used by your program. If you are on Unix (Linux/Mac), you can look at the output of ‘top’ or ‘ps’ to identify the memory used by the program, for example.
Beyond that, without more information about how your crash / problem occurs, and in particular without the signal or printout from a crash, it is difficult to give any suggestions.
When I have had similar problems I have run the program from within a debugger - gdb or, lately, lldb. A graphical debugger that uses one of these will work well too. Then at least you can report where in the program an issue occurs.
Dear John,
Thank you so much for your reply. I was trying to check with gdb but it's getting stuck: it basically stops at the message New Thread. Am I missing something?
The two red chunks are basically when the program was running. Please let me know how I can better use the debugger tool. I think this will help a lot. Thank you very much in advance…
How can I compile the code such that valgrind includes the line numbers in the output? So something similar to the -g option for compiling… Thank you in advance for any help…
I suggest compiling and running an unchanged example program, for example basic B1, with your installation to check that the installation works. Unless you get the expected results from example programs, it is difficult to get your own program to work.
Once you have a program that runs, ‘valgrind’ is a reasonable way to diagnose memory issues, and using Geant4 compiled with debugging is the best way to get the line number of a problem. (You can also use the CMake build type ‘RelWithDebInfo’, which is faster but less reliable for finding the cause of an issue.)
For Geant4 installation options, please look at the Geant4 Installation Guide. It has all the instructions, including what to expect. ( There is too much context to repeat the instructions here. )
However I will note that nothing in the dump from valgrind has anything to do with Geant4.
Regarding memory use, you will need to watch the resident memory of the process in the ‘top’ output while the program is running (the RES column on Linux; ‘ps’ reports it as RSS). You can note down the values as it runs, or at least keep track of the maximum value.
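As an alternative to reading the numbers off ‘top’ by hand, the program can report its own peak resident memory. The following is only a minimal sketch, assuming a POSIX system where getrusage() is available; the function names PeakResidentSetSize and ReportMemory are hypothetical and not part of Geant4.

```cpp
#include <sys/resource.h>  // getrusage (POSIX)
#include <iostream>

// Returns the peak resident set size of this process as reported by
// getrusage(), or -1 on failure. The unit is platform dependent:
// kilobytes on Linux, bytes on macOS.
long PeakResidentSetSize()
{
  struct rusage usage;
  if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
  return usage.ru_maxrss;
}

// Hypothetical helper: prints one line per call so that growth
// from event to event becomes visible in the log.
void ReportMemory(int eventID, std::ostream& out = std::cout)
{
  out << "event " << eventID
      << " peak RSS " << PeakResidentSetSize() << '\n';
}
```

In a Geant4 application one natural place to call such a helper would be the end-of-event hook of the user event action. A value that keeps climbing with every event would point towards a leak, while a flat value would suggest looking elsewhere.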
If you do not have experience using ‘gdb’, you could use a graphical debugger for your system instead. You will need to find a tutorial - again there is too much to give clear instructions.
Dear John,
Thank you so much for all your suggestions. I reinstalled Geant4 with the debugging options and ran example B1 without changes. The example runs with no problem. I then ran valgrind on my code; the output shows some errors and memory leaks in some Geant4 headers and source files, but I couldn’t find anything specific to my code. I also tried gdb, but when I run inside gdb I get an error that the data environment variables aren’t set, even though they are. When I just run the executable outside gdb it runs without this exception. Can you please help me understand why gdb throws this exception even though the path variables are defined? Please let me know if you have any suggestions, as the example now runs but my code still just gets killed. I also checked all geometry overlaps and they all seem OK. Thank you very much in advance. I really appreciate the help…
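One way to narrow down the gdb problem is to let the program itself print which data variables it can see, since the environment of a process started by the debugger can differ from that of the interactive shell (gdb's ‘show environment’ command lists what it will pass on). This is only a sketch: the helper names are hypothetical, and the three variable names are a subset of the Geant4 data variables chosen for illustration; the set actually required depends on the physics list.

```cpp
#include <cstdlib>
#include <iostream>
#include <string>
#include <vector>

// Returns those entries of `names` that are unset or empty in the
// environment of the running process.
std::vector<std::string> MissingVariables(const std::vector<std::string>& names)
{
  std::vector<std::string> missing;
  for (const auto& name : names) {
    const char* value = std::getenv(name.c_str());
    if (value == nullptr || *value == '\0') missing.push_back(name);
  }
  return missing;
}

// Hypothetical helper: call it at the very start of main().
void ReportDataEnvironment(std::ostream& out = std::cerr)
{
  // Assumption: an illustrative subset of the Geant4 data variables.
  const std::vector<std::string> names = {
      "G4LEDATA", "G4LEVELGAMMADATA", "G4ENSDFSTATEDATA"};
  for (const auto& name : MissingVariables(names)) {
    out << "not set in this process: " << name << '\n';
  }
}
```

If the list is empty in a normal run but not under gdb, the variables are being lost between the shell and the debugged process rather than inside the application.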
Dear John,
I rebuilt my code from scratch on top of the B1 example and this seems to work fine now. The major change I made was to initialise the pointers to all the solids, logical volumes and physical volumes, as well as the field components, as null pointers in the detector construction class. Maybe this helped fix the problem. I also removed all messenger classes, but I don’t think this was the issue. In any case it's not crashing anymore. I checked up to 50 events (before, it was crashing at 13). Thank you very much for all your help and suggestions. I really appreciate it…
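For anyone finding this later, the null-pointer change can be sketched as follows. It is a simplified illustration, not my actual detector construction: Field, FieldManager and DetectorSketch are hypothetical stand-in types. The point is that a raw pointer member without an initialiser holds an indeterminate value, so deleting or testing it before it has been assigned is undefined behaviour, which could plausibly show up as a crash at varying points.

```cpp
// Hypothetical stand-ins for the field and field-manager classes.
struct Field {};
struct FieldManager {};

class DetectorSketch
{
public:
  DetectorSketch() = default;

  // Copying would lead to a double delete, so it is disabled.
  DetectorSketch(const DetectorSketch&) = delete;
  DetectorSketch& operator=(const DetectorSketch&) = delete;

  ~DetectorSketch()
  {
    // delete on a null pointer is a no-op, so this is safe even
    // if BuildFields() was never called.
    delete fFieldMgr;
    delete fField;
  }

  // Creates the objects once; repeated calls do not allocate again.
  void BuildFields()
  {
    if (fField == nullptr) fField = new Field();
    if (fFieldMgr == nullptr) fFieldMgr = new FieldManager();
  }

  bool HasFields() const
  {
    return fField != nullptr && fFieldMgr != nullptr;
  }

private:
  // Default member initialisers give every pointer a defined value.
  Field* fField = nullptr;
  FieldManager* fFieldMgr = nullptr;
};
```

With the initialisers in place, both the destructor and any "already built?" checks behave predictably regardless of the order in which the construction methods are called.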