GeomNav0002: "Volume must be centered on the origin."

Geant4 Version: 10.07.p04 + Bugzilla #2714
Operating System: MacOS Tahoe 26.4.1
Compiler/Version: Apple Clang 21.0.0
CMake Version: 4.3.2


This is a followup to Stale pointers in parallel worlds? - #7 by mkelsey. I was able to successfully port the fix from Gabriele in Bugzilla #2714 into our CDMS simulation framework, building against Geant4 10.07.p04. It worked perfectly, but I now get a fatal error on the second geometry build:

G4WT0 > 
-------- EEEE ------- G4Exception-START -------- EEEE -------
*** G4Exception : GeomNav0002
      issued by : G4Navigator::SetWorldVolume()
Volume must be centered on the origin.
*** Fatal Exception *** core dump ***
G4WT0 >  **** Track information is not available at this moment
G4WT0 >  **** Step information is not available at this moment
G4WT0 > 
-------- EEEE -------- G4Exception-END --------- EEEE -------

G4WT0 > 
G4WT0 > *** G4Exception: Aborting execution ***

My memory is that this should be an expected behaviour in our case: for many years, we have deliberately used an offset world volume so that our detector apparatus can be placed at (0,0,0), even though it’s off to one side of the surrounding cavern structure. We do this using G4ExtrudedSolid (instead of the boolean G4DisplacedSolid wrapper), and we have never (neither in 10.07.p04 nor in recent 11.4.1 running) gotten the error above.

We only get the above error after the new “refresh” code from Bugzilla #2714 is run. This leads to two questions from me, one of which would be really awkward for us :slight_smile:

  1. Should we have been getting the GeomNav0002 fatal error all along? Is this a bug (or missing cross-check) in the Geant4 Navigator code when it’s invoked for the first time?
  2. Should this be an error at all?

We have run billions of events through our “offset” geometry, with entirely sensible tracking, visualization, trajectories, energy deposits, etc. We’ve had no indication in any analysis of coordinate or tracking errors, other than ones we caused in our own code.

Hi Mike, the check for the world volume to be centered at the origin has always been there and it is expected. It shouldn’t have anything to do with the stale pointer problem. You should check if indeed in your rebuilt geometry the translation vector of the world volume is set to (0,0,0).
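
For reference, here is a minimal standalone model of the test behind GeomNav0002. As far as I can tell from the 10.07 sources, G4Navigator::SetWorldVolume() looks only at the placement of the world physical volume (its translation, and its rotation if one is attached), not at the shape of the solid. The Placement struct and IsValidWorldPlacement() below are hypothetical stand-ins so the sketch compiles without Geant4; they are not Geant4 API.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for the placement data held by a physical volume.
struct Placement {
  std::array<double, 3> translation{0.0, 0.0, 0.0};
  bool rotated = false;  // true if a non-identity rotation is attached
};

// Models the navigator check: the world placement must have an exactly zero
// translation and no rotation. Returns false in the cases where the real
// navigator would raise the fatal GeomNav0002 exception.
bool IsValidWorldPlacement(const Placement& p) {
  for (double component : p.translation) {
    if (component != 0.0) return false;
  }
  return !p.rotated;
}
```

If that reading is right, an offset stored inside the solid would not by itself trip the check, while a world pointer that refers to freed memory could in principle report any translation at all.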

That’s my point, @gcosmo. In this job, I know that the translation vector of the world volume is not set to (0,0,0). And it was not set to (0,0,0) in the initial build (specifically, the physical geometries are identical in both cases, all we changed was the electric field configuration).

So with an offset world volume, we do not get any error on the initial build, but we do get the “must be centered” error after rebuild.

Hi Michael, I tried to reproduce your problem, not with 10.7 but with the latest geant4 plus Gabriele’s patch. I did not manage to reproduce your exact GeomNav0002 failure.

But I seem to have found a different MT geometry-refresh bug. After ReinitializeGeometry(false, false), the master thread gets a new mass world, but worker threads can keep using the old mass world if it still has the same name (WorldPV). So master and workers can end up using different geometry objects after a rebuild.

Interesting; thank you, Dmitri. Following Gabriele’s comment, I went back to our code and found something. We use G4ExtrudedSolid to build the world, as I noted above.
A while back, I modified the code so that the extent (“bounding box”) of the volume is symmetric, but the extrusion top and bottom sections have an offset:

 Solid geometry type: G4ExtrudedSolid
 Convex polygon; list of vertices:
    #0   vx = 46041.325 mm   vy = 15388.2 mm
    #1   vx = 46041.325 mm   vy = -15388.2 mm
    #2   vx = -46041.325 mm   vy = -15388.2 mm
    #3   vx = -46041.325 mm   vy = 15388.2 mm
 Sections:
   z = -9919.005260129745 mm    x0= 14241.925 mm    y0= 0 mm    scale= 1
   z = 9919.005260129745 mm    x0= 14241.925 mm    y0= 0 mm    scale= 1
[...]
Global transformation of volume (GlobalTrans): 
         1         0         0         0
         0         1         0         0
         0         0         1         0

(full dump attached: snolab-World-PV.txt, 2.2 KB)

I suspect it’s those “x0” values which are causing the error. Even though they don’t cause any kind of error when this geometry is built and used the first time in a job.

If necessary, I should be able to handle this in our code by checking for non-zero (x0,y0) values, and if found, replacing them with (0,0) and instead enlarging the polygon vertices to add more rock on one side than the other.
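
The workaround described above could be sketched roughly as follows. This is a standalone illustration, not CDMS code: Vec2 and FoldOffsetIntoVertices() are hypothetical names, and the sketch assumes that every z-section carries the same (x0,y0) offset with scale 1, as in the dump. Under that assumption, adding the offset to each vertex and setting the section offsets to (0,0) should describe the same shape.

```cpp
#include <cassert>
#include <vector>

// Hypothetical 2D point type standing in for G4TwoVector.
struct Vec2 {
  double x;
  double y;
};

// Given the polygon vertices of an extruded solid whose z-sections all share
// one offset and have scale 1, return the vertices of the same shape
// expressed with a zero section offset.
std::vector<Vec2> FoldOffsetIntoVertices(const std::vector<Vec2>& vertices,
                                         const Vec2& offset) {
  std::vector<Vec2> shifted;
  shifted.reserve(vertices.size());
  for (const Vec2& v : vertices) {
    shifted.push_back({v.x + offset.x, v.y + offset.y});
  }
  return shifted;
}
```

With the values from the dump, the polygon would then run from x = -31799.4 mm to x = 60283.25 mm. Whether this has any effect on GeomNav0002 is untested here.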

Hello, Michael, can you say exactly how the geometry is rebuilt after the field change? Do you call ReinitializeGeometry() directly or use /run/reinitializeGeometry? what setting do you use for destroyFirst?

Thank you! We use destroyFirst=true. We have our own /CDMS/updateGeom command, so users can do as much configuration as needed before the geometry gets built or rebuilt. That UI command triggers:

void CDMSGeomConstructor::UpdateGeometry() {
  G4RunManager* rm = G4RunManager::GetRunManager();
  rm->ReinitializeGeometry(true, false);
  rm->GeometryHasBeenModified();
  rm->InitializeGeometry();

  ActivateRadioactiveDecay();	// GRDM must have access to all volumes
}

This function hasn’t changed since we started using G4 v10 (in G4 v9, the ReinitializeGeometry() call wasn’t there). If we should be doing something different in G4 v11, we can fix that easily.

Ummm…I’m not sure why the second argument is false. The comments for ReinitializeGeometry() in 10.07.p04 say:

  //  The second parameter "prop" has to be true if this C++ method is directly
  // invoked.

Hmmm. Prior to July 2023, that second argument was omitted, and defaulted to true. I’m not sure why the “,false” was introduced.

Hello Michael,

If I understand the Geant4 comment in G4RunManager.hh (source/run/include/G4RunManager.hh, line 558) correctly, destroyFirst=true should be used only when you really want to rebuild a completely new detector from scratch. In that mode, the old solids, logical volumes, and physical volumes are deleted first. That makes sense for cases like switching to a different GDML file or changing the detector layout in a major way.

If only the field changes, and the geometry shapes and placements stay the same, then destroyFirst=true probably should not be needed. In that case, a normal geometry refresh should be enough.

P.S. From what I see locally, the destroyFirst=true path also looks more risky, because the world bookkeeping itself can get corrupted during the full destroy-and-rebuild step.

I see your point, but that would drastically complicate our simulation framework. The geometry is completely specified, and customized, via macro commands. I’d need to have every one of those commands (or the different classes that build each of the little pieces of the simulation) reach back and set (via |= or &=) a global-ish flag that says “this parameter change affects a volume”, while “this other parameter doesn’t change its volume.”

What we’ve done instead, via our “updateGeom” command, is say “the user changed something. We can’t know what specific thing they changed, but we know we have to rebuild the geometry.” So we use the destroyFirst=true flag. Before that flag existed (in Geant4 9.5 and 9.6), we directly called all of the volume store cleanups, in order to do the same thing.

Hello Michael,

I checked both rebuild paths here, and both seem to show problems…

With destroyFirst=false, I see stale world state after rebuild. The master gets the new world, but workers can keep using old world information, especially in the parallel/scoring world path.

With destroyFirst=true, I see a different problem. In 10.7, the old geometry is really deleted, but later the worker side world lookup can still walk stale freed world pointers. In the current release, I see another bad pattern: store cleanup can fail, but the world bookkeeping is still reset anyway, so the internal state becomes inconsistent.

So I am wondering if you could print a few simple things around UpdateGeometry() in CDMS:

  • PV/LV/Solid store sizes before and after ReinitializeGeometry()
  • tracking world pointer, name, and translation before and after
  • the registered world list from G4TransportationManager
  • the same again after InitializeGeometry()

Those should be straightforward; I can add verbosity within our UpdateGeometry() function itself, and print those things. The “Tracking world pointer” is just G4TransportationManager::GetTransportationManager()->GetNavigatorForTracking()->GetWorldVolume(), right? That’s going to require null-pointer checks on the first pass, of course.
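
The null-pointer checks mentioned above might look something like this. It is written as a template so that it compiles without Geant4; FakeVolume, FakeNavigator and FakeManager are hypothetical stand-ins that only mimic the two accessors used, and TrackingWorldOrNull() is an illustrative name.

```cpp
#include <cassert>

// Hypothetical stand-ins for G4VPhysicalVolume, G4Navigator and
// G4TransportationManager, reduced to the accessors needed here.
struct FakeVolume {};
struct FakeNavigator {
  FakeVolume* world = nullptr;
  FakeVolume* GetWorldVolume() const { return world; }
};
struct FakeManager {
  FakeNavigator* navigator = nullptr;
  FakeNavigator* GetNavigatorForTracking() const { return navigator; }
};

// Walk manager -> tracking navigator -> world volume, returning nullptr as
// soon as any link in the chain is missing.
template <typename Manager>
auto TrackingWorldOrNull(Manager* tm)
    -> decltype(tm->GetNavigatorForTracking()->GetWorldVolume()) {
  if (tm == nullptr) return nullptr;
  auto* nav = tm->GetNavigatorForTracking();
  if (nav == nullptr) return nullptr;
  return nav->GetWorldVolume();
}
```

With the real classes, the call would presumably be TrackingWorldOrNull(G4TransportationManager::GetTransportationManager()), and the caller still has to test the result before asking it for a name or translation.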

I’m having trouble with this one, in both 10.07.p04 and 11.4.1. G4TransportationManager has two accessor functions:

    inline std::vector<G4VPhysicalVolume*>::iterator GetWorldsIterator();
    inline std::size_t GetNoWorlds() const;

There is no way for the iterator to be useful, unless you write really kludgy non-STL code. Without access to the underlying container itself (i.e., the private fWorlds), there’s no way (see https://stackoverflow.com/questions/18555171/how-to-check-whether-the-iterator-hit-the-end-without-container) either to set the iterator to the beginning of the container or to test that it has reached the end.

If I assume (without any documentation to confirm it!) that the iterator is returning begin(), then I can carry along a local index variable i, increment both, and use the index to test against GetNoWorlds() while the iterator gives me the object. That’s a real kludge.

How would you get the list of registered worlds?

Yes, I think your assumption is right…I agree it is not a great API, but for a debug print it should be fine…

Something like:

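#include "G4TransportationManager.hh"
#include "G4VPhysicalVolume.hh"
#include "G4ios.hh"

// Debug helper: print every world registered with the transportation manager.
// Assumes GetWorldsIterator() points at the first registered world.
void DumpRegisteredWorlds()
{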
  auto* tm = G4TransportationManager::GetTransportationManager();
  if (!tm) {
    G4cout << "null" << G4endl;
    return;
  }

  auto it = tm->GetWorldsIterator();
  const auto nWorlds = tm->GetNoWorlds();

  for (std::size_t i = 0; i < nWorlds; ++i, ++it) {
    auto* pv = *it;
    G4cout << "world[" << i << "] ptr=" << pv;
    if (pv) {
      G4cout << " name=" << pv->GetName()
             << " tr=" << pv->GetTranslation();
    }
    G4cout << G4endl;
  }
}
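
A variation on the loop above, in case it is useful: a "begin iterator plus count" interface is what std::copy_n expects, so the worlds can be copied into a local vector and then handled with ordinary STL code. This still rests on the same undocumented assumption that GetWorldsIterator() returns the start of the container. FakeWorld and FakeTransportationManager are hypothetical stand-ins so the sketch compiles without Geant4, and CollectWorlds() is an illustrative name.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <utility>
#include <vector>

// Copy `count` elements starting at `first` into a vector, which can then be
// iterated with begin()/end() in the usual way.
template <typename Iter>
std::vector<typename std::iterator_traits<Iter>::value_type>
CollectWorlds(Iter first, std::size_t count) {
  std::vector<typename std::iterator_traits<Iter>::value_type> worlds;
  worlds.reserve(count);
  std::copy_n(first, count, std::back_inserter(worlds));
  return worlds;
}

// Hypothetical stand-ins exposing the same two accessors as the real manager.
struct FakeWorld {
  const char* name;
};
class FakeTransportationManager {
 public:
  explicit FakeTransportationManager(std::vector<FakeWorld*> worlds)
      : fWorlds(std::move(worlds)) {}
  std::vector<FakeWorld*>::iterator GetWorldsIterator() { return fWorlds.begin(); }
  std::size_t GetNoWorlds() const { return fWorlds.size(); }

 private:
  std::vector<FakeWorld*> fWorlds;
};
```

With the real manager, the call would presumably be CollectWorlds(tm->GetWorldsIterator(), tm->GetNoWorlds()), giving a std::vector<G4VPhysicalVolume*> that is safe to range over as long as the geometry is not rebuilt in the meantime.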

I implemented the hack I described, and the output results are not quite what I expected.

At the first call to UpdateGeometry(), I get empty information first (of course), followed by my built world:

***** UpdateGeometry() debugging
 Sizes: PVStore 0 LVStore 0 SolidStore 0
 Tracking world PV 0
 1 worlds:
 Tracking world 0: 0
***** after InitializeGeometry()
 Sizes: PVStore 9 LVStore 9 SolidStore 29
 Tracking world PV 0xa650f48c0 name World trans (0,0,0)
 1 worlds:
 Tracking world 0: 0xa650f48c0 (World)
*****

There’s no parallel world here, because in this job, we explicitly turned it off with a macro command. Anyway, after the first run, we modify something and call UpdateGeometry() again:

***** UpdateGeometry() debugging
 Sizes: PVStore 9 LVStore 9 SolidStore 33
 Tracking world PV 0xa650f48c0 name World trans (0,0,0)
 1 worlds:
 Tracking world 0: 0xa650f48c0 (World)
***** after InitializeGeometry()
 Sizes: PVStore 9 LVStore 9 SolidStore 29
 Tracking world PV 0xa7e2d91d0 name World trans (0,0,0)
 1 worlds:
 Tracking world 0: 0xa7e2d91d0 (World)
*****

The surprising part to me is that there are four extra solids in G4SolidStore.
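
To see at which checkpoint the extra solids appear, the store sizes can be kept as snapshots and compared. A minimal standalone sketch, with hypothetical names (StoreSizes, Change, DescribeChange); in a real job the three numbers would come from the size() of the corresponding Geant4 stores:

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Sizes of the three geometry stores at one checkpoint.
struct StoreSizes {
  std::size_t pv;
  std::size_t lv;
  std::size_t solid;
};

// Signed change between two counts (counts are assumed to fit in a long).
long Change(std::size_t before, std::size_t after) {
  return static_cast<long>(after) - static_cast<long>(before);
}

// One-line summary of how the stores changed between two checkpoints.
std::string DescribeChange(const StoreSizes& before, const StoreSizes& after) {
  std::ostringstream os;
  os << std::showpos
     << "PVStore " << Change(before.pv, after.pv)
     << " LVStore " << Change(before.lv, after.lv)
     << " SolidStore " << Change(before.solid, after.solid);
  return os.str();
}
```

Comparing the snapshot taken after the first InitializeGeometry() (9, 9, 29) with the one taken at the start of the second UpdateGeometry() (9, 9, 33) gives "PVStore +0 LVStore +0 SolidStore +4", so the four solids are created at some point between the end of the first build and the start of the rebuild.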

Right after this printout is my call to ActivateRadioactiveDecay(), which calls RDM::SelectAllVolumes(). At that point, my job gets a segfault:

[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)
[libG4OpenGL.dylib] G4String::G4String(G4String const&) /Applications/GEANT4/geant4.10.07.p04/source/global/management/include/G4String.icc:50
[libG4OpenGL.dylib] G4String::G4String(G4String const&) /Applications/GEANT4/geant4.10.07.p04/source/global/management/include/G4String.icc:50
[/Applications/GEANT4/geant4.10.07.p04/lib/libG4OpenGL.dylib] G4String::G4String(G4String const&) .../G4String.icc:51
[libG4OpenGL.dylib] void std::__1::allocator<G4String>::construct[abi:nqe210106]<G4String, G4String const&>(G4String*, G4String const&) .../usr/include/c++/v1/__memory/allocator.h:154
[libG4OpenGL.dylib] void std::__1::allocator_traits<std::__1::allocator<G4String>>::construct[abi:nqe210106]<G4String, G4String const&, 0>(std::__1::allocator<G4String>&, G4String*, G4String const&) .../usr/include/c++/v1/__memory/allocator_traits.h:298
[libG4OpenGL.dylib] void std::__1::vector<G4String, std::__1::allocator<G4String>>::__emplace_back_assume_capacity[abi:nqe210106]<G4String const&>(G4String const&) .../usr/include/c++/v1/__vector/vector.h:474
[libG4OpenGL.dylib] void std::__1::vector<G4String, std::__1::allocator<G4String>>::emplace_back<G4String const&>(G4String const&) .../usr/include/c++/v1/__vector/vector.h:1148
[libG4processes.dylib] std::__1::vector<G4String, std::__1::allocator<G4String>>::push_back[abi:nqe210106](G4String const&) .../usr/include/c++/v1/__vector/vector.h:455
[libG4processes.dylib] G4RadioactiveDecay::SelectAllVolumes() .../G4RadioactiveDecay.cc:402

If I run a slightly different job, where there is no RadioactiveDecay in the physics list, then I get the messed up pointer in Transportation:

***** UpdateGeometry() debugging
 Sizes: PVStore 9 LVStore 9 SolidStore 33
 Tracking world PV 0xa8b3cb610 name World trans (0,0,0)
 1 worlds:
 Tracking world 0: 0xa8b3cb610 (World)
***** after InitializeGeometry()
 Sizes: PVStore 9 LVStore 9 SolidStore 29
 Tracking world PV 0xa8b3cb750 name World trans (0,0,0)
 1 worlds:
 Tracking world 0: 0xa8b3cb750 (World)
*****
Writing DMC results to iZIP5_0.1v_51260517_0001_DMC.txt

 *** Break *** segmentation violation
[/usr/lib/system/libsystem_platform.dylib] _sigtramp (no debug info)
[libG4geometry.dylib] G4LogicalVolume::GetSolid() const .../G4LogicalVolume.cc:405
[libG4geometry.dylib] G4TransportationManager::GetParallelWorld(G4String const&) .../G4TransportationManager.cc:185

What’s interesting here is that there isn’t a parallel world in the geometry, and I’ve confirmed that neither G4ParallelWorldProcess nor G4CoupledTransportation are in the output of /process/list.

Yep, that’s exactly what I implemented :frowning: Including the autos, which I hate (C++ is a strongly typed language, dammit!).

I’m going to tweak my testing above a little bit. I’ll turn parallel worlds back on (I hadn’t realized the test I was running turned them off), and I’m going to try using a different cavern geometry and a bigger detector build.

It is interesting, though, that there seems to be (see my long edit of my last post) a stale pointer or something even without parallel worlds, and with the Bugzilla #2714 fix patched into my 10.07.p04 build.

With parallel worlds, I’m getting a segfault from UpdateWorlds(). I’m going to add printing info directly from G4ParallelWorldStore, and I’ll report back here.

Hmmm. Adding printout in my ReportStores() function got rid of the UpdateWorlds() segfault, but it also brought back the GeomNav0002 complaint (which might be a good thing?). Anyway, here’s what I’m getting now:

Before the first run (i.e., the initial geometry build):

***** UpdateGeometry() debugging
 Sizes: PVStore 0 LVStore 0 SolidStore 0
 ParallelWorldProcessStore 0
 Tracking world PV 0
 1 worlds:
 Tracking world 0: 0
***** after InitializeGeometry()
 Sizes: PVStore 1472 LVStore 885 SolidStore 1632
 ParallelWorldProcessStore 0
 Tracking world PV 0x8b5033ed0 name World trans (0,0,0)
 2 worlds:
 Tracking world 0: 0x8b5033ed0 (World)
 Tracking world 1: 0x8b5b13d90 (Scorers)
*****
--- G4CoupledTransportation is used 

Notice that the PWProcessStore is still empty here. I guess it gets filled during physics setup? After the first run ends, we see the rebuild happen in the master thread:

***** UpdateGeometry() debugging
 Sizes: PVStore 1472 LVStore 885 SolidStore 1728
 ParallelWorldProcessStore 1
 proc 0x8b5ae1800 world Scorers
 Tracking world PV 0x8b5033ed0 name World trans (0,0,0)
 2 worlds:
 Tracking world 0: 0x8b5033ed0 (World)
 Tracking world 1: 0x8b5b13d90 (Scorers)
***** after InitializeGeometry()
 Sizes: PVStore 1472 LVStore 885 SolidStore 1632
 ParallelWorldProcessStore 1
 proc 0x8b5ae1800 world Scorers
 Tracking world PV 0x8b7ec99a0 name World trans (0,0,0)
 2 worlds:
 Tracking world 0: 0x8b7ec99a0 (World)
 Tracking world 1: 0x8b5762cb0 (Scorers)
*****

This second run starts, and then fails on the first worker thread with GeomNav0002.

I don’t see a public function in G4ParallelWorldProcess to allow us to retrieve the PV pointer.

Hello Michael,

I think there are a few separate points here.

  1. For a direct C++ call, the second argument should be true, as the Geant4 header comment says. So I would first try:

rm->ReinitializeGeometry(true);

instead of:

rm->ReinitializeGeometry(true, false);

With prop=true, the call goes through the Geant4 UI command machinery. In MT, this is the path used to pass broadcastable commands to workers.

  2. I think the extra:

rm->GeometryHasBeenModified();

is probably not needed, because ReinitializeGeometry() already does that.

  3. By design, I would expect ReinitializeGeometry() to be enough, and the next BeamOn() should initialize the geometry automatically.

  4. In practice, I understand why you call:

rm->InitializeGeometry();

immediately after it: you need the rebuilt geometry before the next run, for ActivateRadioactiveDecay() / SelectAllVolumes(). But this looks more like a workaround to me…

In my simple 10.07.p04 test, destroyFirst=true followed by InitializeGeometry() worked. In my current 11.4.1 MT test, destroyFirst=true still shows problems, even with prop=true. So fixing the second argument is the first thing I would correct, but it may not be the whole story…
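
Put together, the suggestions above amount to something like the following sketch. It is a template so that it compiles without Geant4; with the real toolkit the argument would be the G4RunManager instance. RebuildGeometry() and RecordingRunManager are hypothetical names used only for illustration, and the application-specific ActivateRadioactiveDecay() step is left out.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Revised rebuild sequence: prop=true for a direct C++ call, no separate
// GeometryHasBeenModified(), and an explicit InitializeGeometry() so that
// the new volumes exist before anything needs to look them up.
template <typename RunManager>
void RebuildGeometry(RunManager& rm) {
  rm.ReinitializeGeometry(/*destroyFirst=*/true, /*prop=*/true);
  rm.InitializeGeometry();
}

// Hypothetical stand-in that only records which calls were made.
struct RecordingRunManager {
  std::vector<std::string> calls;

  void ReinitializeGeometry(bool destroyFirst, bool prop) {
    calls.push_back(std::string("ReinitializeGeometry(") +
                    (destroyFirst ? "true" : "false") + "," +
                    (prop ? "true" : "false") + ")");
  }
  void InitializeGeometry() { calls.push_back("InitializeGeometry()"); }
};
```

The stand-in is only there to make the call order checkable; it says nothing about what the real run manager does internally in either release.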

I really appreciate your support on this, Dmitri! I have implemented both of the changes you suggested above.

Running a patched G4 10.07.p04 job, with mass world, parallel world, and electric field, it now runs through to completion without the GeomNav0002 error! I was able to apply three successive geometry changes, and got three valid output files, with no errors.

Running a G4 11.4.1 job (without the patch from Bugzilla #2714), I also see some unhappy behaviour. After the first run ends, I get complaints about

WARNING - Attempt to delete the assembly store while geometry closed !
WARNING - Attempt to delete the physical volume store while geometry closed !
WARNING - Attempt to delete the logical volume store while geometry closed !
WARNING - Attempt to delete the solid store while geometry closed !

and then one of our internal tools (CDMSGeometryTools, with thanks to Pedro Arce) complains that it finds two LVs with the same name, when there should only be one, and it aborts my job.

I need to pull in the Bugzilla #2714 fix for G4 v11 as well, until the next patches come out. But the complaints above are probably causing trouble even before the “stale pointer” issue.