Should Geant4 play nicely with Valgrind?

I’m trying to chase down an apparent memory leak in my G4 application, so I turned to the obvious candidate, Valgrind (Massif). I’m still using G4 10.7.4 with our experiment, and I have it built on two different systems:

  • CentOS 7 with GCC 12.2.0
  • macOS 10.15.7 (Catalina) with LLVM (Apple Clang 12.0.0)

In both cases, Valgrind generates its whiny “Illegal instruction” claim:

==6684== Your program just tried to execute an instruction that Valgrind
==6684== did not recognise.  There are two possible reasons for this.
==6684== 1. Your program has a bug and erroneously jumped to a non-code
==6684==    location.  If you are running Memcheck and you just saw a
==6684==    warning about a bad jump, it's probably your program's fault.
==6684== 2. The instruction is legitimate but Valgrind doesn't handle it,
==6684==    i.e. it's Valgrind's fault.  If you think this is the case or
==6684==    you are not sure, please let us know and we'll try to fix it.
==6684== Either way, Valgrind will now raise a SIGILL signal which will
==6684== probably kill your program.

I also tried this with examples/basic/B1, and with a third small standalone test that links against G4. All of them are giving the same complaint on both platforms.

Is this something others have seen? Does G4 just not work with Valgrind any more?

It normally works; we use it regularly on Linux. Make sure Valgrind itself is built with the same compiler you’re using for your application and, if needed, with support for multithreading enabled.

That’s what I was expecting; thanks, Gabriele! I wonder if it’s the multithreading that’s the issue. I’ll follow up with our cluster admins.

Ugh. Do you happen to know the build option for MT? The Valgrind documentation online doesn’t appear to have any information about how to build it, just how to use it, and even the README included in the distribution doesn’t have anything specific about configuration options.

In the latest version of Valgrind, the build options you may want are --enable-tls and, if you link with LTO, --enable-lto. But the most important thing is to make sure the tool is built with the same compiler you’re using to build your application.
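
As a rough sketch, the configure step with those options might look like the following. It assumes a standard autotools build from the top of an unpacked Valgrind source tree; the compiler names and install prefix are placeholders, not values from this thread. It only prints the command so you can check it first.

```shell
# Compilers used to build the Geant4 application.
# gcc/g++ are placeholder defaults; set CC/CXX to match your own toolchain.
CC_BIN="${CC:-gcc}"
CXX_BIN="${CXX:-g++}"

# Hypothetical install prefix; adjust to wherever you want Valgrind installed.
PREFIX_DIR="${PREFIX:-$HOME/valgrind-install}"

# Assemble the configure command with the two options mentioned above.
configure_cmd="./configure CC=$CC_BIN CXX=$CXX_BIN --prefix=$PREFIX_DIR --enable-tls --enable-lto"

# Dry run: print the command instead of executing it.
echo "$configure_cmd"
```

If the printed line looks right, run it from the top of the source tree, followed by the usual make and make install.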

Thank you for this suggestion, Gabriele! We definitely have Valgrind built with the same compiler (the module system used on our cluster enforces that by only letting us use packages from a given “toolchain”, with a single compiler, system library set, etc.).

I built Valgrind with those two options set, but I’m still getting the same “illegal instruction” (a.k.a. “instruction Valgrind did not recognise”) error. I tried Massif, Memcheck, and Helgrind (see https://www.it.uc3m.es/pbasanta/asng/course_notes/helgrind_tool_en.html), and all three fail the same way.
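
For anyone wanting to repeat the test, here is a sketch of the three invocations. It assumes the B1 example has been built as exampleB1 in the current directory with its run1.mac macro next to it; substitute your own executable and arguments. The loop only prints the commands rather than running them.

```shell
# Assumed location of the B1 example binary and its macro; adjust as needed.
app="./exampleB1"
macro="run1.mac"

# Collect one command line per Valgrind tool, selected with --tool.
cmds=""
for tool in massif memcheck helgrind; do
    cmd="valgrind --tool=$tool $app $macro"
    echo "$cmd"   # dry run: show the command that would be executed
    cmds="$cmds$cmd
"
done
```

Each printed line can be pasted into the shell to run that tool on its own.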

I won’t bother you with the details. It sounds to me like Valgrind should work just fine with Geant4, so this must be something in our cluster environment.