Geant4 to Parquet

Does anybody know whether Geant4 supports writing to Parquet, or whether any project has already developed a general serializer for this data format?


Writing and reading Parquet files can be done with the Apache Arrow C++ implementation, which is quite powerful and fast. I have done it myself for several projects, as an alternative to ROOT and HepMC event files and for output from Geant4 simulations as well.

Bear in mind that Apache Arrow returns Result (or Status) objects from many operations instead of throwing exceptions, so, while not impossible, it can be hard to integrate with Geant4 without a fair amount of boilerplate. One useful tip is to have a single, clear entry point for Arrow-related operations that handles the Results, and to use the Arrow macros (e.g. ARROW_RETURN_NOT_OK, ARROW_ASSIGN_OR_RAISE) everywhere else.

Once you have a Table or Array, you can write it to a Parquet file by following the Arrow documentation.

An important aspect is that Parquet (like the Arrow memory layout) is a columnar format, whereas most Geant4 use cases either consume or produce data row by row (event by event), so be careful not to harm performance by going this route. You could accumulate data in standard containers, e.g. std::vector or std::map, and convert them to Arrow later, or use the ArrayBuilder API.
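The accumulate-then-convert idea can be sketched with the standard library alone. HitRow and HitColumns are hypothetical names, and the fields are only an example of what a sensitive detector or stepping action might record. Rows arrive one at a time, but each field goes into its own contiguous vector, so the later conversion to Arrow is one bulk append per column.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-hit record, produced row by row during event processing.
struct HitRow {
  std::int32_t event_id;
  double edep;  // energy deposit; units are up to the application
  double x, y, z;
};

// Column-wise accumulator: one contiguous vector per field.
struct HitColumns {
  std::vector<std::int32_t> event_id;
  std::vector<double> edep, x, y, z;

  // Pre-allocate to avoid repeated reallocation inside the event loop.
  void Reserve(std::size_t n) {
    event_id.reserve(n);
    edep.reserve(n);
    x.reserve(n);
    y.reserve(n);
    z.reserve(n);
  }

  // Split one row across the columns.
  void Append(const HitRow& row) {
    event_id.push_back(row.event_id);
    edep.push_back(row.edep);
    x.push_back(row.x);
    y.push_back(row.y);
    z.push_back(row.z);
  }

  // Number of accumulated rows; all columns must stay the same length.
  std::size_t size() const {
    assert(edep.size() == event_id.size() && x.size() == event_id.size() &&
           y.size() == event_id.size() && z.size() == event_id.size());
    return event_id.size();
  }

  // Empty the columns after a flush; capacity is kept for reuse.
  void Clear() {
    event_id.clear();
    edep.clear();
    x.clear();
    y.clear();
    z.clear();
  }
};
```

Each vector can later be handed to the matching Arrow builder in a single call (e.g. AppendValues on a DoubleBuilder), after which Clear() makes the buffers ready for the next batch. Flushing every N events rather than once per event keeps the row-to-column conversion cost low.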
