| 1 | <html> |
---|
| 2 | <head> |
---|
| 3 | <title>Parallel Geant4 (ParGeant4)</title> |
---|
| 4 | </head> |
---|
| 5 | |
---|
| 6 | <body> |
---|
| 7 | <table WIDTH="100%" > |
---|
| 8 | <tr> |
---|
| 9 | <TD><a href="../../../../Overview/html/index.html"> |
---|
| 10 | |
---|
| 11 | </a><a href="index.html"> |
---|
| 12 | <img SRC="../../../../resources/html/IconsGIF/Contents.gif" ALT="Contents" height=16 width=59></a> |
---|
| 13 | <a href="ExtendedCodes.html"> |
---|
| 14 | <img SRC="../../../../resources/html/IconsGIF/Previous.gif" ALT="Previous" height=16 width=59></a> |
---|
| 15 | <img SRC="../../../../resources/html/IconsGIF/NextGR.gif" ALT="Next" height=16 width=59></td> |
---|
| 16 | |
---|
| 17 | <td ALIGN=RIGHT><b><font color="#238E23"><font size=-1>Geant4 User's Guide</font></font></b> |
---|
| 18 | <br><b><font color="#238E23"><font size=-1>For Application Developers</font></font></b> |
---|
| 19 | <br><b><font color="#238E23"><font size=-1>Examples</font></font></b></td> |
---|
| 20 | </tr> |
---|
| 21 | </table> |
---|
| 22 | |
---|
| 23 | <center> |
---|
| 24 | <p><b><font color="#238E23"><font size=+3>Parallel Geant4 (ParGeant4)</font></font></b> |
---|
| 25 | <address>by Gene Cooperman (gene@ccs.neu.edu)<br> |
---|
| 26 | and Viet Ha Nguyen (vietha@ccs.neu.edu) |
---|
| 27 | </address> |
---|
| 28 | </center> |
---|
| 29 | <p> |
---|
| 30 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 31 | <h2>What is ParGeant4?</h2> |
---|
| 32 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/pargeant4.html">ParGeant4</a> |
---|
| 33 | is a parallel version of Geant4 that implements event-level |
---|
| 34 | parallelism to simulate separate events on remote processors. |
---|
| 35 | Typical simulations demonstrate a nearly linear speedup in running |
---|
| 36 | time as the number of remote processors increases. The needed |
---|
| 37 | enhancements of Geant4 are included in the |
---|
| 38 | examples/extended/parallel directory of the Geant4 distribution. |
---|
| 39 | </p> |
---|
| 40 | <p> |
---|
| 41 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 42 | <h2>Why is ParGeant4 useful?</h2> |
---|
| 43 | |
---|
| 44 | When doing a large Geant4 simulation, one often wishes to run |
---|
| 45 | on many processors to reduce the overall time. Traditionally, |
---|
| 46 | this has been done by splitting the events into multiple groups, |
---|
| 47 | and running Geant4 independently on each processor for its own |
---|
| 48 | group of events. This requires restarting a run if a processor |
---|
| 49 | goes down. It also requires saving the histogram files from |
---|
| 50 | each run, and merging the files prior to using the analysis tool. |
---|
| 51 | The human effort in this is considerable. |
---|
| 52 | <p> |
---|
| 53 | ParGeant4 provides a much simpler mechanism. After setting up |
---|
| 54 | ParGeant4, one builds and runs the sequential Geant4 application |
---|
| 55 | exactly as before, except that it is additionally linked with some parallel libraries. |
---|
| 56 | Upon execution, ParGeant4 on the console sends out events to |
---|
| 57 | slave processes, collects all hits, and calls any analysis tool -- |
---|
| 58 | exactly as one would do in the sequential case. |
---|
| 59 | </p><p> |
---|
| 60 | There is no need to split events into separate groups, track |
---|
| 61 | whether one of the processors crashed, merge histogram files, etc. |
---|
| 62 | If a slave processor crashes, ParGeant4 automatically re-sends the |
---|
| 63 | events of that slave processor to a new slave processor for re-execution. |
---|
| 64 | </p> |
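<p>The re-sending behaviour described above can be pictured with a small simulation. This is only a sketch of the idea, not ParGeant4's or TOP-C's actual scheduler: the function <code>run_events</code>, the worker names, and the <code>crashes_after</code> table are hypothetical, and a crashed worker's event is simply handed to one of the surviving workers.</p>

```python
from collections import deque


def run_events(event_ids, workers, crashes_after):
    """Hand out events round-robin and re-queue any event lost to a crash.

    crashes_after is a hypothetical table: worker name -> number of events
    that worker completes before it crashes (absent means it never crashes).
    Returns a dict mapping each event id to the worker that completed it.
    """
    pending = deque(event_ids)
    alive = list(workers)
    completed = {worker: 0 for worker in workers}
    done_by = {}

    while pending:
        if not alive:
            raise RuntimeError("no workers left to run the remaining events")
        for worker in list(alive):
            if not pending:
                break
            event = pending.popleft()
            limit = crashes_after.get(worker)
            if limit is not None and completed[worker] >= limit:
                # The worker crashed while holding this event:
                # drop the worker and put the event back in the queue.
                alive.remove(worker)
                pending.append(event)
                continue
            completed[worker] += 1
            done_by[event] = worker
    return done_by


result = run_events(range(6), ["w1", "w2"], {"w2": 1})
print(sorted(result))   # every event id appears exactly once
print(result[3])        # event 3 was lost by w2 and re-run elsewhere
```

<p>In this toy run, every event is still completed even though one worker disappears partway through, which is the property the text describes.</p>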
---|
| 65 | <p> |
---|
| 66 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 67 | <h2>What is the performance of ParGeant4?</h2> |
---|
| 68 | |
---|
| 69 | As a rule of thumb, speedup will be nearly linear when each event |
---|
| 70 | simulation lasts for at least several milliseconds. |
---|
| 71 | ParGeant4 has been tested extensively on parallelizations of |
---|
| 72 | examples/novice/N02 and of examples/advanced/underground_physics . |
---|
| 73 | On N02, we see a speedup of 27 for 50 nodes and a speedup of |
---|
| 74 | 33 for 100 nodes. When using the <code>--TOPC-aggregated-tasks=50</code> |
---|
| 75 | option (see below) |
---|
| 76 | the speedup improves to 35 for 50 nodes and 60 for 100 nodes. |
---|
| 77 | <p> |
---|
| 78 | In tests of underground_physics, events are longer and we see nearly |
---|
| 79 | linear speedup (94 times speedup with 100 nodes). |
---|
| 80 | </p> |
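<p>One way to read these figures is as parallel efficiency, that is, speedup divided by the number of nodes. The short script below only redoes that arithmetic on the numbers quoted above; the labels and the helper name <code>efficiency</code> are ours.</p>

```python
# Speedup figures quoted in the text: (configuration, nodes, speedup).
MEASUREMENTS = [
    ("N02, one event per message", 50, 27),
    ("N02, one event per message", 100, 33),
    ("N02, 50 aggregated tasks", 50, 35),
    ("N02, 50 aggregated tasks", 100, 60),
    ("underground_physics", 100, 94),
]


def efficiency(speedup, nodes):
    """Parallel efficiency: the fraction of ideal linear speedup achieved."""
    return speedup / nodes


for label, nodes, speedup in MEASUREMENTS:
    eff = efficiency(speedup, nodes)
    print(f"{label:28s} {nodes:4d} nodes  speedup {speedup:3d}  efficiency {eff:.0%}")
```

<p>For N02 on 100 nodes, efficiency rises from 33% to 60% with aggregated tasks, while underground_physics reaches 94% because its events are longer.</p>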
---|
| 81 | <p> |
---|
| 82 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 83 | <h2>Getting started </h2> |
---|
| 84 | |
---|
| 85 | Detailed information is under extended/parallel/ParN02/docs/000README. |
---|
| 86 | There are four steps: |
---|
| 87 | <ol> |
---|
| 88 | <li> Install <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc.html">TOP-C</a>.</li> |
---|
| 89 | <li> Compile ParN02 by running <TT>gmake</TT>.</li> |
---|
| 90 | <li> Make sure the "procgroup" file is correct and copy it to |
---|
| 91 | the directory of the executable binary file (for example, |
---|
| 92 | <TT>$G4BIN/Linux-g++</TT>).</li> |
---|
| 93 | <li> Run the parallel binary program.</li> |
---|
| 94 | </ol> |
---|
| 95 | </p> |
---|
| 96 | <p> |
---|
| 97 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 98 | <h2>What is involved in setting up ParGeant4?</h2> |
---|
| 99 | |
---|
| 100 | To set up ParGeant4, one needs |
---|
| 101 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc.html">TOP-C</a> and |
---|
| 102 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/marshalgen.html">Marshalgen</a> |
---|
| 103 | (free, open source software). |
---|
| 104 | If one is parallelizing a new Geant4 application, one must then add/modify |
---|
| 105 | approximately 20 lines of annotations (C++ comments |
---|
| 106 | to indicate shallow vs. deep copying of pointers, etc.) in the .h files |
---|
| 107 | for each hit type being defined |
---|
| 108 | by the application. For details of the annotations, refer to the manual of the Marshalgen package.<br> |
---|
| 109 | |
---|
| 110 | Finally, in the main routine of the application, |
---|
| 111 | one replaces the call to the <TT>G4RunManager</TT> constructor by a call |
---|
| 112 | to the <TT>ParRunManager</TT> constructor. (<TT>ParRunManager</TT> is a derived class |
---|
| 113 | of <TT>G4RunManager</TT>.)</p> |
---|
| 114 | <p> |
---|
| 115 | After this, one invokes the already provided |
---|
| 116 | GNUMakefile (a slightly modified version of the Geant4 example GNUMakefile) |
---|
| 117 | to create the parallel application. Finally, one writes a |
---|
| 118 | "procgroup" file, which declares the names of the remote hosts |
---|
| 119 | to use in the parallel computation. Optionally, one may also |
---|
| 120 | specify filenames (e.g. slave1.out, slave2.out, ...) to store |
---|
| 121 | the printout from each slave process. One then calls the |
---|
| 122 | ParGeant4 binary exactly as one would call the Geant4 binary, |
---|
| 123 | and the results appear as normal, only faster. |
---|
| 124 | </p> |
---|
| 125 | <p> |
---|
| 126 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 127 | <h2>Are there examples of using ParGeant4?</h2> |
---|
| 128 | Yes. ParGeant4 includes parallelizations of |
---|
| 129 | several examples from the Geant4 distribution. Specifically, ParGeant4 |
---|
| 130 | includes example parallelizations of novice/N02, novice/N04, and |
---|
| 131 | advanced/underground_physics . |
---|
| 132 | </p> |
---|
| 133 | <p> |
---|
| 134 | <hr ALIGN="Center" SIZE="7%"> |
---|
| 135 | <h2>What are some of the features of ParGeant4?</h2> |
---|
| 136 | ParGeant4 includes all of the features of TOP-C. In particular, |
---|
| 137 | after building a binary, "parMySimulation", |
---|
| 138 | one might call:<p> |
---|
| 139 | </p><dl><dt> |
---|
| 140 | ./parMySimulation --TOPC-help |
---|
| 141 | </dt><dd> |
---|
| 142 | Display command options, and then exit. |
---|
| 143 | <p> |
---|
| 144 | </p></dd><dt> |
---|
| 145 | ./parMySimulation --TOPC-trace=0 &lt;Geant4 arguments&gt; |
---|
| 146 | </dt><dd> |
---|
| 147 | By default, ParGeant4 traces each time a new event is sent to |
---|
| 148 | a task. This turns it off. |
---|
| 149 | <p> |
---|
| 150 | </p></dd><dt> |
---|
| 151 | ./parMySimulation --TOPC-verbose=0 &lt;Geant4 arguments&gt; |
---|
| 152 | </dt><dd> |
---|
| 153 | By default, ParGeant4 provides statistics indicating what |
---|
| 154 | process was run, when it was run, what machine, the running |
---|
| 155 | times and elapsed times of master and the average slave, and |
---|
| 156 | other information. This turns off the statistics. |
---|
| 157 | <p> |
---|
| 158 | </p></dd><dt> |
---|
| 159 | ./parMySimulation --TOPC-aggregated-tasks=10 &lt;Geant4 arguments&gt; |
---|
| 160 | </dt><dd> |
---|
| 161 | By default, ParGeant4 sends one event to one remote process before |
---|
| 162 | turning to the next process. This option sends 10 events |
---|
| 163 | to a single remote process in one message. This is useful |
---|
| 164 | when events are relatively short, and the network latency |
---|
| 165 | of sending a message is starting to dominate the running time. |
---|
| 166 | <p> |
---|
| 167 | </p></dd><dt> |
---|
| 168 | ./parMySimulation --TOPC-slave-timeout=7200 &lt;Geant4 arguments&gt; |
---|
| 169 | </dt><dd> |
---|
| 170 | By default, if a remote process has not communicated with the |
---|
| 171 | master after 1800 seconds (a half hour), the slave process will |
---|
| 172 | kill itself. This prevents runaway processes that may be in an |
---|
| 173 | infinite loop for some event, or may have lost their socket to |
---|
| 174 | communicate with the master process. In this example, we allow |
---|
| 175 | 7200 seconds (two hours) because we expect simulation of some |
---|
| 176 | events to last up to (but not more than) 7200 seconds. |
---|
| 177 | </dd></dl> |
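<p>To see why aggregating tasks helps when events are short, consider a deliberately crude cost model: every message costs one network latency plus the compute time of the events it carries, and messages are spread evenly over the workers. The function name and all numbers below are assumptions chosen for illustration; they are not measurements of ParGeant4.</p>

```python
import math


def estimated_run_time(n_events, event_time_s, latency_s, workers, batch_size=1):
    """Rough run-time estimate under the simple model described above.

    Each message carries batch_size events and costs one latency plus the
    compute time of a full batch; master-side overhead and any overlap of
    communication with computation are ignored.
    """
    n_messages = math.ceil(n_events / batch_size)
    messages_per_worker = math.ceil(n_messages / workers)
    return messages_per_worker * (latency_s + batch_size * event_time_s)


# Hypothetical workload: 10000 events of 1 ms each, 5 ms latency, 50 workers.
sequential = 10000 * 0.001
for batch in (1, 10, 50):
    t = estimated_run_time(10000, 0.001, 0.005, 50, batch_size=batch)
    print(f"batch size {batch:3d}: about {t:.2f} s, speedup about {sequential / t:.1f}")
```

<p>Under these assumed numbers, the latency term dominates when each message carries a single event, and it is amortized as the batch grows, which matches the qualitative behaviour described for short events.</p>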
---|
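<p>The timeout rule can be sketched as an idle watchdog. This is an illustration only, not TOP-C's implementation: the class <code>IdleWatchdog</code> and its methods are hypothetical, and the sketch merely reports that the limit has been exceeded, whereas the real slave process terminates itself.</p>

```python
import time


class IdleWatchdog:
    """Tracks how long a worker has gone without hearing from the master."""

    def __init__(self, timeout_s=1800.0, clock=time.monotonic):
        self.timeout_s = timeout_s
        self._clock = clock
        self._last_contact = clock()

    def touch(self):
        """Record that a message from the master has just arrived."""
        self._last_contact = self._clock()

    def expired(self):
        """Return True once the silence has lasted longer than the timeout."""
        return self._clock() - self._last_contact > self.timeout_s


# A fake clock lets the example run instantly instead of waiting for hours.
now = [0.0]
watchdog = IdleWatchdog(timeout_s=7200.0, clock=lambda: now[0])

now[0] = 3600.0
print(watchdog.expired())   # False: one hour of silence is within the limit

now[0] = 7201.0
print(watchdog.expired())   # True: more than two hours have passed
```

<p>Raising the timeout, as in the two-hour example above, keeps a legitimately long event from being mistaken for a runaway process.</p>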
| 178 | <p> |
---|
| 179 | By default, ParGeant4 uses its own subset implementation of MPI (MPINU). |
---|
| 180 | ParGeant4 adds approximately 50 KB to the "footprint" of the binary |
---|
| 181 | executable. By default, ParGeant4 uses "ssh" to set up remote processes. |
---|
| 182 | Those who wish to use their own MPI (perhaps if a batch cluster requires |
---|
| 183 | a specific MPI) may do so. (See "Configuring a Different MPI" |
---|
| 184 | in the <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc/topc_toc.html"> |
---|
| 185 | TOP-C manual</a>.) |
---|
| 186 | |
---|
| 187 | </p> |
---|
| 188 | <hr><i><a href="../../../../Authors/html/subjectsToAuthors.html">About |
---|
| 189 | the authors</a></i> |
---|
| 190 | </body> |
---|
| 191 | </html> |
---|