1 | <html> |
---|
2 | <head> |
---|
3 | <title>Parallel Geant4 (ParGeant4)</title> |
---|
4 | </head> |
---|
5 | |
---|
6 | <body> |
---|
7 | <table WIDTH="100%" > |
---|
8 | <tr> |
---|
9 | <TD><a href="../../../../Overview/html/index.html"> |
---|
10 | |
---|
11 | </a><a href="index.html"> |
---|
12 | <img SRC="../../../../resources/html/IconsGIF/Contents.gif" ALT="Contents" height=16 width=59></a> |
---|
13 | <a href="ExtendedCodes.html"> |
---|
14 | <img SRC="../../../../resources/html/IconsGIF/Previous.gif" ALT="Previous" height=16 width=59></a> |
---|
15 | <img SRC="../../../../resources/html/IconsGIF/NextGR.gif" ALT="Next" height=16 width=59></td> |
---|
16 | |
---|
17 | <td ALIGN=RIGHT><b><font color="#238E23"><font size=-1>Geant4 User's Guide</font></font></b> |
---|
18 | <br><b><font color="#238E23"><font size=-1>For Application Developers</font></font></b> |
---|
19 | <br><b><font color="#238E23"><font size=-1>Examples</font></font></b></td> |
---|
20 | </tr> |
---|
21 | </table> |
---|
22 | |
---|
23 | <center> |
---|
24 | <p><b><font color="#238E23"><font size=+3>Parallel Geant4 (ParGeant4)</font></font></b> |
---|
25 | <address>by Gene Cooperman (gene@ccs.neu.edu)<br> |
---|
26 | and Viet Ha Nguyen (vietha@ccs.neu.edu) |
---|
27 | </address> |
---|
28 | </center> |
---|
29 | <p> |
---|
30 | <hr ALIGN="Center" SIZE="7%"> |
---|
31 | <h2>What is ParGeant4?</h2> |
---|
32 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/pargeant4.html">ParGeant4</a> |
---|
33 | is a parallel version of Geant4 that implements event-level |
---|
34 | parallelism to simulate separate events on remote processors. |
---|
35 | Typical simulations demonstrate a nearly linear speedup in running |
---|
36 | time as the number of remote processors increases. The needed |
---|
37 | enhancements of Geant4 are included in the |
---|
38 | examples/extended/parallel directory of the Geant4 distribution. |
---|
39 | </p> |
---|
40 | <p> |
---|
41 | <hr ALIGN="Center" SIZE="7%"> |
---|
42 | <h2>Why is ParGeant4 useful?</h2> |
---|
43 | |
---|
44 | When doing a large Geant4 simulation, one often wishes to run |
---|
45 | on many processors to reduce the overall time. Traditionally, |
---|
46 | this has been done by splitting the events into multiple groups, |
---|
47 | and running Geant4 independently on each processor for its own |
---|
48 | group of events. This requires restarting a run if a processor |
---|
49 | goes down. It also requires saving the histogram files from |
---|
50 | each run, and merging the files prior to using the analysis tool. |
---|
51 | The human effort in this is considerable. |
---|
52 | <p> |
---|
53 | ParGeant4 provides a much simpler mechanism. After setting up |
---|
54 | ParGeant4, one builds and runs the sequential Geant4 application |
---|
55 | exactly as before, except that it is also linked with the parallel libraries. |
---|
56 | Upon execution, ParGeant4 on the console sends out events to |
---|
57 | slave processes, collects all hits, and calls any analysis tool -- |
---|
58 | exactly as one would do in the sequential case. |
---|
59 | </p><p> |
---|
60 | There is no need to split events into separate groups, track |
---|
61 | whether one of the processors crashed, merge histogram files, etc. |
---|
62 | If a slave processor crashes, ParGeant4 automatically re-sends the |
---|
63 | events of that slave processor to a new slave processor for re-execution. |
---|
64 | </p> |
---|
65 | <p> |
---|
66 | <hr ALIGN="Center" SIZE="7%"> |
---|
67 | <h2>What is the performance of ParGeant4?</h2> |
---|
68 | |
---|
69 | As a rule of thumb, speedup will be nearly linear when each event |
---|
70 | simulation lasts for at least several milliseconds. |
---|
71 | ParGeant4 has been tested extensively on parallelizations of |
---|
72 | examples/novice/N02 and of examples/advanced/underground_physics. |
---|
73 | On N02, we see a speedup of 27 for 50 nodes and a speedup of |
---|
74 | 33 for 100 nodes. When using the <code>--TOPC-aggregated-tasks=50</code> |
---|
75 | option (see below) |
---|
76 | the speedup improves to 35 for 50 nodes and 60 for 100 nodes. |
---|
77 | <p> |
---|
78 | In tests of underground_physics, events are longer and we see nearly |
---|
79 | linear speedup (94 times speedup with 100 nodes). |
---|
80 | </p> |
---|
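<p>
The speedups quoted above can be restated as parallel efficiency (speedup
divided by the number of nodes). The following is only a small arithmetic
sketch; the function name <code>parallelEfficiency</code> is hypothetical and
not part of ParGeant4 or TOP-C. With the figures above, N02 gives 27/50 = 0.54
and 33/100 = 0.33 without aggregation, 35/50 = 0.70 and 60/100 = 0.60 with
50 aggregated tasks, and underground_physics gives 94/100 = 0.94.
</p>

```cpp
#include <cassert>

// Hypothetical helper (not part of ParGeant4 or TOP-C):
// parallel efficiency = measured speedup / number of nodes.
// A value of 1.0 corresponds to perfectly linear speedup.
double parallelEfficiency(double speedup, int nodes) {
    assert(nodes > 0);  // efficiency is undefined for zero nodes
    return speedup / static_cast<double>(nodes);
}
```

<p>
For example, <code>parallelEfficiency(94.0, 100)</code> evaluates the
underground_physics result, which is what "nearly linear" means here.
</p>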
81 | <p> |
---|
82 | <hr ALIGN="Center" SIZE="7%"> |
---|
83 | <h2>Getting started </h2> |
---|
84 | |
---|
85 | Detailed information is under extended/parallel/ParN02/docs/000README. |
---|
86 | There are four steps: |
---|
87 | <ol> |
---|
88 | <li> Install <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc.html">TOP-C</a>.</li> |
---|
89 | <li> Compile ParN02 by running <TT>gmake</TT>.</li> |
---|
90 | <li> Make sure the "procgroup" file is correct and copy it to |
---|
91 | the directory of the executable binary file (for example, |
---|
92 | <TT>$G4BIN/Linux-g++</TT>).</li> |
---|
93 | <li> Run the parallel binary program.</li> |
---|
94 | </ol> |
---|
95 | </p> |
---|
96 | <p> |
---|
97 | <hr ALIGN="Center" SIZE="7%"> |
---|
98 | <h2>What is involved in setting up ParGeant4?</h2> |
---|
99 | |
---|
100 | To set up ParGeant4, one needs |
---|
101 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc.html">TOP-C</a> and |
---|
102 | <a target="ext" href="http://www.ccs.neu.edu/home/gene/marshalgen.html">Marshalgen</a> |
---|
103 | (free, open source software). |
---|
104 | If one is parallelizing a new Geant4 application, one must then add/modify |
---|
105 | approximately 20 lines of annotations (C++ comments |
---|
106 | to indicate shallow vs. deep copying of pointers, etc.) in the .h files |
---|
107 | for each hit type being defined |
---|
108 | by the application. For details of the annotations, refer to the manual of the Marshalgen package.<br> |
---|
109 | |
---|
110 | Finally, in the main routine of the application, |
---|
111 | one replaces the call to the <TT>G4RunManager</TT> constructor by a call |
---|
112 | to the <TT>ParRunManager</TT> constructor. (<TT>ParRunManager</TT> is a derived class |
---|
113 | of <TT>G4RunManager</TT>.)</p> |
---|
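<p>
The constructor swap described above works because the application only uses
the base-class interface. The sketch below illustrates the idea with
stand-in classes so that it compiles without Geant4 or TOP-C. Everything in the
<code>sketch</code> namespace is an assumption made for illustration: the real
<TT>G4RunManager</TT> and <TT>ParRunManager</TT> have different, much richer
interfaces, and <code>runApplication</code> is a hypothetical helper.
</p>

```cpp
#include <cassert>
#include <string>

namespace sketch {

// Stand-in for the sequential run manager (NOT the real Geant4 class).
class G4RunManager {
public:
    virtual ~G4RunManager() = default;
    // Simplified: returns a description instead of running a simulation.
    virtual std::string BeamOn(int nEvents) {
        return "sequential run of " + std::to_string(nEvents) + " events";
    }
};

// Stand-in for the parallel run manager, derived from the sequential one.
class ParRunManager : public G4RunManager {
public:
    std::string BeamOn(int nEvents) override {
        return "parallel run of " + std::to_string(nEvents) + " events";
    }
};

// Application code depends only on the base-class interface, so it is
// unchanged whichever run manager was constructed in the main routine.
std::string runApplication(G4RunManager& runManager, int nEvents) {
    assert(nEvents >= 0);
    return runManager.BeamOn(nEvents);
}

}  // namespace sketch
```

<p>
In a real application the only edit in the main routine would be the
constructor call itself; the rest of the code keeps using a
<TT>G4RunManager</TT> pointer or reference.
</p>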
114 | <p> |
---|
115 | After this, one invokes the already provided |
---|
116 | GNUMakefile (a slightly modified version of the Geant4 example GNUMakefile) |
---|
117 | to create the parallel application. Finally, one writes a |
---|
118 | "procgroup" file, which declares the names of the remote hosts |
---|
119 | to use in the parallel computation. Optionally, one may also |
---|
120 | specify filenames (e.g. slave1.out, slave2.out, ...) to store |
---|
121 | the printout from each slave process. One then calls the |
---|
122 | ParGeant4 binary exactly as one would call the Geant4 binary, |
---|
123 | and the results appear as normal, only faster. |
---|
124 | </p> |
---|
125 | <p> |
---|
126 | <hr ALIGN="Center" SIZE="7%"> |
---|
127 | <h2>Are there examples of using ParGeant4?</h2> |
---|
128 | Yes. ParGeant4 includes parallelizations of |
---|
129 | other examples from the Geant4 distribution. Specifically, ParGeant4 |
---|
130 | includes example parallelizations of novice/N02, novice/N04, and |
---|
131 | advanced/underground_physics. |
---|
132 | </p> |
---|
133 | <p> |
---|
134 | <hr ALIGN="Center" SIZE="7%"> |
---|
135 | <h2>What are some of the features of ParGeant4?</h2> |
---|
136 | ParGeant4 includes all of the features of TOP-C. In particular, |
---|
137 | after building a binary, "parMySimulation", |
---|
138 | one might call:<p> |
---|
139 | </p><dl><dt> |
---|
140 | ./parMySimulation --TOPC-help |
---|
141 | </dt><dd> |
---|
142 | Display command options, and then exit. |
---|
143 | <p> |
---|
144 | </p></dd><dt> |
---|
145 | ./parMySimulation --TOPC-trace=0 &lt;Geant4 arguments&gt; |
---|
146 | </dt><dd> |
---|
147 | By default, ParGeant4 traces each time a new event is sent to |
---|
148 | a task. Setting the option to 0 turns this tracing off. |
---|
149 | <p> |
---|
150 | </p></dd><dt> |
---|
151 | ./parMySimulation --TOPC-verbose=0 &lt;Geant4 arguments&gt; |
---|
152 | </dt><dd> |
---|
153 | By default, ParGeant4 provides statistics indicating what |
---|
154 | process was run, when it was run, what machine, the running |
---|
155 | times and elapsed times of master and the average slave, and |
---|
156 | other information. This turns off the statistics. |
---|
157 | <p> |
---|
158 | </p></dd><dt> |
---|
159 | ./parMySimulation --TOPC-aggregated-tasks=10 &lt;Geant4 arguments&gt; |
---|
160 | </dt><dd> |
---|
161 | By default, ParGeant4 sends one event to one remote process before |
---|
162 | turning to the next process. This option sends 10 events |
---|
163 | to a single remote process in one message. This is useful |
---|
164 | when events are relatively short, and the network latency |
---|
165 | of sending a message is starting to dominate the running time. |
---|
166 | <p> |
---|
167 | </p></dd><dt> |
---|
168 | ./parMySimulation --TOPC-slave-timeout=7200 &lt;Geant4 arguments&gt; |
---|
169 | </dt><dd> |
---|
170 | By default, if a remote process has not communicated with the |
---|
171 | master after 1800 seconds (a half hour), the slave process will |
---|
172 | kill itself. This prevents runaway processes that may be in an |
---|
173 | infinite loop for some event, or may have lost their socket to |
---|
174 | communicate with the master process. In this example, we allow |
---|
175 | 7200 seconds (two hours) because we expect simulation of some |
---|
176 | events to last up to (but not more than) 7200 seconds. |
---|
177 | </dd></dl> |
---|
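<p>
To see why aggregating tasks helps with short events, it can be useful to
count how many task messages the master has to send. The sketch below is a
simplified model that assumes one message per batch of events and ignores
replies and retries; <code>taskMessages</code> is a hypothetical helper, not a
TOP-C function.
</p>

```cpp
#include <cassert>

// Hypothetical helper (not part of TOP-C): number of task messages the
// master sends when `aggregatedTasks` events are packed into each message.
// Assumes one message per batch; the last batch may be only partly full.
long taskMessages(long events, long aggregatedTasks) {
    assert(events >= 0);
    assert(aggregatedTasks > 0);
    // Integer ceiling of events / aggregatedTasks.
    return (events + aggregatedTasks - 1) / aggregatedTasks;
}
```

<p>
Under this model, 1000 events need 1000 messages by default but only 100 with
an aggregation of 10, so the per-message network latency is paid a tenth as often.
</p>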
178 | <p> |
---|
179 | By default, ParGeant4 uses its own subset implementation of MPI (MPINU). |
---|
180 | ParGeant4 adds approximately 50 KB to the "footprint" of the binary |
---|
181 | executable. By default, ParGeant4 uses "ssh" to set up remote processes. |
---|
182 | Those who wish to use their own MPI (perhaps if a batch cluster requires |
---|
183 | a specific MPI) may do so. (See "Configuring a Different MPI" |
---|
184 | in the <a target="ext" href="http://www.ccs.neu.edu/home/gene/topc/topc_toc.html"> |
---|
185 | TOP-C manual</a>.) |
---|
186 | |
---|
187 | </p> |
---|
188 | <hr><i><a href="../../../../Authors/html/subjectsToAuthors.html">About |
---|
189 | the authors</a></i> |
---|
190 | </body> |
---|
191 | </html> |
---|