  ParGeant4:  Geant4/TOP-C, a parallelization of Geant4
              (event-level parallelism)

          Gene Cooperman
          Northeastern University
          gene@ccs.neu.edu

For the latest information on ParGeant4, see:
  http://www.ccs.neu.edu/home/gene/pargeant4.html
Note that a version now exists that runs Geant4 over the Grid.
Please write to gene@ccs.neu.edu for further information.
To port other applications to a parallel version, read the
  files ../../info/PAR_INSTALL and ../../info/PAR_README.

See the beginning of GNUmakefile for reasonable `make' targets to run it.
To run it:
0.  a. Follow the standard Geant4 installation procedure.
    b. Download and install TOP-C
       The TOP-C home page is at http://www.ccs.neu.edu/home/gene/topc.html
        cd <TOPC_INSTALL_DIR>
        gzip -dc topc.tar.gz | tar -xvf -
        cd topc
        ./configure
        make
        make check
        [ Copy bin/topc-config to your path ]
    c. Verify that the Geant4 example installs:
        cd $G4INSTALL/examples/extended/parallel/ParN02
        make
        $G4WORKDIR/bin/$G4SYSTEM/ParN02 ParN02.in
2.  make run
    [ By default, the included `procgroup' file creates two slave processes
      on localhost. ]
    [ Note that in addition to output on master,
      $G4WORKDIR/bin/$G4SYSTEM/slave*.out contains slave output. ]
    [ To remove intermediate files and start over:  make parclean  ]
3.  Try running it with slave processes on remote hosts.
    First, test that your local environment is set up correctly.
    Try:
    ssh <REMOTE_HOSTNAME> $G4WORKDIR/bin/$G4SYSTEM/ParN02 `pwd`/ParN02.in
    The above command needs to work without asking for a password.
    [ If you use dynamic libraries (*.so), make sure the LD_LIBRARY_PATH
      in your shell startup file (e.g. ~/.tcshrc) includes both:
      $G4INSTALL/lib/$G4SYSTEM and $CLHEP_BASE_DIR/lib
      If you use AFS, you may need to type 'klog' to renew your AFS token. ]
    In the `procgroup' file, replace `localhost' with the desired remote hosts.
    Add additional remote hosts (additional slaves) if you like.
    Then:  make run

============================================================================
If you read ParGNUmakefile, you'll find other things that you can
modify.  For example, all TOP-C additions are in conditionals:
  remove -DG4USE_TOPC from ParGNUmakefile and:
  make parclean; make run
  in order to re-compile and rerun without TOP-C.
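
As a rough illustration of what "in conditionals" means, the sketch below
selects a run manager at compile time depending on whether G4USE_TOPC is
defined.  The class names SequentialRunManager and ParallelRunManager and the
function ActiveEventLoop are hypothetical stand-ins (the real classes are
G4RunManager and ParRunManager); this is not the code in ParN02.icc.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for G4RunManager: provides the default event loop.
struct SequentialRunManager {
    virtual ~SequentialRunManager() = default;
    virtual std::string DoEventLoop() const { return "sequential loop"; }
};

#ifdef G4USE_TOPC
// Hypothetical stand-in for ParRunManager: overrides the virtual event loop.
struct ParallelRunManager : SequentialRunManager {
    std::string DoEventLoop() const override { return "TOP-C parallel loop"; }
};
using ActiveRunManager = ParallelRunManager;
#else
// Without -DG4USE_TOPC the parallel additions are compiled out entirely.
using ActiveRunManager = SequentialRunManager;
#endif

// Returns a description of the event loop that was selected at compile time.
inline std::string ActiveEventLoop() {
    ActiveRunManager manager;
    const SequentialRunManager& base = manager;  // call through the base class
    const std::string description = base.DoEventLoop();
    assert(!description.empty());
    return description;
}
```

Compiling this with -DG4USE_TOPC makes ActiveEventLoop() report the parallel
loop; compiling without the flag gives the sequential one, with no other
source changes.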
Define REMOTE_SHELL differently if you don't use `ssh' for a remote shell.
  (If undefined, ParGNUmakefile defines it to be `ssh')
Define MACROFILE differently to use a different set of input commands.
Define MEM_MODEL=--seq
  to run with TOP-C, but using a single (sequential) process, suitable
  for easy debugging (via gdb, for example).
Try:  pushd $G4WORKDIR/bin/$G4SYSTEM/; ./ParN02 --TOPC-help
  to see TOP-C run-time options that can be invoked, such as
    pushd $G4WORKDIR/bin/$G4SYSTEM/; ./ParN02 --TOPC-num-slaves=5 ParN02.in
  Alternatively, modify TOPC_OPTIONS in ParGNUmakefile for the same effect.

You can also try other targets:  make run-debug
  This will run it under gdb, so you can single step to see what happens.
make parclean - Start over with a clean set of files.

============================================================================
New or modified files:
  ParN02.cc -     Adds one line:  #include "ParN02.icc"
                  ParN02.icc inserts:  #include "topc.h"
                  and causes main to call TOPC_init, TOPC_finalize,
                  and to use: `new ParRunManager' instead of `new G4RunManager'
  GNUmakefile   - Adds one line at beginning:  include ParGNUmakefile
                  ParGNUmakefile defines EXTRALIBS and CPPFLAGS so as to
                    modify behavior of config/binmake.gmk
                    in order to use TOP-C libraries and includes
  procgroup     - Specifies which slave hosts to use, and where to put output
                  For example:  localhost 1 - > slave1.out
                                host=`localhost', executable=`same as master',
                                params of slave=`> slave1.out' (redirect output)
                  If output is not redirected, it goes to stdout on master.
  src/ParRunManager.cc - ParRunManager derived from G4RunManager
                  Replaces G4RunManager::DoEventLoop with TOP-C parallel loop,
                  adds certain local vars of DoEventLoop as ParRunManager members
  include/MarshaledObj.h - run-time utilities for marshaling
  include/MarshaledEx*Hit.h - marshals N02 hits (calorimeter hits)
  include/MarshaledG4*.h - marshaling routines for Geant4 data structures

  ~/slave*.out  - Contains outputs of slave1, slave2, etc.
                  Generated each time parallel ParN02 is executed.
                  These files are specified in the file procgroup.
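
To give a feel for what the Marshaled*.h headers listed above do, here is a
minimal sketch of marshaling a list of hits into a flat byte buffer and
back.  The type SimpleHit and the functions MarshalHits and UnmarshalHits are
hypothetical and are not the MarshaledObj.h API; the sketch also assumes
that master and slaves share the same byte order and type sizes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical hit type; the real example marshals the N02 hit classes.
struct SimpleHit {
    int layer;
    double energyDeposit;
};

// Append the raw bytes of a trivially copyable value to the buffer.
template <typename T>
void AppendBytes(std::vector<char>& buffer, const T& value) {
    const char* bytes = reinterpret_cast<const char*>(&value);
    buffer.insert(buffer.end(), bytes, bytes + sizeof(T));
}

// Read a value back from the buffer at `offset`, then advance `offset`.
template <typename T>
T ReadBytes(const std::vector<char>& buffer, std::size_t& offset) {
    assert(offset + sizeof(T) <= buffer.size());
    T value;
    std::memcpy(&value, buffer.data() + offset, sizeof(T));
    offset += sizeof(T);
    return value;
}

// Slave side: flatten the hits into [count][layer, energy][layer, energy]...
std::vector<char> MarshalHits(const std::vector<SimpleHit>& hits) {
    std::vector<char> buffer;
    const std::size_t count = hits.size();
    AppendBytes(buffer, count);
    for (const SimpleHit& hit : hits) {
        AppendBytes(buffer, hit.layer);
        AppendBytes(buffer, hit.energyDeposit);
    }
    return buffer;
}

// Master side: rebuild the hits from the flat buffer.
std::vector<SimpleHit> UnmarshalHits(const std::vector<char>& buffer) {
    std::size_t offset = 0;
    const std::size_t count = ReadBytes<std::size_t>(buffer, offset);
    std::vector<SimpleHit> hits;
    for (std::size_t i = 0; i < count; ++i) {
        SimpleHit hit;
        hit.layer = ReadBytes<int>(buffer, offset);
        hit.energyDeposit = ReadBytes<double>(buffer, offset);
        hits.push_back(hit);
    }
    return hits;
}
```

Writing the fields one at a time, rather than copying the whole struct,
keeps compiler-inserted padding out of the buffer.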

====================================================================
This version passes an event number to the slave and lets the
slave generate the event.  The slave passes back marshaled hits to
the master.
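
The scheme just described can be sketched, without TOP-C, as three
cooperating callbacks.  The callback names mirror the ParRunManager methods
visible in the stack trace at the end of this file (GenerateEventInput,
DoEvent, CheckEventResult), but the bodies, the EventResult type, the
EventDispatcher class, and the idea of seeding a random generator from the
event number are assumptions made for illustration.  The loop here runs
sequentially; in the real example TOP-C distributes the DoEvent calls to
the slaves.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <random>
#include <vector>

// Hypothetical result type: the hits a slave sends back for one event.
struct EventResult {
    int eventId;
    std::vector<double> hitEnergies;
};

// Master-side bookkeeping (hypothetical class).
class EventDispatcher {
 public:
    explicit EventDispatcher(int numEvents) : numEvents_(numEvents) {
        assert(numEvents >= 0);
    }
    // Hand out the next event number; returns false when none are left.
    bool GenerateEventInput(int& eventId) {
        if (nextEvent_ >= numEvents_) return false;
        eventId = nextEvent_++;
        return true;
    }
    // Store the hits that came back for one event.
    void CheckEventResult(const EventResult& result) {
        results_[result.eventId] = result.hitEnergies;
    }
    const std::map<int, std::vector<double> >& results() const { return results_; }

 private:
    int numEvents_;
    int nextEvent_ = 0;
    std::map<int, std::vector<double> > results_;
};

// Slave side: build the whole event from its number alone, so only an
// integer has to be sent from the master.
EventResult DoEvent(int eventId) {
    std::mt19937 rng(static_cast<std::uint32_t>(eventId));  // assumed seeding
    EventResult result;
    result.eventId = eventId;
    for (int i = 0; i < 3; ++i) {
        result.hitEnergies.push_back((rng() % 1000) / 10.0);
    }
    return result;
}

// Sequential stand-in for the parallel master/slave loop.
std::map<int, std::vector<double> > RunEventLoop(int numEvents) {
    EventDispatcher master(numEvents);
    int eventId = 0;
    while (master.GenerateEventInput(eventId)) {
        master.CheckEventResult(DoEvent(eventId));
    }
    return master.results();
}
```

Because a slave needs nothing but the event number, the task input stays
tiny; only the results (the marshaled hits) are of variable size.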

I will integrate the track level parallelism into this scenario at
a later date.  For the track level, I will generate several
secondary tracks on the master, and then convert the secondary tracks
to new events that can be passed to slaves.  I will do this only if
I detect that there are not enough initial events to fully occupy all
the slaves.  This scheme has the drawback that we are splitting an event
into many events, which may make the summarization, histogram, and so
on more difficult.  However, track level parallelism will be triggered
only when a very small number of events are generated.

I also want to support postponing
a track to the next event (G4ClassificationOfNewTrack::fPostpone).
To do this, each slave will wait to retire an event until it knows that
the previous event has been retired.
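
One way to picture the planned in-order retirement is the sketch below.  It
is a simplification under stated assumptions: a single bookkeeping object
(the hypothetical class InOrderRetirer) stands in for the coordination
between slaves, and events are assumed to be numbered consecutively from 0.
Events may finish in any order, but each is retired only after its
predecessor.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical helper: buffers finished events and retires them in order.
class InOrderRetirer {
 public:
    // Record that `eventId` has finished.  Returns the events that can be
    // retired as a consequence, in increasing order (possibly none).
    std::vector<int> Complete(int eventId) {
        assert(eventId >= nextToRetire_);  // must not be retired already
        finished_.insert(eventId);
        std::vector<int> retired;
        // Retire as long as the oldest finished event is the next in line.
        while (!finished_.empty() && *finished_.begin() == nextToRetire_) {
            retired.push_back(nextToRetire_);
            finished_.erase(finished_.begin());
            ++nextToRetire_;
        }
        return retired;
    }

 private:
    int nextToRetire_ = 0;     // lowest event number not yet retired
    std::set<int> finished_;   // finished events still waiting to retire
};
```

For example, if events 1 and 2 finish before event 0, nothing is retired
until event 0 completes, at which point 0, 1, and 2 are retired together.
A postponed track from event N would then always find event N+1 still
unretired.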

In addition, I plan to have only the master read commands and pass
them to the slaves.  Currently, the master and slaves each read
identical commands.

====================================================================
If you are curious about some of the layers, the following
stack trace [somewhat out of date now] gives some idea.
   G4RunManager::BeamOn calls ParRunManager::DoEventLoop
(since G4RunManager::DoEventLoop is virtual)
  ParRunManager::DoEventLoop calls TOPC_master_slave
  TOPC_master_slave calls submit_task_input
  submit_task_input eventually calls COMM_send_msg which calls MPI_Send
(COMM_send_msg is the communication layer of TOP-C;
 ParN02.cc was linked with the TOP-C MPI communication layer.
 The same source could have been linked with a POSIX threads layer
 or some other communication layer.
)
  MPI_Send calls send
(where send is the socket system call of libc.so)

(gdb) where
#0  0x41946c62 in send () from /lib/libc.so.6
#1  0x400839c1 in send () at wrapsyscall.c:186
#2  0x805c547 in MPI_Send (buf=0x82690fc, count=4, datatype=3, dest=2, tag=1, comm=0) at sendrecv.c:236
#3  0x805a0b5 in COMM_send_msg (msg=0x82690fc, msg_size=4, dst=2, tag=TASK_INPUT_TAG) at comm-mpi.c:224
#4  0x805774e in send_task_input (slave=2, input={data = 0x82690fc, data_size = 4}, tag=TASK_INPUT_TAG) at topc.c:560
#5  0x8057aa8 in submit_task_input (input={data = 0x82690fc, data_size = 4}) at topc.c:659
#6  0x805813c in TOPC_master_slave (generate_task_input_=0x4003d2e4 <ParRunManager::GenerateEventInput(void)>,
    do_task_=0x4003d350 <ParRunManager::DoEvent(int *)>, check_task_result_=0x4003d420 <ParRunManager::CheckEventResult(int *, void *)>,
    update_shared_data_=0) at topc.c:922
#7  0x4003d18c in ParRunManager::DoEventLoop (this=0x80c0bf0, n_event=1, macroFile=0x0, n_select=-1) at src/ParRunManager.cc:51
#8  0x400b14d1 in G4RunManager::BeamOn () from /afs/cern.ch/user/c/cooperma/scratch-pcitapi07/geant4/lib/libG4run.so
#9  0x400b870a in G4RunMessenger::SetNewValue () from /afs/cern.ch/user/c/cooperma/scratch-pcitapi07/geant4/lib/libG4run.so
#10 0x4167157b in G4UIcommand::DoIt () from /afs/cern.ch/user/c/cooperma/scratch-pcitapi07/geant4/lib/libG4intercoms.so
#11 0x416810a3 in G4UImanager::ApplyCommand () from /afs/cern.ch/user/c/cooperma/scratch-pcitapi07/geant4/lib/libG4intercoms.so
#12 0x805db7e in G4UIterminal::ExecuteCommand () at /afs/cern.ch/sw/lhcxx/specific/redhat61/3.2.0/include/CLHEP/Random/Randomize.h:64
#13 0x805d42d in G4UIterminal::SessionStart () at /afs/cern.ch/sw/lhcxx/specific/redhat61/3.2.0/include/CLHEP/Random/Randomize.h:64
#14 0x8056a3d in main (argc=1, argv=0x80bfa00) at ParN02.cc:98