| [3158] | 1 | -------------------------------------------------------------------------------------
 | 
|---|
 | 2 |  Performance comparison of different machines for compilation / execution (computation)
 | 
|---|
 | 3 |                                    -----------------------
 | 
|---|
 | 4 |     Measurements made in January 2007,       R. Ansari / C. Magneville
 | 
|---|
 | 5 | -------------------------------------------------------------------------------------
 | 
|---|
 | 6 | 
 | 
|---|
 | 7 | (a) eros3: dual-processor dual-core Xeon @ 2.4 GHz Linux (xeon-lx-2.4GHz), gcc 3.2
 | 
|---|
 | 8 | (b) ccali: dual-processor dual-core Xeon @ 2.8 GHz Linux (xeon-lx-2.8GHz), icc 8.0 or 9.0
 | 
|---|
| [3244] | 9 |      Compilation flags: [-O -g]
 | 
|---|
| [3273] | 10 | (bb) grid49: dual-processor dual-core new Xeon 5140 @ 2.33 GHz
 | 
|---|
 | 11 | (bb4) grid50: dual-processor quad-core new Xeon E5345 @ 2.33 GHz --> 8 cores
 | 
|---|
| [3179] | 12 | (c) sgsda: dual-processor AMD Opteron 248 @ 2.2 GHz (amd-lx-
 | 
|---|
| [3244] | 13 |      Compilation flags: [-O -g]
 | 
|---|
| [3241] | 14 | (cc) grid-saclay: dual-processor dual-core AMD Opteron 275 @ 2.2 GHz (amd275-lx)
 | 
|---|
| [3244] | 15 |      Compilation flags: [-O -g]
 | 
|---|
| [3158] | 16 | 
 | 
|---|
 | 17 | (d) asc: dual-processor Alpha (@ ~1 GHz) DS20 server, OSF (osf1), cxx 6.5 (osf-asc)
 | 
|---|
 | 18 | (new asc 420 MFLOPS, less powerful than the old asc 800 MFLOPS)
 | 
|---|
 | 19 | (e) xp1000-dapnia: alpha xp1000 @ ~ 600 MHz ? OSF1 , cxx ? (osf-xp1000)
 | 
|---|
| [3241] | 20 | (e') cool: alpha xp1000 @ ~ 667 MHz ? OSF1 5.1 , cxx 6.3 
 | 
|---|
| [3158] | 21 | (f) superosf-dapnia: multi-processor AlphaServer ES80, 6 EV7 processors @ 1 GHz (super-osf)
 | 
|---|
 | 22 | 
 | 
|---|
| [3259] | 23 | (g) ccsvx01: dual-processor XServe G5 @ ~1.8-2 GHz (Darwin/OSX) (G5-osx-2GHz), gcc 3.3
 | 
|---|
| [3158] | 24 | (h) PowerBook-Reza : Apple G4 @ 1.25 GHz (G4-osx-1.25GHz) , gcc 3.3
 | 
|---|
 | 25 | (i) MacBook-Reza: Apple / Intel Core dual-core @ 1.83 GHz (core-osx-1.83GHz), gcc 4
 | 
|---|
 | 26 | (j) MacPro-Grosdidier: Apple / 2 dual-core Xeon @ 3 GHz, gcc 4.0.1, SOPHYA compiled with -O2 -g
 | 
|---|
 | 27 | 
 | 
|---|
| [3187] | 28 | (p) IBM-AIX regatta, xlC, IBM eServer pSeries 655, 8 Power4 processors @ 1.1 GHz
 | 
|---|
| [3241] | 29 | (q) IBM-AIX meso, AIX 5.3, xlC V8, IBM Power5, 8 dual-core processors P575 @ 1.9 GHz
 | 
|---|
| [3158] | 30 | 
 | 
|---|
| [3241] | 31 | (s) SGI-IRIX64 magique, CC 
 | 
|---|
| [3158] | 32 | 
 | 
|---|
 | 33 | NOTES : 
 | 
|---|
 | 34 | - On the Xeon machines, processes and threads interact in how they occupy the
 | 
|---|
 | 35 | CPUs: multi-threaded / multi-tasking runs lose about a factor of 3 in performance.
 | 
|---|
 | 36 | The MacPro machine running OSX nevertheless copes better.
 | 
|---|
 | 37 | - Effect of the operating system or of the motherboard ???
 | 
|---|
 | 38 | 
 | 
|---|
| [3244] | 39 | Compilation flags
 | 
|---|
 | 40 | - Default compilation flags: [-O -g] in general
 | 
|---|
 | 41 | - On eros3 (xeon-linux gcc 3.3): [-O -g] OR [-O3 -g]
 | 
|---|
 | 42 | - On Darwin: [-g] or [-O2 -g] (or [-tune G5] on the XServe G5)
 | 
|---|
 | 43 |    On the Macs (G4/G5 in particular), large difference between -g and -Ox -g
 | 
|---|
 | 44 |    but little difference between -O, -O2 and -O3
 | 
|---|
 | 45 | - On the aix-meso machine: [-O -g] or [-O3 -g]
 | 
|---|
| [3187] | 46 | 
 | 
|---|
| [3251] | 47 | X/ Raw cpupower performance and SPEC data (http://www.spec.org)
 | 
|---|
 | 48 | ----------------------------------------------------------------------
 | 
|---|
| [3244] | 49 | 
 | 
|---|
| [3251] | 50 | (1) MFLOPS  -> cpupower 2   (x/y : -O -g / -O3) 
 | 
|---|
 | 51 | SPECint2000 (3) / SPECfp2000 (2) (http://www.spec.org)
 | 
|---|
 | 52 | 
 | 
|---|
| [3258] | 53 | X.1/ Double-precision computation performance
 | 
|---|
| [3251] | 54 | csh> cpupower 0 3000000  5 
 | 
|---|
 | 55 |      3 10^6 double operations - on 3x3 10^6 doubles in memory (~50 MB)
 | 
|---|
 | 56 |       ===> ~ 24 MB / MFLOPS
 | 
|---|
 | 57 | csh> cpupower 2
 | 
|---|
 | 58 |      1.6 10^9 double operations - on 3x20000 doubles (~0.5 MB)
 | 
|---|
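The 24 MB / MFLOPS figure quoted for cpupower 0 is consistent with every double operation touching three 8-byte operands. It also matches the (1) and (2) columns of the X.1 table (for example 85 MFLOPS against 2040 MB/s). A minimal sketch of that arithmetic follows. The function name and the three-operands-per-operation reading are assumptions made for illustration, not taken from the cpupower source.

```python
BYTES_PER_DOUBLE = 8
OPERANDS_PER_OP = 3  # assumption: two inputs read and one result written


def memory_rate_mb_s(mflops):
    """Memory traffic in MB/s implied by a cpupower 0 rate given in MFLOPS."""
    return mflops * OPERANDS_PER_OP * BYTES_PER_DOUBLE


# MFLOPS values copied from column (2) of the X.1 table
for label, mflops in [("(b)", 85), ("(bb)", 125), ("(e')", 32)]:
    print(label, mflops, "MFLOPS ->", memory_rate_mb_s(mflops), "MB/s")
```

Some rows of the table differ from this product by a few percent, which suggests the MB/s column was rounded or measured separately.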
 | 59 | 
 | 
|---|
 | 60 | 
 | 
|---|
 | 61 | Compiled with -O  (optimization)
 | 
|---|
 | 62 |   (1) cpupower 0 : memory throughput in MB/s
 | 
|---|
 | 63 |   (2) cpupower 0  , MFLOPS   
 | 
|---|
 | 64 |   (5) cpupower 2 ,  MFLOPS 
 | 
|---|
 | 65 | 
 | 
|---|
 | 66 | Compiled with -g (debug / no optimization)
 | 
|---|
 | 67 |   (3) cpupower 0  , MFLOPS  
 | 
|---|
 | 68 |   (6) cpupower 2  , MFLOPS 
 | 
|---|
 | 69 | 
 | 
|---|
 | 70 | Compiled with -O3 or -fast ... (aggressive optimization)
 | 
|---|
 | 71 |   (4) cpupower 0  , MFLOPS  
 | 
|---|
 | 72 |   (7) cpupower 2  , MFLOPS 
 | 
|---|
 | 73 | 
 | 
|---|
 | 74 | 
 | 
|---|
 | 75 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 76 |         MFLOPS       |(1) MB/s|   (2)     (3)      (4)   |    (5)       (6)       (7)      
 | 
|---|
 | 77 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 78 | (a)xeon-lx-2.4GHz    | 1290   |   53      53       55    |    338       340       320
 | 
|---|
| [3266] | 79 | (b)xeon-lx-2.8GHzicc | 2040   |   85      80       83    |    914       409       914
 | 
|---|
| [3273] | 80 | (bb)n-xeon-grid49    | 3000   |   125     85       130   |    660       500       660
 | 
|---|
 | 81 | (bb4)n-xeon4c-grid50 | 2568   |   107     103      109   |    655       500       655
 | 
|---|
| [3258] | 82 | (c)amd-lx            | 1560   |   65      77       68    |    666       314       686 
 | 
|---|
| [3251] | 83 | (cc)amd2-lx          |        |
 | 
|---|
 | 84 | 
 | 
|---|
 | 85 | (e')osf-cool         |  768   |   32      15       32    |    630       150       660     
 | 
|---|
 | 86 | (f)superosf               
 | 
|---|
 | 87 | 
 | 
|---|
 | 88 | (g)G5-osx-1 GHz      | 2100   |   88      68       88    |   1000       255      1073
 | 
|---|
 | 89 | (h)G4-osx-1.25GHz    |  600   |   25      16       25    |    417        93       430 
 | 
|---|
| [3258] | 90 | (i)core-osx-1.83GHz  | 2500   |  107      75      107    |    855       309       884
 | 
|---|
| [3251] | 91 | (j)xeon-osx            
 | 
|---|
 | 92 | 
 | 
|---|
| [3266] | 93 | (p)ibm-aix-regatta   | 3100   |  130      55      133    |    730       115      1750  (32 bits)
 | 
|---|
 | 94 | (q)ibm-aix-meso      | 5700   |  240      70      320    |   1500       220      3400  (32 bits)
 | 
|---|
 | 95 | 
 | 
|---|
 | 96 | (s)sgi-magique       |  336   |  14       7       15     |    340        40       460  (32 bits)  
 | 
|---|
| [3251] | 97 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 98 | 
 | 
|---|
| [3258] | 99 | X.2/  Performance comparison: int, float, double
 | 
|---|
 | 100 |   cpupower compiled with -O 
 | 
|---|
 | 101 | 
 | 
|---|
 | 102 | (1) float , cpupowerF 0 3000000 5 / cpupowerF 2
 | 
|---|
 | 103 |     -> MFLOPS (computing power on float)
 | 
|---|
 | 104 | (2) double, cpupowerD 0 3000000 5 / cpupowerD 2  (same as table X.1)
 | 
|---|
 | 105 |     -> MDBLOPS (computing power on double)
 | 
|---|
 | 106 | (3) int, cpupowerI 0 3000000 5 / cpupowerI 2
 | 
|---|
| [3266] | 107 |     -> MINTOPS  (computing power on int = 4 bytes)
 | 
|---|
| [3258] | 108 | (4) long (or long long (*)) cpupowerL 0 3000000 5 / cpupowerL 2
 | 
|---|
| [3266] | 109 |     -> MLONOPS  (computing power on long = 8 bytes)
 | 
|---|
| [3258] | 110 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 111 |         MFLOPS       |   (1)MFLOPS       (2)MDBLOPS       (3)MINTOPS       (4)MLONOPS 
 | 
|---|
 | 112 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 113 | (a)xeon-lx-2.4GHz    | 
 | 
|---|
| [3273] | 114 | (b)xeon-lx-2.8GHzicc |    166/905         90/900           166/1500         88/522    (*)
 | 
|---|
 | 115 | (bb)nxeon-grid49     |    250/1030        125/660          250/2500         125/2280
 | 
|---|
 | 116 | (bb4)xeon-4c-grid50  |    207/1019        110/660          207/2460         107/2285
 | 
|---|
| [3258] | 117 | (c)amd-lx            |    125/695         65/675           125/1570         65/1045
 | 
|---|
 | 118 | (cc)amd2-lx          | 
 | 
|---|
 | 119 | 
 | 
|---|
| [3259] | 120 | (e')osf-cool         |    60/635          32/631            62/640          31/630
 | 
|---|
| [3258] | 121 | (f)superosf               
 | 
|---|
 | 122 | 
 | 
|---|
| [3259] | 123 | (g)G5-osx-1 GHz      |    180/1260        90/1150           165/940         81/280    (*)
 | 
|---|
 | 124 | (h)G4-osx-1.25GHz    |    45/430          25/410            45/710          24/190    (*)
 | 
|---|
 | 125 | (i)core-osx-1.83GHz  |    185/919         105/855           187/935         62/246    (*)
 | 
|---|
| [3258] | 126 | (j)xeon-osx            
 | 
|---|
 | 127 | 
 | 
|---|
 | 128 | (p)ibm-aix-regatta   | 
 | 
|---|
| [3266] | 129 | (q)ibm-aix-meso      |    250/1150        250/1500          250/1200        50/200     (32 bits)
 | 
|---|
 | 130 |                      |    280/1500        250/1600          250/1100        210/1000   (64 bits -q64)
 | 
|---|
 | 131 | 
 | 
|---|
 | 132 | (s)sgi-magique       |  
 | 
|---|
| [3258] | 133 | ----------------------------------------------------------------------------------------------
 | 
|---|
 | 134 | 
 | 
|---|
 | 135 | X.3/  Comparison with SPEC 
 | 
|---|
| [3266] | 136 | csh>  cpupower 0 / cpupower 2 
 | 
|---|
| [3187] | 137 | ----------------------------------------------------------------------
 | 
|---|
 | 138 |                          MFLOPS(1)      SPECfp      SPECint 
 | 
|---|
 | 139 | ----------------------------------------------------------------------
 | 
|---|
| [3266] | 140 | (b)xeon-lx-2.8GHz         166/900        1400        1400
 | 
|---|
 | 141 | (c)amd-lx                 125/690        1600        1300
 | 
|---|
| [3187] | 142 | (cc)amd2-lx               675            1600        1300
 | 
|---|
 | 143 | 
 | 
|---|
| [3266] | 144 | (e)osf-xp1000             32/650         500         400
 | 
|---|
 | 145 | (f)superosf               842            1100        700
 | 
|---|
| [3187] | 146 | 
 | 
|---|
| [3266] | 147 | (i)core-osx-1.83GHz       110/850        1400        1500    
 | 
|---|
 | 148 | (j)xeon-osx               2600           2900          -
 | 
|---|
| [3187] | 149 | 
 | 
|---|
| [3266] | 150 | (p)ibm-aix-regatta        130/700        1050        700     
 | 
|---|
| [3187] | 151 | ----------------------------------------------------------------------
 | 
|---|
 | 152 | 
 | 
|---|
 | 153 | 
 | 
|---|
| [3158] | 154 | A/ Compiling all of SOPHYA 
 | 
|---|
 | 155 | ----------------------------
 | 
|---|
 | 156 | csh> time make all   (1)
 | 
|---|
 | 157 | or 
 | 
|---|
 | 158 | csh> time make -j 2 all  (2)
 | 
|---|
 | 159 |   CPU time (TCPU)
 | 
|---|
 | 160 |   Performance index 100*(1000/TCPU) 
 | 
|---|
 | 161 |   Elapsed (wall-clock) time
 | 
|---|
 | 162 |   TCPU / elapsed time
 | 
|---|
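The two derived columns of the compilation table can be recomputed from the measured times. The sketch below applies the index formula 100*(1000/TCPU) and the CPU-to-elapsed ratio to row (a). The function names are illustrative only.

```python
def perf_index(cpu_seconds):
    """Performance index as defined above: 100 * (1000 / TCPU)."""
    return 100.0 * (1000.0 / cpu_seconds)


def cpu_over_elapsed_percent(cpu_seconds, elapsed_seconds):
    """CPU time as a percentage of elapsed time (above 100% when several CPUs are busy)."""
    return 100.0 * cpu_seconds / elapsed_seconds


# Row (a) xeon-lx-2.4GHz with make -j 2: 615 s CPU, 410 s elapsed
print(perf_index(615))
print(cpu_over_elapsed_percent(615, 410))
```

The tabulated indices agree with this formula to within one unit; the table appears to mix rounding and truncation.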
 | 163 | 
 | 
|---|
 | 164 | 
 | 
|---|
 | 165 | ----------------------------------------------------------------------
 | 
|---|
 | 166 |                          CPU(s)  IndPerf   TElapsed , TCPU/Elapsed %
 | 
|---|
 | 167 | ----------------------------------------------------------------------
 | 
|---|
 | 168 | (a)xeon-lx-2.4GHz (2)    615 s      162       410 s        150%  
 | 
|---|
 | 169 |       avec -O3 -g (2)   1300 s       77       760 s        172%  
 | 
|---|
 | 170 | (b)xeon-lx-2.8GHz (2)    755 s      132       540 s        140%
 | 
|---|
| [3241] | 171 | (c)amd-lx         (2)    336 s      297       175 s        192%
 | 
|---|
| [3158] | 172 | 
 | 
|---|
| [3241] | 173 | (d)osf-asc (1)          1920 s       52      2340 s        83%   (??)
 | 
|---|
 | 174 | (e)osf-xp1000 (1)        533 s      187       660 s        80%
 | 
|---|
 | 175 | (f)superosf (1)          895 s      112       910 s        98%
 | 
|---|
| [3158] | 176 | 
 | 
|---|
| [3259] | 177 | (g)G5-osx-2GHz (2)       453 s      221       250 s        182%
 | 
|---|
| [3251] | 178 |     -tune=G5            1100 s       90
 | 
|---|
 | 179 |     -g -O                740 s                380 s        195%
 | 
|---|
 | 180 | (h)G4-osx-1.25GHz (1)    660 s      151       710 s        93%   [-g]
 | 
|---|
| [3244] | 181 |                         1500 s                             94%   [-O2 -g]
 | 
|---|
| [3158] | 182 | (i)core-osx-1.83GHz (2)  209 s      478       116 s        180%
 | 
|---|
| [3159] | 183 |               -O2   (1)  367 s      272       381          96%    
 | 
|---|
| [3158] | 184 | (j)xeon-osx
 | 
|---|
 | 185 | 
 | 
|---|
 | 186 | (p)ibm-aix
 | 
|---|
 | 187 | ----------------------------------------------------------------------
 | 
|---|
 | 188 | 
 | 
|---|
 | 189 | Shared library sizes: 
 | 
|---|
| [3251] | 190 | (a)
 | 
|---|
 | 191 | (c) 33 MB   
 | 
|---|
| [3158] | 192 | (f) = (e) = 57 MB 
 | 
|---|
| [3251] | 193 | (g) 80 MB
 | 
|---|
| [3158] | 194 | (i) 83 MB
 | 
|---|
 | 195 | 
 | 
|---|
| [3187] | 196 | B/ Raw computation (SOPHYA arrays) with / without threads
 | 
|---|
 | 197 | --------------------------------------------------------
 | 
|---|
| [3266] | 198 | B.1.a/   Vector computation 10 * V2 ~= DLO4 (V1) 
 | 
|---|
 | 199 |          ~ 10 x 10 x 9. 10^6 double operations on 2 x  9 10^6 doubles    
 | 
|---|
 | 200 |          900 M.Ops r_8 / ~ 1500 MB 
 | 
|---|
| [3158] | 201 | 
 | 
|---|
| [3266] | 202 | (1) time cpupower 0     # compiled with -O  (/ -O -g)
 | 
|---|
 | 203 | (2) time zthr arrdl 1 3000   1 thread
 | 
|---|
 | 204 | (3) time zthr arrdl 2 3000   2 threads
 | 
|---|
 | 205 | (4) time zthr arrdl 4 3000   4 threads
 | 
|---|
 | 206 | (5) time zthr arrdl 6 3000   6 threads
 | 
|---|
 | 207 | (6) time zthr arrdl 8 3000   8 threads
 | 
|---|
 | 208 | 
 | 
|---|
 | 209 | -----------------------------------------------------------------------------------
 | 
|---|
 | 210 |                      (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/%
 | 
|---|
 | 211 | -----------------------------------------------------------------------------------
 | 
|---|
 | 212 | (a)xeon-lx-2.4GHz      53        
 | 
|---|
| [3273] | 213 | (b)xeon-lx-2.8GHz      85       2.6/2.6/100%    5.3/2.9/180%   14.3/4.86/310% 
 | 
|---|
| [3266] | 214 |                                    (5) 23/7.4/314%
 | 
|---|
| [3273] | 215 | (bb)nxeon-grid49       125      2/2/99%         4/2/186%       8/2.45/326%
 | 
|---|
 | 216 |                                    (5) 12/3.6/330%   (6) 16/4.7/340%
 | 
|---|
 | 217 | (bb4)xeon-4c-grid50    110      2.2/2.2/99%     4.2/2.3/185%   8/2.5/321%
 | 
|---|
 | 218 |                                    (5) 13/3.1/470%   (6) 21/3.8/544%
 | 
|---|
| [3266] | 219 | (c)amd-lx              95        
 | 
|---|
 | 220 |                                  
 | 
|---|
 | 221 |                                   
 | 
|---|
 | 222 | (e')osf-cool           32       5.7/5.8/98%     11.1/11.3/98%   22.3/22.5/98% 
 | 
|---|
 | 223 | (f)superosf                    
 | 
|---|
 | 224 | 
 | 
|---|
 | 225 | (g)G5-osx-2GHz         88       2.5/2.6/99%     5.9/3.38/184%    11/6.45/173%    [-O2 -g]
 | 
|---|
 | 226 | (h)G4-osx-1.25GHz      25       6.6/7/95%       13.4/13.8/97%                    [-O2 -g]
 | 
|---|
 | 227 | (i)core-osx-1.83GHz   107       2.1/2.1/98%     4.3/2.9/150%     8.3/30/31%      [-O2 -g]
 | 
|---|
 | 228 | (j)xeon-osx           
 | 
|---|
 | 229 | 
 | 
|---|
 | 230 | (p)ibm-aix-regatta    130        
 | 
|---|
| [3270] | 231 | (q)ibm-aix-meso       150       0.7/1/70%       1.2/2./60%       3.2/2/150%   [-O3]
 | 
|---|
 | 232 |                                   (5) 5.4/3/180%    (6) 6.4/3/210% 
 | 
|---|
| [3266] | 233 | 
 | 
|---|
 | 234 | (s)sgi-magique          7       78/78/99%       167/95/175%      339/96/352%     [-O -g: NON-OPT]
 | 
|---|
 | 235 |                        14       16.4/16.5/99%   33.8/22.4/150%   79/32/250%      [-O -g2 OPT]
 | 
|---|
 | 236 |  -----------------------------------------------------------------------------------
 | 
|---|
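Each CPU/Elap/% entry in the table above packs three numbers into one cell. The sketch below splits such an entry and compares the elapsed time of a multi-thread run with the single-thread run. It assumes that every thread performs the same amount of work, so that ideal scaling keeps the elapsed time constant. That reading is inferred from the CPU time roughly doubling between 1 and 2 threads; it is not stated in the notes. The function names are hypothetical.

```python
def parse_triplet(text):
    """Split a 'CPU/Elap/%' entry such as '5.3/2.9/180%' into three floats."""
    cpu, elapsed, percent = (float(part.rstrip("s%")) for part in text.split("/"))
    return cpu, elapsed, percent


def scaling_efficiency(one_thread, n_threads):
    """Elapsed time with 1 thread divided by elapsed time with N threads.

    Under the equal-work-per-thread assumption, 1.0 means perfect scaling.
    """
    return parse_triplet(one_thread)[1] / parse_triplet(n_threads)[1]


# Row (b) xeon-lx-2.8GHz: 1 thread and 4 threads
print(parse_triplet("14.3/4.86/310%"))
print(scaling_efficiency("2.6/2.6/100%", "14.3/4.86/310%"))
```

The same helper accepts entries written with a trailing "s" on the times, as in some later tables.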
 | 237 | 
 | 
|---|
 | 238 | B.1.b/   Vector computation V2 = Sin(V1) + Cos(V1) 
 | 
|---|
 | 239 |          ~ 50 x 9. 10^6 double operations on 2 x  9 10^6 doubles, mem ~ 150 MB 
 | 
|---|
 | 240 |          ~500 M.Ops r_8 / ~ 600 MB I/O
 | 
|---|
 | 241 | 
 | 
|---|
 | 242 | (1) time cpupower 0     # compiled with -O  (/ -O -g)
 | 
|---|
 | 243 | (2) time zthr arrmf 1 3000   1 thread
 | 
|---|
 | 244 | (3) time zthr arrmf 2 3000   2 threads
 | 
|---|
 | 245 | (4) time zthr arrmf 4 3000   4 threads
 | 
|---|
 | 246 | (5) time zthr arrmf 6 3000   6 threads
 | 
|---|
 | 247 | (6) time zthr arrmf 8 3000   8 threads
 | 
|---|
 | 248 | 
 | 
|---|
 | 249 | -----------------------------------------------------------------------------------
 | 
|---|
 | 250 |                      (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/%
 | 
|---|
 | 251 | -----------------------------------------------------------------------------------
 | 
|---|
 | 252 | (a)xeon-lx-2.4GHz      53        
 | 
|---|
| [3273] | 253 | (b)xeon-lx-2.8GHz      85       1.7/1.7/100%    3.5/2.1/173%     9.8/3.6/275%
 | 
|---|
 | 254 |                                    (5) 12/3.6/330%    (6) 16/4.7/340%
 | 
|---|
 | 255 | (bb)nxeon-grid49       125      1.6/1.6/100%    3.2/1.7/183%     6.7/2.1/314% 
 | 
|---|
 | 256 |                                    (5) 10.1/3.2/320%  (6) 14.4/4.05/330%
 | 
|---|
| [3266] | 257 | (c)amd-lx              95        
 | 
|---|
 | 258 |                                  
 | 
|---|
 | 259 | (e')osf-cool           32       4.2/4.3/98%     8.2/8.4/98%      16.1/16.2/98%
 | 
|---|
 | 260 | (f)superosf                    
 | 
|---|
 | 261 | 
 | 
|---|
 | 262 | (g)G5-osx-2GHz         88       2.3/2.3/100%      5/3/165%        9.6/5.8/167%  [-O2 -g]
 | 
|---|
 | 263 | (h)G4-osx-1.25GHz      25       4.5/4.8/95%       10.9/14.6/72%        [-O2 -g]
 | 
|---|
 | 264 | (i)core-osx-1.83GHz   107       2.3/2.3/98%       4.8/3.1/158%                [-O2 -g]
 | 
|---|
 | 265 | (j)xeon-osx           
 | 
|---|
 | 266 | 
 | 
|---|
 | 267 | (p)ibm-aix-regatta    130        
 | 
|---|
| [3270] | 268 | (q)ibm-aix-meso       150       1./2/50%         2.8/3/86%       5.4/4/130%   [-O3]
 | 
|---|
 | 269 |                                      (5) 10/4/250%    (6) 11.2/5/220% 
 | 
|---|
| [3266] | 270 | 
 | 
|---|
 | 271 | (s)sgi-magique         7       11.5/11.7/99%     24/17/140%      51.5/18.4/280%  [-O -g NON-OPT]
 | 
|---|
 | 272 |                       14       6.5/6.6/99%       13.3/12/110%    34.5/17.3/200%  [-O -g3 OPT]
 | 
|---|
 | 273 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 274 | 
 | 
|---|
 | 275 | 
 | 
|---|
 | 276 | B.1.c/ Corrected version of zthr.cc (after 23/05/07) 
 | 
|---|
| [3254] | 277 |          arr = (c1*a1) + (c2*a2) 
 | 
|---|
 | 278 |          ~ 3 x 4. 10^6 int_4 operations on 3 x 4 10^6 int_4    
 | 
|---|
 | 279 |          12 M.Ops int_4 / ~ 50 MB 
 | 
|---|
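The operation count and memory footprint quoted for arr = (c1*a1) + (c2*a2) can be checked with a few lines of arithmetic. The sketch assumes that the size argument 2000 used in the zthr arr runs of this section is the side of a square array (2000 x 2000 = 4 10^6 elements). It counts three operations per element (two multiplications and one addition) over three 4-byte integer arrays. The function name and parameters are illustrative.

```python
def arr_workload(side, ops_per_element=3, arrays=3, bytes_per_element=4):
    """Return (operation count, bytes held in memory) for a side x side int_4 array."""
    elements = side * side
    return elements * ops_per_element, elements * arrays * bytes_per_element


ops, nbytes = arr_workload(2000)
print(ops / 1e6, "M int_4 operations,", nbytes / 1e6, "MB")
```

This gives 12 million operations and 48 MB, in line with the "12 M.Ops" and "~ 50" megabyte figures above.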
 | 280 | 
 | 
|---|
 | 281 | (1) time cpupower 0     # compiled with -O  (/ -O -g)
 | 
|---|
 | 282 | (2) time zthr arr 1 2000   1 thread
 | 
|---|
 | 283 | (3) time zthr arr 2 2000   2 threads
 | 
|---|
 | 284 | (4) time zthr arr 4 2000   4 threads
 | 
|---|
 | 285 | (5) time zthr arr 6 2000   6 threads
 | 
|---|
 | 286 | (6) time zthr arr 8 2000   8 threads
 | 
|---|
 | 287 | 
 | 
|---|
 | 288 | -----------------------------------------------------------------------------------
 | 
|---|
 | 289 |                      (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/%
 | 
|---|
 | 290 | -----------------------------------------------------------------------------------
 | 
|---|
 | 291 | (a)xeon-lx-2.4GHz      53        0.5/1/43%      1/1.1/88%      2.8/1/262%
 | 
|---|
 | 292 |                                     (5) 4.5/1.8/246%      (6) 6.1/2.1/310% 
 | 
|---|
 | 293 |                                   
 | 
|---|
 | 294 |         
 | 
|---|
 | 295 | (b)xeon-lx-2.8GHz      65        
 | 
|---|
 | 296 |                                   
 | 
|---|
 | 297 | (c)amd-lx              95        0.23/1/22%     0.44/1/51%       1/1/102%     [-O -g]
 | 
|---|
 | 298 |                                      (5) 1.6/1/106%   (6) 2.2/1.2/100% 
 | 
|---|
 | 299 | 
 | 
|---|
 | 300 | 
 | 
|---|
 | 301 |                                   
 | 
|---|
 | 302 | (e')osf-cool           32        0.43/1.2/35%   0.6/1.33/44%     1.1/1.3/82%      [-O -g]
 | 
|---|
 | 303 |                                      (5) 1.45/1.7/85%   (6) 1.83/2.16/84%         
 | 
|---|
 | 304 | (f)superosf                    
 | 
|---|
 | 305 | 
 | 
|---|
| [3259] | 306 | (g)G5-osx-2GHz         88       1.5/1.5/100%    3.2/1.7/185%      6.6/3.5/188%    [-O -g]
 | 
|---|
 | 307 | (g)G5-osx-2GHz         88       0.4/1/40%       0.9/1.0/90%       2/1.2/169%      [-tune=G5 -g]
 | 
|---|
| [3254] | 308 |                                      (5) 3.3/2/164%    (6) 4.3/2.6/165%
 | 
|---|
 | 309 | (h)G4-osx-1.25GHz      25       3/3/95%                                           [-O2 -g]
 | 
|---|
 | 310 |                                   
 | 
|---|
 | 311 | (i)core-osx-1.83GHz               [-O2 -g]
 | 
|---|
 | 312 | 
 | 
|---|
 | 313 | (j)xeon-osx           
 | 
|---|
 | 314 | 
 | 
|---|
 | 315 | 
 | 
|---|
 | 316 | (p)ibm-aix-regatta   130        
 | 
|---|
 | 317 | 
 | 
|---|
 | 318 | (q)ibm-aix-meso      150        0.6/1/58%       1/1/91%           1.7/1.2/132%    [-O3]
 | 
|---|
 | 319 |                                      (5) 2.4/1.2/193%   (6) 4.25/1.6/265%      
 | 
|---|
 | 320 | 
 | 
|---|
 | 321 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 322 | 
 | 
|---|
| [3266] | 323 | B.1.x/ Old version of zthr (before 23/05/07) 
 | 
|---|
| [3254] | 324 |          It performed 2 multiplications by a constant followed by a matrix product!
 | 
|---|
 | 325 |          arr = c1*a1*c2*a2   ( ~ 3 10^6 double op.)
 | 
|---|
| [3241] | 326 | (1) time cpupower 2     # compiled with -O3  (/ -O -g)
 | 
|---|
| [3158] | 327 | (2) time zthr arr 1 1000   1 thread
 | 
|---|
 | 328 | (3) time zthr arr 2 1000   2 threads
 | 
|---|
 | 329 | (4) time zthr arr 4 1000   4 threads
 | 
|---|
 | 330 | (5) time zthr arr 6 1000   6 threads
 | 
|---|
 | 331 | (6) time zthr arr 8 1000   8 threads
 | 
|---|
 | 332 | 
 | 
|---|
 | 333 | -----------------------------------------------------------------------------------
 | 
|---|
 | 334 |                      (1)MFLOPS  (2)CPU/Elap/% (2)IndPerf  (3)CPU/Elap/% (3)IndPerf
 | 
|---|
 | 335 | -----------------------------------------------------------------------------------
 | 
|---|
 | 336 | (a)xeon-lx-2.4GHz     1167        5.15/5.2/99%              11.4/5.8/196%
 | 
|---|
 | 337 |                                            (4)36.6/9.28/394%
 | 
|---|
 | 338 |         -O3 -g                    4.9/5./99%
 | 
|---|
 | 339 | (b)xeon-lx-2.8GHz      920        2.3/2.3/100%              6.2/3.1/198%
 | 
|---|
 | 340 |                                            (4)26/6.6/396%
 | 
|---|
 | 341 | (c)amd-lx              690        3.6/3.6/99%               6.8/4/171%
 | 
|---|
 | 342 |                                            (4)13.5/7/193%
 | 
|---|
 | 343 |                                            (5)20.3/10.23/198%
 | 
|---|
| [3187] | 344 | (cc)amd2-lx            675        2/2/99%                   4.15/2.1/197%
 | 
|---|
 | 345 |                                            (4)8.25/4.15/198%
 | 
|---|
 | 346 |                                            (5)13.6/4.6/292%
 | 
|---|
 | 347 |                                            (6)19.8/6.5/300%
 | 
|---|
| [3158] | 348 | 
 | 
|---|
| [3241] | 349 | (d)osf-asc             420        6.3s/6.5s/99%            16.9/8.8/192%   
 | 
|---|
 | 350 |                                            (4)29.9/15.7/191%
 | 
|---|
 | 351 | (e)osf-xp1000          648        5.1/5.3/96.6%            11.4/11.4/99%       
 | 
|---|
| [3158] | 352 |                                            (4)25.2/25.5/99%
 | 
|---|
| [3241] | 353 | (f)superosf            842        2.87/2.88/99.6%          6.25/4.1/153%
 | 
|---|
| [3158] | 354 |                                            (4)11.6/3.06/379%          
 | 
|---|
 | 355 | 
 | 
|---|
| [3251] | 356 | (h)G4-osx-1.25GHz       92        44s/48s/91%              86.7/99.8/92%  [-g]
 | 
|---|
| [3244] | 357 |                        380        12.2/12.9/95%            24/25.3/95%    [-O2 -g]                   
 | 
|---|
| [3259] | 358 | (g)G5-osx-2GHz        1151        20s/20s/99%              40s/23s/170%
 | 
|---|
| [3158] | 359 |                                            (4) 80.8/45/180%
 | 
|---|
| [3251] | 360 |    -O -g                          4.5/4.9/91%              9.3/4.7/197%
 | 
|---|
 | 361 |                                            (4) 18.3/9.4/197%
 | 
|---|
| [3159] | 362 |    -tune=G5                       3.35/3.8/88%             7.1/3.6/196%
 | 
|---|
| [3159] | 365 |                                            (4) 14/7.5/187%
 | 
|---|
| [3244] | 366 | (i)core-osx-1.83GHz    855        11.5/11.5/100%           23/11.6/192%   [-g]
 | 
|---|
| [3158] | 367 |                                            (4) 46/23/199%
 | 
|---|
| [3244] | 368 |               -O2                 3.85/3.89/99%            7.7/3.9/198%   [-O2 -g]
 | 
|---|
| [3159] | 369 |                                            (4) 15.4/7.77/198%
 | 
|---|
 | 370 | 
 | 
|---|
| [3158] | 371 | (j)xeon-osx           2600        2.5/2.5/100%             5.1/2.6/199%
 | 
|---|
 | 372 |                                            (4) 11.5/3.2/362%
 | 
|---|
 | 373 |                                            (5) 17.4/4.77/365%
 | 
|---|
 | 374 | 
 | 
|---|
| [3241] | 375 | (p)ibm-aix-regatta  1750/730      6.8/6.9/98%              13.1/6.75/195%
 | 
|---|
| [3158] | 376 |                                            (4) 26.3/11.7/225%
 | 
|---|
| [3241] | 377 | (q)ibm-aix-meso     3600/1250     3.6/3.75/96%             7.35/3.7/197%
 | 
|---|
 | 378 |                                            (4) 12.46/4.2/298%
 | 
|---|
 | 379 |                                            (5) 219/6.7/280%
 | 
|---|
 | 380 |                                            (6) 24/4.5/530%
 | 
|---|
| [3158] | 381 | 
 | 
|---|
| [3241] | 382 | 
 | 
|---|
 | 383 | (s)sgi-magique         460        60/60/99%       
 | 
|---|
| [3158] | 384 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 385 | 
 | 
|---|
 | 386 | 
 | 
|---|
 | 387 | B.2/ Matrix multiplication mtx = mtx1 * mtx2 
 | 
|---|
| [3244] | 388 |      ~ 2  10^9 double op. / thread
 | 
|---|
| [3241] | 389 | (1) time cpupower 2  (-O3 / -O -g)
 | 
|---|
| [3158] | 390 | (2) time zthr mtx 1 1000   1 thread
 | 
|---|
 | 391 | (3) time zthr mtx 2 1000   2 threads
 | 
|---|
 | 392 | (4) time zthr mtx 4 1000   4 threads
 | 
|---|
 | 393 | (5) time zthr mtx 6 1000   6 threads
 | 
|---|
 | 394 | (6) time zthr mtx 8 1000   8 threads
 | 
|---|
 | 395 | 
 | 
|---|
 | 396 | -----------------------------------------------------------------------------------
 | 
|---|
 | 397 |                      (1)MFLOPS  (2)CPU/Elap/% (2)IndPerf  (3)CPU/Elap/% (3)IndPerf
 | 
|---|
 | 398 | -----------------------------------------------------------------------------------
 | 
|---|
 | 399 | (a)xeon-lx-2.4GHz     1167        6.5/6.5/100%               17.4/8.8/198%
 | 
|---|
 | 400 |                                            (4) 80.5/20.3/397%
 | 
|---|
 | 401 |                                            (5) 114.5/29.6/387%
 | 
|---|
 | 402 |                                            (6) 160/40.3/387%
 | 
|---|
 | 403 |                                            
 | 
|---|
 | 404 | (b)xeon-lx-2.8GHz      920        3.4/3.4/100%               12/6.1/199%
 | 
|---|
 | 405 |                                            (4) 55.8/14/400%
 | 
|---|
| [3270] | 406 |                                            (5) 79.5/20.3/392%
 | 
|---|
 | 407 |                                            (6) 102/25.8/396%
 | 
|---|
| [3273] | 408 | (bb)nxeon-grid49       660        4.3/4.3/100%               9.3/4.7/200%
 | 
|---|
 | 409 |                                            (4) 28.8/7.3/390%
 | 
|---|
 | 410 |                                            (5) 41/10.4/393%
 | 
|---|
 | 411 |                                            (6) 57.5/14.45/397%
 | 
|---|
 | 412 | (bb4)xeon-4c-grid50    660        4.7/4.7/100%               10.7/5.15/199%
 | 
|---|
 | 413 |                                            (4) 27.8/7.1/391%
 | 
|---|
 | 414 |                                            (5) 57.4/10.6/540%
 | 
|---|
 | 415 |                                            (6) 129/16.8/776%
 | 
|---|
| [3158] | 416 | (c)amd-lx              690        6.98/6.98/100%             14.1/8.15/173%
 | 
|---|
 | 417 |                                            (4) 27.7/14.23/194%
 | 
|---|
 | 418 |                                            (5) 41.4/21.07/196%
 | 
|---|
 | 419 |                                            (6) 55.4/27.9/198.7%  
 | 
|---|
| [3187] | 420 | (cc)amd2-lx            675        4.1/4.1/100%               9.55/4.8/198%
 | 
|---|
 | 421 |                                            (4) 20/10.27/195%
 | 
|---|
 | 422 |                                            (5) 32.8/11.16/294%
 | 
|---|
 | 423 |                                            (6) 42.75/13.8/309%
 | 
|---|
| [3158] | 424 | 
 | 
|---|
| [3187] | 425 | 
 | 
|---|
| [3241] | 426 | (d)osf-asc             420        13.5s/13.7s/98%            32/16.5/194%   
 | 
|---|
 | 427 |                                            (4) 67.5/34.4/196%
 | 
|---|
 | 428 | (e)osf-xp1000          648        13/14.1/92%                27.1/27.4/99%
 | 
|---|
| [3158] | 429 |                                            (4) 54/54.7/99.6%
 | 
|---|
 | 430 |                                            (5) 80.6/81/99.6%
 | 
|---|
 | 431 |                                            (6) 107.8/108.3/99.5%
 | 
|---|
| [3270] | 432 | (e')osf-cool                      13/13.22/98%               26/26.1/99%
 | 
|---|
 | 433 |                                            (4) 51.8/51.9/99%
 | 
|---|
| [3241] | 434 | (f)superosf            842        6.1/7.24/84%               12.35/6.29/196%
 | 
|---|
| [3158] | 435 |                                            (4) 24.3/6.31/385%
 | 
|---|
 | 436 |                                            (5) 36.5/10.9/335%
 | 
|---|
 | 437 |                                            (6) 50.1/18.15/276%
 | 
|---|
 | 438 | 
 | 
|---|
| [3259] | 439 | (g)G5-osx-2GHz        1151        23/23.7/97%                46.5/27.5/170%
 | 
|---|
| [3158] | 440 |                                            (4) 93.4/49.4/189%
 | 
|---|
| [3251] | 441 |   -O -g                           6.2/6.2/100%                14.2/7.2/197%
 | 
|---|
 | 442 |                                            (4) 28.3/14.36/197%
 | 
|---|
| [3159] | 443 |   -tune=G5                        5.7/5.8/98%                13.3/6.8/197%
 | 
|---|
 | 444 |                                            (4) 26.8/13.56/197%
 | 
|---|
 | 445 |                                            (6) 53.8/27.25/197%
 | 
|---|
| [3251] | 446 | (h)G4-osx-1.25GHz      333        23.5/24.5/96%                              [-O2]
 | 
|---|
| [3158] | 447 | (i)core-osx-1.83GHz    855        12.6/12.7/100%             25.8/13.4/194% 
 | 
|---|
 | 448 |                                            (4) 51.6/26/199%
 | 
|---|
| [3159] | 449 |             -O2                   4.25/4.5/94%               10.6/5.36/198%
 | 
|---|
 | 450 |                                            (4) 20.87/10.68/198%
 | 
|---|
 | 451 |       -O2 2 jobs //           2 x 5/5.4/92%
 | 
|---|
| [3158] | 452 | (j)xeon-osx           2600        2.8/2.8/99%                9.3/4.66/199%
 | 
|---|
 | 453 |                                            (4) 31.4/8.6/364%
 | 
|---|
 | 454 |                                            (5) 47.1/12.96/364%
 | 
|---|
 | 455 |                                            (6) 62.8/17.38/362%
 | 
|---|
 | 456 | 
 | 
|---|
| [3241] | 457 | (p)ibm-aix-regatta  1750/730      9.5/9.7/98%                18.3/16.0/114%
 | 
|---|
| [3158] | 458 |                                            (4) 38.3/24.7/155%
 | 
|---|
| [3241] | 459 | (q)ibm-aix-meso     3600/1250     2.3/2.3/99%                5.1/2.64/194%   (compiled with -O3)
 | 
|---|
 | 460 |                                            (4) 11.4/4.16/272%
 | 
|---|
 | 461 |                                            (5) 20.2/5.85/344%
 | 
|---|
 | 462 |                                            (6) 29.9/6.74/442%
 | 
|---|
| [3179] | 463 | 
 | 
|---|
| [3266] | 464 | (s)sgi-magique         400        44/44.3/99%                96.5/55/176%
 | 
|---|
| [3179] | 465 | 
 | 
|---|
| [3158] | 466 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 467 | 
 | 
|---|
| [3254] | 468 | 
 | 
|---|
 | 469 | B.4/ Operations on double arrays - measurements with spar 
 | 
|---|
 | 470 |   csh> time spar 2 1 2000 2000
 | 
|---|
 | 471 |   (1) cpupower 2  MFLOPS
 | 
|---|
 | 472 |   (2) MFLOPS (double) spar 
 | 
|---|
 | 473 |   (3) time spar 2 5 1000 2000 CPU/Elap/% 
 | 
|---|
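The CPU/Elap/% triplets in these tables follow the layout of csh `time` output: CPU seconds, elapsed (wall-clock) seconds, and their ratio as a percentage. The sketch below, in Python, recomputes the third field from the first two. It assumes the first field is the total CPU time of the process; `parse_timing` and `cpu_usage_percent` are hypothetical helper names, not part of the benchmark tools.

```python
def parse_timing(entry):
    """Split a 'CPU/Elap/%' entry such as '37.6/41.2/91%' into three floats."""
    cpu, elapsed, percent = entry.strip().split("/")
    return float(cpu), float(elapsed), float(percent.rstrip("%"))


def cpu_usage_percent(cpu, elapsed):
    """CPU time as a percentage of elapsed time (assumed meaning of the % field)."""
    return 100.0 * cpu / elapsed


# Example with the (e)osf-xp1000 entry of the spar table.
cpu, elapsed, quoted = parse_timing("37.6/41.2/91%")
recomputed = cpu_usage_percent(cpu, elapsed)
```

Values well above 100% (for example 385% in a 4-job run) indicate that several processors were busy at once; values far below 100% suggest the job was waiting, for instance on I/O or a shared machine.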
 | 474 | -----------------------------------------------------------------------------------
 | 
|---|
 | 475 |                      (1)MFLOPS      (2)CPU / %         (3)CPU/Elap/%
 | 
|---|
 | 476 | -----------------------------------------------------------------------------------
 | 
|---|
 | 477 | (a)xeon-lx-2.4GHz      53       ~ 20-35 MFLOPS , 90%     20/20.2/99%       [-g -O] 
 | 
|---|
 | 478 |                                   
 | 
|---|
 | 479 |         
 | 
|---|
 | 480 | (b)xeon-lx-2.8GHz      65        
 | 
|---|
 | 481 |                                   
 | 
|---|
 | 482 | (c)amd-lx              95       ~ 20-40 MFLOPS , 99%     17.2/17.2/100%    [-g -O] 
 | 
|---|
 | 483 | 
 | 
|---|
 | 484 | 
 | 
|---|
 | 485 | (d)osf-asc                     
 | 
|---|
 | 486 |                                   
 | 
|---|
 | 487 | (e)osf-xp1000          32       ~ 15-25 MFLOPS , 90%     37.6/41.2/91%      [-g -O]  
 | 
|---|
 | 488 | (f)superosf                    
 | 
|---|
 | 489 | 
 | 
|---|
| [3259] | 490 | (g)G5-osx-2GHz         88       ~ 10-25 MFLOPS , 99%     45/45/100%         [-g -O] or [-g -O2]
 | 
|---|
| [3254] | 491 | (h)G4-osx-1.25GHz      25       ~ 8-16  MFLOPS , 92%     45.5/52/90%        [-g -O2]
 | 
|---|
 | 492 |                                   
 | 
|---|
 | 493 | (i)core-osx-1.83GHz             
 | 
|---|
 | 494 | 
 | 
|---|
 | 495 | (j)xeon-osx           
 | 
|---|
 | 496 | 
 | 
|---|
 | 497 | 
 | 
|---|
 | 498 | (p)ibm-aix-regatta   130        
 | 
|---|
 | 499 | 
 | 
|---|
 | 500 | (q)ibm-aix-meso      150        ~ 80-100 MFLOPS , 90%   5./23/22%     [-O3]    
 | 
|---|
 | 501 | 
 | 
|---|
 | 502 | 
 | 
|---|
 | 503 | 
 | 
|---|
 | 504 | (s)sgi-magique         460        
 | 
|---|
 | 505 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 506 | 
 | 
|---|
| [3258] | 507 | B.5/  Computation/comparison with JET/tjet 
 | 
|---|
 | 508 | csh> time tjet 10 2000 2000   OR tjet 10 2000 1000
 | 
|---|
 | 509 |  (1) TCPU EltAccess C/pointers 
 | 
|---|
 | 510 |  (2) TCPU m1*c1+m2*c2+m3*c3 C/pointers 
 | 
|---|
 | 511 |  (3) TCPU EltAccess  SOPHYA
 | 
|---|
| [3259] | 512 |  (4) TCPU m1*c1+m2*c2+m3*c3 SOPHYA  / methods (MultCst, AddArr ...)
 | 
|---|
| [3258] | 513 |  (5) TCPU m1*c1+m2*c2+m3*c3 SOPHYA JET
 | 
|---|
| [3254] | 514 | 
 | 
|---|
| [3258] | 515 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 516 |                             (1)        (2)         (3)         (4)           (5)             
 | 
|---|
 | 517 |  -----------------------------------------------------------------------------------
 | 
|---|
 | 518 | (b)xeon-lx-2.8GHz-icc       0.87       0.63        1.55      2.7/1.6         0.57
 | 
|---|
 | 519 | (c)amd-lx                   0.94       0.79        1.85      3.4/2.1         0.76
 | 
|---|
 | 520 | 
 | 
|---|
| [3259] | 521 | (e')osf-cool                2.85       2.45        3.1       6.5/5.5         4.1 
 | 
|---|
 | 522 | 
 | 
|---|
 | 523 | (g)G5-osx-2GHz              1.5        0.61        2.1       4.1/1.6         0.6   (-g -O2)
 | 
|---|
 | 524 |     -tune=G5 -fast          1.1        0.62        1.3       4.1/1.6         0.58  
 | 
|---|
 | 525 | (h)G4-osx-1.25GHz           3.86       2.2         5         9.4/6.2         3     (-g -O2)
 | 
|---|
| [3258] | 526 | (i)core-osx-1.83GHz         1.1        0.49        1.6       2.8/1.7         0.68
 | 
|---|
 | 527 | 
 | 
|---|
 | 528 | (q)ibm-aix-meso             0.43       0.27        0.52      1.12/0.75       0.35  
 | 
|---|
| [3266] | 529 | 
 | 
|---|
 | 530 | (s)sgi-magique              2.45       1.9         5.65      7.45/6.3        2.8  (-O -g3)
 | 
|---|
| [3258] | 531 | -----------------------------------------------------------------------------------
 | 
|---|
 | 532 | 
 | 
|---|
 | 533 | 
 | 
|---|
| [3187] | 534 | C/ FFT computation (FFTW, FFTPack)
 | 
|---|
 | 535 | -------------------------------
 | 
|---|
| [3158] | 536 | 
 | 
|---|
 | 537 | (1) time cpupower 2 
 | 
|---|
 | 538 | (2) time tfft 2000000 W d 0 0 (with FFTW)
 | 
|---|
 | 539 | (3) time tfft 2000000 P d 0 0 (with FFTPack_Sophya)
 | 
|---|
 | 540 | 
 | 
|---|
 | 541 | IndPerf=1000/TCPU  
 | 
|---|
 | 542 | 
 | 
|---|
 | 543 | 
 | 
|---|
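The performance index defined above, IndPerf=1000/TCPU, is a simple inverse of the CPU time, so a faster run gets a higher index. A minimal Python sketch (the function name `ind_perf` is hypothetical) reproduces a few of the values listed in the table that follows; the tabulated figures appear to be rounded by hand, so small differences are expected.

```python
def ind_perf(tcpu_seconds):
    """Performance index as defined in this section: 1000 / CPU time in seconds."""
    return 1000.0 / tcpu_seconds


# CPU times taken from the tfft rows below.
fftpack_xeon_24 = ind_perf(7.4)   # (a)xeon-lx-2.4GHz, FFTPack, listed as 135
fftw_core_183 = ind_perf(4.6)     # (i)core-osx-1.83GHz, FFTW, listed as 217
fftw_xeon_24 = ind_perf(5.5)      # (a)xeon-lx-2.4GHz, FFTW, listed as 180
```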
 | 544 | -----------------------------------------------------------------------------------
 | 
|---|
 | 545 |                      (1)MFLOPS     (2)CPU/Elap/%   (2)IndPerf   (3)CPU/Elap/%  (3)IndPerf
 | 
|---|
 | 546 | -----------------------------------------------------------------------------------
 | 
|---|
 | 547 | (a)xeon-lx-2.4GHz     1167        5.5/5.6/97%        180         7.4/7.4/100%     135    
 | 
|---|
 | 548 |        -O3 -g                                                    6.8/7.1/96%      147    
 | 
|---|
| [3270] | 549 | 
 | 
|---|
 | 550 | (b)xeon-lx-2.8GHz      920        3.6/3.7/98%                    3.7/3.8/99%
 | 
|---|
 | 551 |                                                           ~2x    6.5/8.8/73%
 | 
|---|
 | 552 |                                                           ~4x    8/14/55%   (15 sec elapsed)
 | 
|---|
| [3273] | 553 | (bb)nxeon-grid49       660                                       3.3/3.3/99% 
 | 
|---|
 | 554 |                                                           ~2x    3.3/3.4/100%  (4 sec elapsed)
 | 
|---|
 | 555 |                                                           ~4x    4.6/4.6/99%   (5 sec elapsed)
 | 
|---|
 | 556 |                                                           ~8x    4.5/4.5/50%   (9 sec elapsed)
 | 
|---|
 | 557 | (bb4)xeon-4c-grid50    660                                       3.6/3.6/99% 
 | 
|---|
 | 558 |                                                           ~2x    3.6/3.6/100%  (4 sec elapsed)
 | 
|---|
 | 559 |                                                           ~4x    4.4/4.4/99%   (5 sec elapsed)
 | 
|---|
 | 560 |                                                           ~8x    7.1/7.1/99%   (7  sec elapsed)
 | 
|---|
 | 561 |                                                           ~16x   7.1/14.4/50%   (15 sec elapsed)
 | 
|---|
| [3270] | 562 |          
 | 
|---|
| [3241] | 563 | (c)amd-lx              690        3.2/3.75/86%                   4.2/4.2/100%     238
 | 
|---|
| [3159] | 564 |                                                             ~2x  4.7/4.7/99%
 | 
|---|
| [3187] | 565 | (cc)amd2-lx            675        2.8/2.8/99%                    3.56/3.58/99%
 | 
|---|
| [3158] | 566 | 
 | 
|---|
| [3241] | 567 | (d)osf-asc             420        13.3/13.9/95.7%     75        12.2/17.6/70%     82
 | 
|---|
 | 568 | (e)osf-xp1000          648        9.9/10.2/97%       101        9.3/9.46/98.5%    107     
 | 
|---|
| [3270] | 569 | (e')cool                          9.2/10.1/91%                  9.1/9.3/98%
 | 
|---|
| [3241] | 570 | (f)superosf            842        6./6.22/96.7%      166        5.1/5.18/97.4%    196     
 | 
|---|
| [3158] | 571 | 
 | 
|---|
| [3259] | 572 | (g)G5-osx-2GHz        1151        8.5/8.7/97.5%      120        11.5/11.6/99%     87
 | 
|---|
| [3251] | 573 |     -O -g                         5.1/5.2/98%        190        5.4/5.5/99%       180
 | 
|---|
 | 574 |     -tune=G5                      5/5.1/98%          200        5.5/5.6/97%       180
 | 
|---|
 | 575 | (h)G4-osx-1.25GHz       92        15.2/15.9/96%       66        23.8/34.1/70%     42    [-g]
 | 
|---|
| [3244] | 576 |                        380        8.8/10.3/86%                  14.7/20.9/65%           [-O2 -g]
 | 
|---|
| [3158] | 577 | (i)core-osx-1.83GHz    855        4.6/4.75/97%       217        7/7.08/99%        142
 | 
|---|
| [3159] | 578 |          -O2                      3.2/3.2/99%        312        3.6/3.6/99%       277
 | 
|---|
 | 579 |    -O2 2 jobs //              2 x 3.9/4.3/90%                 2x  4.7/5/91%       200     
 | 
|---|
| [3158] | 580 | (j)xeon-osx           2600                                      2.6/2.6/98%       384
 | 
|---|
| [3159] | 581 |        2 jobs //                                           ~2x  3.6/4.1/87%       250
 | 
|---|
 | 582 |        4 jobs //                                           
 | 
|---|
| [3158] | 583 | 
 | 
|---|
| [3241] | 584 | (p)ibm-aix-regatta 1750/730       6.25/18.9/33%      160        5.25/15.7/33%     190
 | 
|---|
 | 585 | (q)ibm-aix-meso    3600/1250      3.95/4.3/91%       250        3.82/4./94%       260
 | 
|---|
 | 586 |        2 jobs //                                           ~2x  3.88/4.2/92%      250
 | 
|---|
| [3270] | 587 |                                   2.8/4.76/59%                  2.1/4.45/47%     [-O3]
 | 
|---|
 | 588 |                                                           ~2x   2.5/4.7/54%
 | 
|---|
 | 589 |                                                           ~4x   2.5/4.8/55% 
 | 
|---|
| [3158] | 590 | 
 | 
|---|
| [3241] | 591 | 
 | 
|---|
 | 592 | (s)sgi-magique         460        22/22/98%           42         24.5/25/99%       40           
 | 
|---|
| [3158] | 593 |  -----------------------------------------------------------------------------------
 | 
|---|
| [3187] | 594 | 
 | 
|---|
 | 595 | 
 | 
|---|
| [3241] | 596 | D/ Matrix inversion with LAPACK
 | 
|---|
 | 597 | -------------------------------
 | 
|---|
| [3187] | 598 | 
 | 
|---|
| [3241] | 599 | lpk inverse 1000,1000 0 
 | 
|---|
 | 600 | ---> compute time for inversion with LAPACK 
 | 
|---|
 | 601 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 602 |              CPU/Elap/%       (1)         
 | 
|---|
 | 603 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 604 | (a)xeon-lx-2.4GHz          5.6/~100%    
 | 
|---|
| [3270] | 605 | (b)xeon-lx-2.8GHz          5.34/~90%
 | 
|---|
| [3241] | 606 | (c)amd-lx                  5.5/5.5/99%      
 | 
|---|
| [3187] | 607 | 
 | 
|---|
| [3241] | 608 | (d)osf-asc 
 | 
|---|
 | 609 | (e')cool                   2.8/2.9/95% 
 | 
|---|
 | 610 | (f)superosf           
 | 
|---|
 | 611 | 
 | 
|---|
| [3244] | 612 | (h)G4-osx-1.25GHz          2.3/~100%   [-O2 -g]
 | 
|---|
| [3259] | 613 | (g)G5-osx-2GHz             0.8/~100%
 | 
|---|
| [3251] | 614 |     -O -g                  0.86/~100%            
 | 
|---|
| [3245] | 615 | (i)core-osx-1.83GHz        1.93/~100%
 | 
|---|
| [3241] | 616 |               -O2   
 | 
|---|
 | 617 | (p)ibm-aix-regatta        
 | 
|---|
 | 618 | (q)ibm-aix-meso            0.55/~100%
 | 
|---|
 | 619 | 
 | 
|---|
| [3266] | 620 | (s)sgi-magique             5.3/~90%
 | 
|---|
| [3241] | 621 | --------------------------------------------------------------------------------------------
 | 
|---|
 | 622 | 
 | 
|---|
 | 623 | 
 | 
|---|
 | 624 | K/ Efficiency of lock (mutex) handling with threads and arrays 
 | 
|---|
| [3187] | 625 | ----------------------------------------------------------------------
 | 
|---|
| [3241] | 626 | (32 threads operating on 2000 vectors ~ 64000 lock/unlock/wait/broadcast)
 | 
|---|
 | 627 | 
 | 
|---|
 | 628 | (1) time zthr syncp 32 2000 4 
 | 
|---|
 | 629 | (2) time zthr sync 32 2000 4 
 | 
|---|
 | 630 | (3) time zthr syncp 4 15000 130 
 | 
|---|
 | 631 | (4) time zthr sync  4 15000 130 
 | 
|---|
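The count of about 64000 synchronisation operations quoted above is simply 32 threads times 2000 vectors. The Python sketch below illustrates that counting with a lock/wait/broadcast pattern. It is not the zthr program: the pipeline ordering (thread k handles a vector only after thread k-1), the function name `run_pipeline`, and the per-vector work are all assumptions made for illustration.

```python
import threading


def run_pipeline(n_threads, n_vectors, size=4):
    """Each thread visits every vector once, in pipeline order (assumed scheme)."""
    vectors = [[0.0] * size for _ in range(n_vectors)]
    stage = [0] * n_vectors          # index of the thread allowed to touch vector v
    cond = threading.Condition()     # one mutex + condition variable for all vectors
    counter = {"sync_ops": 0}

    def worker(k):
        for v in range(n_vectors):
            with cond:                                       # lock
                while stage[v] != k:
                    cond.wait()                              # wait for thread k-1
                vectors[v] = [x + 1.0 for x in vectors[v]]   # toy work, done under the lock
                stage[v] = k + 1
                counter["sync_ops"] += 1
                cond.notify_all()                            # broadcast, then unlock

    threads = [threading.Thread(target=worker, args=(k,)) for k in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["sync_ops"], vectors


# Small run; the benchmark sizes would be run_pipeline(32, 2000).
ops, result = run_pipeline(4, 50)
```

With 4 threads and 50 vectors the sketch performs 4 x 50 = 200 lock/broadcast cycles; the same product gives 32 x 2000 = 64000 for the configuration quoted in this section.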
 | 632 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 633 |              CPU/Elap/%       (1)             (2)              (3)              (4) 
 | 
|---|
 | 634 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 635 | (a)xeon-lx-2.4GHz         23.5/14/168%    4.3/1.2/365%     7.9/5.5/142%      4/2.15/190%
 | 
|---|
 | 636 |       Before ThSafeOp     17/178%
 | 
|---|
| [3187] | 637 |       with -O3 -g 
 | 
|---|
 | 638 | (b)xeon-lx-2.8GHz (2)     
 | 
|---|
| [3273] | 639 | (bb4)xeon-4c-grid50       1/1/96%         1.6/1.2/133%     4.9/4.05/120%     6/5/122%
 | 
|---|
| [3241] | 640 | (c)amd-lx                 0.6/1/63%        0.6/1/60%       3.5/3.5/102%      2.6/2.7/98% 
 | 
|---|
| [3187] | 641 | 
 | 
|---|
| [3241] | 642 | (d)osf-asc               4.5/3.4/132%     3.35/2/170%     15.8/10.5/150%     13/8/163%
 | 
|---|
 | 643 |      5.4/100%(NoThSafe)
 | 
|---|
 | 644 | (e')cool                 1.3/1.37/95%     1.35/1.5/89%     5.3/5.3/99%       5.2/5.2/99%    
 | 
|---|
| [3187] | 645 | (f)superosf (1)          
 | 
|---|
| [3251] | 646 | 
 | 
|---|
 | 647 | 
 | 
|---|
| [3259] | 648 | (g)G5-osx-2GHz (2)     2.6/130% (NoThSafe)                   
 | 
|---|
| [3251] | 649 |     -O -g                2.6/1.7/150%      6.8/3.7/187%    4/2.75/142%       4.7/3/155%
 | 
|---|
 | 650 | (h)G4-osx-1.25GHz (1)    40.5/42.6/95%    42.2/43.5/97%      [-g]
 | 
|---|
| [3244] | 651 |                           3.9/4.3/89%      3.5/4/89%       3.8/4/95%         4.3/4.6/93%
 | 
|---|
| [3187] | 652 | 
 | 
|---|
| [3245] | 653 | (i)core-osx-1.83GHz      7.7/7.1/108%     7.8/6.7/116%    30.2/29.6/102%    30.3/29.2/104%   [-g]
 | 
|---|
 | 654 |               -O2        2.7/1.8/152%     6/3.16/190%     3.4/2.4/142%      3.2/2.5/164%     [-O2 -g]
 | 
|---|
| [3241] | 655 | (j)xeon-osx               
 | 
|---|
 | 656 |       Before ThSafeOp     2.55/143%
 | 
|---|
| [3187] | 657 | 
 | 
|---|
| [3241] | 658 | (p)ibm-aix-regatta        4.7/111%
 | 
|---|
 | 659 | (q)ibm-aix-meso           7.5/2.8/300%     17/3.8/450%    8.2/3.05/270%      4.85/2.43/200%    
 | 
|---|
 | 660 | 
 | 
|---|
 | 661 | --------------------------------------------------------------------------------------------
 | 
|---|
 | 662 | 
 | 
|---|
 | 663 | 
 | 
|---|
 | 664 | 
 | 
|---|
 | 665 | L/ I/O and PPF 
 | 
|---|
 | 666 | -----------------
 | 
|---|
 | 667 | Writing/reading n=10^7 rows of int + 6 doubles, total ~ 500 MB 
 | 
|---|
 | 668 | (1) time tstdtable w xx.ppf swap 10000000 1024 0
 | 
|---|
 | 669 | (2) time tstdtable r xx.ppf swap 10000000 1024 0
 | 
|---|
 | 670 | (3) time tstdtable w xx.ppf swap 50000000 1024 0
 | 
|---|
 | 671 | (4) time tstdtable r xx.ppf swap 50000000 1024 0
 | 
|---|
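The total data volume quoted for this test can be checked with a short calculation. The Python sketch below assumes a 4-byte int and 8-byte doubles with no padding or per-row overhead in the file; the actual PPF layout may add headers or tags, so the figures are only an estimate. The names `ROW_FORMAT` and `table_bytes` are hypothetical.

```python
import struct

# One row: 1 int + 6 doubles, packed without alignment padding (assumption).
ROW_FORMAT = "<i6d"
ROW_BYTES = struct.calcsize(ROW_FORMAT)   # 4 + 6 * 8 = 52 bytes


def table_bytes(n_rows):
    """Estimated payload size in bytes for n_rows rows."""
    return n_rows * ROW_BYTES


small = table_bytes(10_000_000)   # runs (1) and (2)
large = table_bytes(50_000_000)   # runs (3) and (4)
```

Under these assumptions the 10^7-row table holds about 520 million bytes, consistent with the quoted total of roughly 500 megabytes, and the 5x10^7-row table about 2.6 billion bytes.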
 | 672 | 
 | 
|---|
 | 673 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 674 |              CPU/Elap/%       (1)              (2)              (3)                (4)
 | 
|---|
 | 675 | -------------------------------------------------------------------------------------------
 | 
|---|
 | 676 | (a)xeon-lx-2.4GHz          17/26/63%       5.5/5.6/94%        
 | 
|---|
| [3270] | 677 | 
 | 
|---|
 | 678 | (b)xeon-lx-2.8GHz          7/18.5/40%      2.7/2.8/100%                                   7000000
 | 
|---|
 | 679 | 
 | 
|---|
| [3241] | 680 | (c)amd-lx                  5.9/6./97%      3.4/3.4/100%      30/32/93%      24/165/13% ?
 | 
|---|
 | 681 | 
 | 
|---|
 | 682 | (d)osf-asc 
 | 
|---|
 | 683 | (e')cool                   15/30/50%       13/13/99%
 | 
|---|
 | 684 | (f)superosf           
 | 
|---|
 | 685 | 
 | 
|---|
| [3259] | 686 | (g)G5-osx-2GHz             14/14.2/98%     6/6.14/99% 
 | 
|---|
| [3251] | 687 | 
 | 
|---|
 | 688 | (h)G4-osx-1.25GHz          26/29.7/87%     15.7/38.6/41%               [-O2 -g]
 | 
|---|
| [3245] | 689 | (i)core-osx-1.83GHz        37/37.8/98%     22/48.4/45%                                 [-g]
 | 
|---|
 | 690 |               -O2          10.5/17.4/55%   11.2/41.2/27%                               [-O2 -g]
 | 
|---|
| [3241] | 691 | (p)ibm-aix-regatta        
 | 
|---|
 | 692 | (q)ibm-aix-meso           5.5/16.8/38%    5.7/13.2/43%      32.7/85/39%    29/60/49%
 | 
|---|
| [3270] | 693 |                           6/11/55%        4/9/44%
 | 
|---|
 | 694 |                           6.3/10.6/60%    4/6/64%
 | 
|---|
 | 695 |    2 reads //                      ~2x  4/6/60% Elapsed 7 sec
 | 
|---|
 | 696 |    1 write + 2 read //    6.3/10.4/60%  ~2 3.8/6.4/59%  Elapsed < 11 sec                
 | 
|---|
 | 697 |    1 write + 3 read //                                 19 sec
 | 
|---|
 | 698 |    4 read //                                           6 sec
 | 
|---|
| [3241] | 699 | --------------------------------------------------------------------------------------------
 | 
|---|
| [3258] | 700 | 
 | 
|---|
 | 701 | 
 | 
|---|
 | 702 | 
 | 
|---|