| [3158] | 1 | ------------------------------------------------------------------------------------- | 
|---|
|  | 2 | Performance comparison of different machines for compilation / execution (computation) | 
|---|
|  | 3 | ----------------------- | 
|---|
|  | 4 | Measurements made in January 2007 ,       R. Ansari / C. Magneville | 
|---|
|  | 5 | ------------------------------------------------------------------------------------- | 
|---|
|  | 6 |  | 
|---|
|  | 7 | (a) eros3 : dual-processor, dual-core Xeon@2.4 GHz Linux (xeon-lx-2.4GHz)  , gcc 3.2 | 
|---|
|  | 8 | (b) ccali : dual-processor, dual-core Xeon@2.8 GHz Linux (xeon-lx-2.8GHz)  , icc 8.0 or 9.0 | 
|---|
| [3244] | 9 | Compilation flags: [-O -g] | 
|---|
| [3273] | 10 | (bb) grid49 : dual-processor, dual-core new Xeon 5140 @ 2.33 GHz | 
|---|
|  | 11 | (bb4) grid50 : dual-processor, quad-core new Xeon E5345 @ 2.33 GHz --> 8 cores | 
|---|
| [3179] | 12 | (c) sgsda: dual-processor AMD Opteron 248 @ 2.2 GHz (amd-lx- | 
|---|
| [3244] | 13 | Compilation flags: [-O -g] | 
|---|
| [3241] | 14 | (cc) grid-saclay: dual-processor, dual-core AMD Opteron 275 @ 2.2 GHz (amd275-lx) | 
|---|
| [3244] | 15 | Compilation flags: [-O -g] | 
|---|
| [3158] | 16 |  | 
|---|
|  | 17 | (d) asc: dual-processor Alpha (@ ~1 GHz) server DS20 OSF (osf1)  , cxx 6.5 (osf-asc) | 
|---|
|  | 18 | (new asc 420 MFLOPS, less powerful than the old asc 800 MFLOPS) | 
|---|
|  | 19 | (e) xp1000-dapnia: alpha xp1000 @ ~ 600 MHz ? OSF1 , cxx ? (osf-xp1000) | 
|---|
| [3241] | 20 | (e') cool: alpha xp1000 @ ~ 667 MHz ? OSF1 5.1 , cxx 6.3 | 
|---|
| [3158] | 21 | (f) superosf-dapnia: multi-proc alphaServer ES80 6 procs EV7 @ 1 GHz (super-osf) | 
|---|
|  | 22 |  | 
|---|
| [3259] | 23 | (g) ccsvx01: dual-processor XServe G5 @~1.8-2GHz (Darwin/OSX) (G5-osx-2GHz) , gcc 3.3 | 
|---|
| [3158] | 24 | (h) PowerBook-Reza : Apple G4 @ 1.25 GHz (G4-osx-1.25GHz) , gcc 3.3 | 
|---|
|  | 25 | (i) MacBook-Reza: Apple / dual-core Intel Core @ 1.83 GHz (core-osx-1.83GHz) gcc 4 | 
|---|
|  | 26 | (j) MacPro-Grosdidier : Apple / 2 dual-core Xeon @ 3 GHz gcc 4.0.1 , SOPHYA compiled with -O2 -g | 
|---|
|  | 27 |  | 
|---|
| [3187] | 28 | (p) IBM-AIX regatta , xlC , IBM eServer pSeries 655 , 8 proc power4 @ 1.1 GHz | 
|---|
| [3241] | 29 | (q) IBM-AIX meso , AIX 5.3, xlC V8 , IBM Power5 , 8 dual-core procs P575 @ 1.9 GHz | 
|---|
| [3158] | 30 |  | 
|---|
| [3241] | 31 | (s) SGI-IRIX64 magique, CC | 
|---|
| [3158] | 32 |  | 
|---|
|  | 33 | NOTES : | 
|---|
|  | 34 | - On the Xeon machines, processes / threads interact with respect to | 
|---|
|  | 35 | CPU occupancy. A factor of 3 is lost in multi-thread/multi-task performance. | 
|---|
|  | 36 | The MacPro machine with OSX nevertheless copes better. | 
|---|
|  | 37 | - Effect of the system or of the motherboard ??? | 
|---|
|  | 38 |  | 
|---|
| [3244] | 39 | Compilation flags | 
|---|
|  | 40 | - Default compilation flags [-O -g] in general | 
|---|
|  | 41 | - On eros3 (xeon-linux gcc 3.3) [-O -g] OR [-O3 -g] | 
|---|
|  | 42 | - On Darwin [-g] or [-O2 -g] (or [-tune G5] on XServe G5) | 
|---|
|  | 43 | On the Macs (G4/G5 in particular), large difference between -g and -Ox -g | 
|---|
|  | 44 | but little difference between -O -O2 -O3 | 
|---|
|  | 45 | - On the aix-meso machine [-O -g] or [-O3 -g] | 
|---|
| [3187] | 46 |  | 
|---|
| [3251] | 47 | X/ Raw cpupower performance and SPEC data (http://www.spec.org) | 
|---|
|  | 48 | ---------------------------------------------------------------------- | 
|---|
| [3244] | 49 |  | 
|---|
| [3251] | 50 | (1) MFLOPS  -> cpupower 2   (x/y : -O -g / -O3) | 
|---|
|  | 51 | SPECint2000 (3) / SPECfp2000 (2) (http://www.spec.org) | 
|---|
|  | 52 |  | 
|---|
| [3258] | 53 | X.1/ Double-precision computation performance | 
|---|
| [3251] | 54 | csh> cpupower 0 3000000  5 | 
|---|
|  | 55 | 3 10^6 double operations - on 3x3 10^6 doubles in memory (~50 MB) | 
|---|
|  | 56 | ===> ~ 24 MB / MFLOPS | 
|---|
|  | 57 | csh> cpupower 2 | 
|---|
|  | 58 | 1.6 10^9 double operations - on 3x20000 doubles (~0.5 MB) | 
|---|
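
The ~ 24 MB per MFLOPS ratio quoted above is what one gets if every double operation moves three 8-byte values (two operands read, one result written). That access pattern is an assumption about the cpupower 0 loop rather than something stated in these notes; under it, the memory throughput column (1) of the X.1 table is simply 24 times the MFLOPS column (2). A minimal Python sketch, with an illustrative function name:

```python
BYTES_PER_DOUBLE = 8
VALUES_PER_OP = 3  # assumption: two operands read and one result written per operation


def memory_throughput_mb_s(mflops):
    """Estimate memory traffic in MB/s (10^6 bytes/s) from a measured MFLOPS figure."""
    return mflops * VALUES_PER_OP * BYTES_PER_DOUBLE


if __name__ == "__main__":
    # 85 MFLOPS is the cpupower 0 value listed for machine (b) with -O
    print(memory_throughput_mb_s(85))  # 2040, matching column (1) for (b)
```
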
|  | 59 |  | 
|---|
|  | 60 |  | 
|---|
|  | 61 | Compiled with -O  (optimisation) | 
|---|
|  | 62 | (1) cpupower 0 : memory throughput in MB/s | 
|---|
|  | 63 | (2) cpupower 0  , MFLOPS | 
|---|
|  | 64 | (5) cpupower 2 ,  MFLOPS | 
|---|
|  | 65 |  | 
|---|
|  | 66 | Compiled with -g (debug / no optimisation) | 
|---|
|  | 67 | (3) cpupower 0  , MFLOPS | 
|---|
|  | 68 | (6) cpupower 2  , MFLOPS | 
|---|
|  | 69 |  | 
|---|
|  | 70 | Compiled with -O3 or -fast ... (aggressive optimisation) | 
|---|
|  | 71 | (4) cpupower 0  , MFLOPS | 
|---|
|  | 72 | (7) cpupower 2  , MFLOPS | 
|---|
|  | 73 |  | 
|---|
|  | 74 |  | 
|---|
|  | 75 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 76 | MFLOPS       |(1) MB/s|   (2)     (3)      (4)   |    (5)       (6)       (7) | 
|---|
|  | 77 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 78 | (a)xeon-lx-2.4GHz    | 1290   |   53      53       55    |    338       340       320 | 
|---|
| [3266] | 79 | (b)xeon-lx-2.8GHzicc | 2040   |   85      80       83    |    914       409       914 | 
|---|
| [3273] | 80 | (bb)n-xeon-grid49    | 3000   |   125     85       130   |    660       500       660 | 
|---|
|  | 81 | (bb4)n-xeon4c-grid50 | 2568   |   107     103      109   |    655       500       655 | 
|---|
| [3258] | 82 | (c)amd-lx            | 1560   |   65      77       68    |    666       314       686 | 
|---|
| [3251] | 83 | (cc)amd2-lx          |        | | 
|---|
|  | 84 |  | 
|---|
|  | 85 | (e')osf-cool         |  768   |   32      15       32    |    630       150       660 | 
|---|
|  | 86 | (f)superosf | 
|---|
|  | 87 |  | 
|---|
|  | 88 | (g)G5-osx-1 GHz      | 2100   |   88      68       88    |   1000       255      1073 | 
|---|
|  | 89 | (h)G4-osx-1.25GHz    |  600   |   25      16       25    |    417        93       430 | 
|---|
| [3258] | 90 | (i)core-osx-1.83GHz  | 2500   |  107      75      107    |    855       309       884 | 
|---|
| [3251] | 91 | (j)xeon-osx | 
|---|
|  | 92 |  | 
|---|
| [3266] | 93 | (p)ibm-aix-regatta   | 3100   |  130      55      133    |    730       115      1750  (32 bits) | 
|---|
|  | 94 | (q)ibm-aix-meso      | 5700   |  240      70      320    |   1500       220      3400  (32 bits) | 
|---|
|  | 95 |  | 
|---|
|  | 96 | (s)sgi-magique       |  336   |  14       7       15     |    340        40       460  (32 bits) | 
|---|
| [3251] | 97 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 98 |  | 
|---|
| [3258] | 99 | X.2/  Comparison of int, float, double performance | 
|---|
|  | 100 | cpupower compiled with -O | 
|---|
|  | 101 |  | 
|---|
|  | 102 | (1) float , cpupowerF 0 3000000 5 / cpupowerF 2 | 
|---|
|  | 103 | -> MFLOPS (computing power on float) | 
|---|
|  | 104 | (2) double, cpupowerD 0 3000000 5 / cpupowerD 2  (same as table X.1) | 
|---|
|  | 105 | -> MDBLOPS (computing power on double) | 
|---|
|  | 106 | (3) int, cpupowerI 0 3000000 5 / cpupowerI 2 | 
|---|
| [3266] | 107 | -> MINTOPS  (computing power on int=4 bytes) | 
|---|
| [3258] | 108 | (4) long (or long long (*)) cpupowerL 0 3000000 5 / cpupowerL 2 | 
|---|
| [3266] | 109 | -> MLONOPS  (computing power on long=8 bytes) | 
|---|
| [3258] | 110 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 111 | MFLOPS       |   (1)MFLOPS       (2)MDBLOPS       (3)MINTOPS       (4)MLONOPS | 
|---|
|  | 112 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 113 | (a)xeon-lx-2.4GHz    | | 
|---|
| [3273] | 114 | (b)xeon-lx-2.8GHzicc |    166/905         90/900           166/1500         88/522    (*) | 
|---|
|  | 115 | (bb)nxeon-grid49     |    250/1030        125/660          250/2500         125/2280 | 
|---|
|  | 116 | (bb4)xeon-4c-grid50  |    207/1019        110/660          207/2460         107/2285 | 
|---|
| [3258] | 117 | (c)amd-lx            |    125/695         65/675           125/1570         65/1045 | 
|---|
|  | 118 | (cc)amd2-lx          | | 
|---|
|  | 119 |  | 
|---|
| [3259] | 120 | (e')osf-cool         |    60/635          32/631            62/640          31/630 | 
|---|
| [3258] | 121 | (f)superosf | 
|---|
|  | 122 |  | 
|---|
| [3259] | 123 | (g)G5-osx-1 GHz      |    180/1260        90/1150           165/940         81/280    (*) | 
|---|
|  | 124 | (h)G4-osx-1.25GHz    |    45/430          25/410            45/710          24/190    (*) | 
|---|
|  | 125 | (i)core-osx-1.83GHz  |    185/919         105/855           187/935         62/246    (*) | 
|---|
| [3258] | 126 | (j)xeon-osx | 
|---|
|  | 127 |  | 
|---|
|  | 128 | (p)ibm-aix-regatta   | | 
|---|
| [3266] | 129 | (q)ibm-aix-meso      |    250/1150        250/1500          250/1200        50/200     (32 bits) | 
|---|
|  | 130 | |    280/1500        250/1600          250/1100        210/1000   (64 bits -q64) | 
|---|
|  | 131 |  | 
|---|
|  | 132 | (s)sgi-magique       | | 
|---|
| [3258] | 133 | ---------------------------------------------------------------------------------------------- | 
|---|
|  | 134 |  | 
|---|
|  | 135 | X.3/  Comparison with SPEC | 
|---|
| [3266] | 136 | csh>  cpupower 0 / cpupower 2 | 
|---|
| [3187] | 137 | ---------------------------------------------------------------------- | 
|---|
|  | 138 | MFLOPS(1)      SPECfp      SPECint | 
|---|
|  | 139 | ---------------------------------------------------------------------- | 
|---|
| [3266] | 140 | (b)xeon-lx-2.8GHz         166/900        1400        1400 | 
|---|
|  | 141 | (c)amd-lx                 125/690        1600        1300 | 
|---|
| [3187] | 142 | (cc)amd2-lx               675            1600        1300 | 
|---|
|  | 143 |  | 
|---|
| [3266] | 144 | (e)osf-xp1000             32/650         500         400 | 
|---|
|  | 145 | (f)superosf               842            1100        700 | 
|---|
| [3187] | 146 |  | 
|---|
| [3266] | 147 | (i)core-osx-1.83GHz       110/850        1400        1500 | 
|---|
|  | 148 | (j)xeon-osx               2600           2900          - | 
|---|
| [3187] | 149 |  | 
|---|
| [3266] | 150 | (p)ibm-aix-regatta        130/700        1050        700 | 
|---|
| [3187] | 151 | ---------------------------------------------------------------------- | 
|---|
|  | 152 |  | 
|---|
|  | 153 |  | 
|---|
| [3158] | 154 | A/ Compiling all of SOPHYA | 
|---|
|  | 155 | ---------------------------- | 
|---|
|  | 156 | csh> time make all   (1) | 
|---|
|  | 157 | or | 
|---|
|  | 158 | csh> time make -j 2 all  (2) | 
|---|
|  | 159 | CPU time | 
|---|
|  | 160 | Performance index 100*(1000/TCPU) | 
|---|
|  | 161 | Elapsed (wall-clock) time | 
|---|
|  | 162 | TCPU / elapsed time | 
|---|
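
As a quick check of the two derived columns of the compilation table, here is a minimal Python sketch. The function names are illustrative, and the last column is taken to be CPU time divided by elapsed time, which is what the table header and the tabulated values (for example 615 s / 410 s = 150%) indicate.

```python
def perf_index(tcpu_s):
    """Performance index as defined in these notes: 100 * (1000 / TCPU)."""
    if tcpu_s <= 0:
        raise ValueError("CPU time must be positive")
    return 100.0 * (1000.0 / tcpu_s)


def cpu_over_elapsed_pct(tcpu_s, elapsed_s):
    """CPU time as a percentage of elapsed time (above 100% when several cores are busy)."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * tcpu_s / elapsed_s


if __name__ == "__main__":
    # Machine (a) with make -j 2: 615 s CPU, 410 s elapsed
    print(int(perf_index(615)))            # 162
    print(cpu_over_elapsed_pct(615, 410))  # about 150
```

An index above 100 therefore means the full build used less than 1000 s of CPU time.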
|  | 163 |  | 
|---|
|  | 164 |  | 
|---|
|  | 165 | ---------------------------------------------------------------------- | 
|---|
|  | 166 | CPU(s)  IndPerf   TElapsed , TCPU/Elapsed % | 
|---|
|  | 167 | ---------------------------------------------------------------------- | 
|---|
|  | 168 | (a)xeon-lx-2.4GHz (2)    615 s      162       410 s        150% | 
|---|
|  | 169 | with -O3 -g (2)   1300 s       77       760 s        172% | 
|---|
|  | 170 | (b)xeon-lx-2.8GHz (2)    755 s      132       540 s        140% | 
|---|
| [3241] | 171 | (c)amd-lx         (2)    336 s      297       175 s        192% | 
|---|
| [3158] | 172 |  | 
|---|
| [3241] | 173 | (d)osf-asc (1)          1920 s       52      2340 s        83%   (??) | 
|---|
|  | 174 | (e)osf-xp1000 (1)        533 s      187       660 s        80% | 
|---|
|  | 175 | (f)superosf (1)          895 s      112       910 s        98% | 
|---|
| [3158] | 176 |  | 
|---|
| [3259] | 177 | (g)G5-osx-2GHz (2)       453 s      221       250 s        182% | 
|---|
| [3251] | 178 | -tune=G5            1100 s       90 | 
|---|
|  | 179 | -g -O                740 s                380 s        195% | 
|---|
|  | 180 | (h)G4-osx-1.25GHz (1)    660 s      151       710 s        93%   [-g] | 
|---|
| [3244] | 181 | 1500 s                             94%   [-O2 -g] | 
|---|
| [3158] | 182 | (i)core-osx-1.83GHz (2)  209 s      478       116 s        180% | 
|---|
| [3159] | 183 | -O2   (1)  367 s      272       381          96% | 
|---|
| [3158] | 184 | (j)xeon-osx | 
|---|
|  | 185 |  | 
|---|
|  | 186 | (p)ibm-aix | 
|---|
|  | 187 | ---------------------------------------------------------------------- | 
|---|
|  | 188 |  | 
|---|
|  | 189 | Shared library sizes : | 
|---|
| [3251] | 190 | (a) | 
|---|
|  | 191 | (c) 33 MB | 
|---|
| [3158] | 192 | (f) = (e) = 57 MB | 
|---|
| [3251] | 193 | (g) 80 MB | 
|---|
| [3158] | 194 | (i) 83 MB | 
|---|
|  | 195 |  | 
|---|
| [3187] | 196 | B/ Raw computation (SOPHYA arrays) with / without threads | 
|---|
|  | 197 | -------------------------------------------------------- | 
|---|
| [3266] | 198 | B.1.a/   Vector computation 10 * V2 ~= DLO4 (V1) | 
|---|
|  | 199 | ~ 10 x 10 x 9. 10^6 double operations on 2 x  9 10^6 doubles | 
|---|
|  | 200 | 900 M.Ops r_8 / ~ 1500 MB | 
|---|
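
The orders of magnitude quoted above can be re-derived with a few lines of arithmetic. This is only a sketch under two assumptions that these notes do not state explicitly: the size argument 3000 passed to zthr denotes a 3000 x 3000 array (hence 9 10^6 elements), and the ~ 1500 MB figure corresponds to 10 passes over the two vectors. The split of "10 x 10" into passes and operations per element is likewise hypothetical.

```python
BYTES_PER_DOUBLE = 8


def n_elements(size):
    """Element count, assuming a size x size array (assumption, see lead-in)."""
    return size * size


def double_ops(size, passes=10, ops_per_element=10):
    """Total double operations: passes x operations per element x elements."""
    return passes * ops_per_element * n_elements(size)


def traffic_mb(size, passes=10, n_vectors=2):
    """Data volume in MB (10^6 bytes), assuming each pass touches n_vectors vectors once."""
    return passes * n_vectors * n_elements(size) * BYTES_PER_DOUBLE / 1e6


if __name__ == "__main__":
    print(double_ops(3000) / 1e6, "M.Ops")  # 900.0 M.Ops
    print(traffic_mb(3000), "MB")           # 1440.0 MB, i.e. roughly 1500 MB
```
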
| [3158] | 201 |  | 
|---|
| [3266] | 202 | (1) time cpupower 0     # compiled with -O  (/ -O -g) | 
|---|
|  | 203 | (2) time zthr arrdl 1 3000   1 thread | 
|---|
|  | 204 | (3) time zthr arrdl 2 3000   2 threads | 
|---|
|  | 205 | (4) time zthr arrdl 4 3000   4 threads | 
|---|
|  | 206 | (5) time zthr arrdl 6 3000   6 threads | 
|---|
|  | 207 | (6) time zthr arrdl 8 3000   8 threads | 
|---|
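
Each timing entry in the tables of this section packs three values as CPU/Elap/%: CPU time and elapsed time in seconds, then their ratio as a percentage. A small Python sketch for unpacking an entry and recomputing the ratio follows; the helper names are hypothetical, and the recomputed value can differ slightly from the tabulated one because the times are rounded.

```python
def parse_entry(entry):
    """Split a 'CPU/Elap/%' entry such as '5.3/2.9/180%' into three floats."""
    cpu, elapsed, pct = entry.strip().split("/")
    # Some entries carry an 's' unit suffix (e.g. '6.3s/6.5s/99%'); strip it.
    return float(cpu.rstrip("s")), float(elapsed.rstrip("s")), float(pct.rstrip("%"))


def recomputed_pct(entry):
    """Recompute 100 * CPU / elapsed from the first two fields of an entry."""
    cpu, elapsed, _ = parse_entry(entry)
    return 100.0 * cpu / elapsed


if __name__ == "__main__":
    entry = "5.3/2.9/180%"  # machine (b), 2 threads, section B.1.a
    print(parse_entry(entry))
    print(round(recomputed_pct(entry), 1))
```

A percentage close to 100 x N for N threads indicates that all N threads were running concurrently on separate cores.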
|  | 208 |  | 
|---|
|  | 209 | ----------------------------------------------------------------------------------- | 
|---|
|  | 210 | (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/% | 
|---|
|  | 211 | ----------------------------------------------------------------------------------- | 
|---|
|  | 212 | (a)xeon-lx-2.4GHz      53 | 
|---|
| [3273] | 213 | (b)xeon-lx-2.8GHz      85       2.6/2.6/100%    5.3/2.9/180%   14.3/4.86/310% | 
|---|
| [3266] | 214 | (5) 23/7.4/314% | 
|---|
| [3273] | 215 | (bb)nxeon-grid49       125      2/2/99%         4/2/186%       8/2.45/326% | 
|---|
|  | 216 | (5) 12/3.6/330%   (6) 16/4.7/340% | 
|---|
|  | 217 | (bb4)xeon-4c-grid50    110      2.2/2.2/99%     4.2/2.3/185%   8/2.5/321% | 
|---|
|  | 218 | (5) 13/3.1/470%   (6) 21/3.8/544% | 
|---|
| [3266] | 219 | (c)amd-lx              95 | 
|---|
|  | 220 |  | 
|---|
|  | 221 |  | 
|---|
|  | 222 | (e')osf-cool           32       5.7/5.8/98%     11.1/11.3/98%   22.3/22.5/98% | 
|---|
|  | 223 | (f)superosf | 
|---|
|  | 224 |  | 
|---|
|  | 225 | (g)G5-osx-2GHz         88       2.5/2.6/99%     5.9/3.38/184%    11/6.45/173%    [-O2 -g] | 
|---|
|  | 226 | (h)G4-osx-1.25GHz      25       6.6/7/95%       13.4/13.8/97%                    [-O2 -g] | 
|---|
|  | 227 | (i)core-osx-1.83GHz   107       2.1/2.1/98%     4.3/2.9/150%     8.3/30/31%      [-O2 -g] | 
|---|
|  | 228 | (j)xeon-osx | 
|---|
|  | 229 |  | 
|---|
|  | 230 | (p)ibm-aix-regatta    130 | 
|---|
| [3270] | 231 | (q)ibm-aix-meso       150       0.7/1/70%       1.2/2./60%       3.2/2/150%   [-O3] | 
|---|
|  | 232 | (5) 5.4/3/180%    (6) 6.4/3/210% | 
|---|
| [3266] | 233 |  | 
|---|
|  | 234 | (s)sgi-magique          7       78/78/99%       167/95/175%      339/96/352%     [-O -g: NON-OPT] | 
|---|
|  | 235 | 14       16.4/16.5/99%   33.8/22.4/150%   79/32/250%      [-O -g2 OPT] | 
|---|
|  | 236 | ----------------------------------------------------------------------------------- | 
|---|
|  | 237 |  | 
|---|
|  | 238 | B.1.b/   Vector computation V2 = Sin(V1) + Cos(V1) | 
|---|
|  | 239 | ~ 50 x 9. 10^6 double operations on 2 x  9 10^6 doubles, mem ~ 150 MB | 
|---|
|  | 240 | ~500 M.Ops r_8 / ~ 600 MB I/O | 
|---|
|  | 241 |  | 
|---|
|  | 242 | (1) time cpupower 0     # compiled with -O  (/ -O -g) | 
|---|
|  | 243 | (2) time zthr arrmf 1 3000   1 thread | 
|---|
|  | 244 | (3) time zthr arrmf 2 3000   2 threads | 
|---|
|  | 245 | (4) time zthr arrmf 4 3000   4 threads | 
|---|
|  | 246 | (5) time zthr arrmf 6 3000   6 threads | 
|---|
|  | 247 | (6) time zthr arrmf 8 3000   8 threads | 
|---|
|  | 248 |  | 
|---|
|  | 249 | ----------------------------------------------------------------------------------- | 
|---|
|  | 250 | (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/% | 
|---|
|  | 251 | ----------------------------------------------------------------------------------- | 
|---|
|  | 252 | (a)xeon-lx-2.4GHz      53 | 
|---|
| [3273] | 253 | (b)xeon-lx-2.8GHz      85       1.7/1.7/100%    3.5/2.1/173%     9.8/3.6/275% | 
|---|
|  | 254 | (5) 12/3.6/330%    (6) 16/4.7/340% | 
|---|
|  | 255 | (bb)nxeon-grid49       125      1.6/1.6/100%    3.2/1.7/183%     6.7/2.1/314% | 
|---|
|  | 256 | (5) 10.1/3.2/320%  (6) 14.4/4.05/330% | 
|---|
| [3266] | 257 | (c)amd-lx              95 | 
|---|
|  | 258 |  | 
|---|
|  | 259 | (e')osf-cool           32       4.2/4.3/98%     8.2/8.4/98%      16.1/16.2/98% | 
|---|
|  | 260 | (f)superosf | 
|---|
|  | 261 |  | 
|---|
|  | 262 | (g)G5-osx-2GHz         88       2.3/2.3/100%      5/3/165%        9.6/5.8/167%  [-O2 -g] | 
|---|
|  | 263 | (h)G4-osx-1.25GHz      25       4.5/4.8/95%       10.9/14.6/72%        [-O2 -g] | 
|---|
|  | 264 | (i)core-osx-1.83GHz   107       2.3/2.3/98%       4.8/3.1/158%                [-O2 -g] | 
|---|
|  | 265 | (j)xeon-osx | 
|---|
|  | 266 |  | 
|---|
|  | 267 | (p)ibm-aix-regatta    130 | 
|---|
| [3270] | 268 | (q)ibm-aix-meso       150       1./2/50%         2.8/3/86%       5.4/4/130%   [-O3] | 
|---|
|  | 269 | (5) 10/4/250%    (6) 11.2/5/220% | 
|---|
| [3266] | 270 |  | 
|---|
|  | 271 | (s)sgi-magique         7       11.5/11.7/99%     24/17/140%      51.5/18.4/280%  [-O -g NON-OPT] | 
|---|
|  | 272 | 14       6.5/6.6/99%       13.3/12/110%    34.5/17.3/200%  [-O -g3 OPT] | 
|---|
|  | 273 | ----------------------------------------------------------------------------------- | 
|---|
|  | 274 |  | 
|---|
|  | 275 |  | 
|---|
|  | 276 | B.1.c/ Corrected version of zthr.cc (after 23/05/07) | 
|---|
| [3254] | 277 | arr = (c1*a1) + (c2*a2) | 
|---|
|  | 278 | ~ 3 x 4. 10^6 int_4 operations on 3 x 4 10^6 int_4 | 
|---|
|  | 279 | 12 M.Ops int_4 / ~ 50 MB | 
|---|
|  | 280 |  | 
|---|
|  | 281 | (1) time cpupower 0     # compiled with -O  (/ -O -g) | 
|---|
|  | 282 | (2) time zthr arr 1 2000   1 thread | 
|---|
|  | 283 | (3) time zthr arr 2 2000   2 threads | 
|---|
|  | 284 | (4) time zthr arr 4 2000   4 threads | 
|---|
|  | 285 | (5) time zthr arr 6 2000   6 threads | 
|---|
|  | 286 | (6) time zthr arr 8 2000   8 threads | 
|---|
|  | 287 |  | 
|---|
|  | 288 | ----------------------------------------------------------------------------------- | 
|---|
|  | 289 | (1)MFLOPS  (2)CPU/Elap/%   (3)CPU/Elap/%   (4)CPU/Elap/% | 
|---|
|  | 290 | ----------------------------------------------------------------------------------- | 
|---|
|  | 291 | (a)xeon-lx-2.4GHz      53        0.5/1/43%      1/1.1/88%      2.8/1/262% | 
|---|
|  | 292 | (5) 4.5/1.8/246%      (6) 6.1/2.1/310% | 
|---|
|  | 293 |  | 
|---|
|  | 294 |  | 
|---|
|  | 295 | (b)xeon-lx-2.8GHz      65 | 
|---|
|  | 296 |  | 
|---|
|  | 297 | (c)amd-lx              95        0.23/1/22%     0.44/1/51%       1/1/102%     [-O -g] | 
|---|
|  | 298 | (5) 1.6/1/106%   (6) 2.2/1.2/100% | 
|---|
|  | 299 |  | 
|---|
|  | 300 |  | 
|---|
|  | 301 |  | 
|---|
|  | 302 | (e')osf-cool           32        0.43/1.2/35%   0.6/1.33/44%     1.1/1.3/82%      [-O -g] | 
|---|
|  | 303 | (5) 1.45/1.7/85%   (6) 1.83/2.16/84% | 
|---|
|  | 304 | (f)superosf | 
|---|
|  | 305 |  | 
|---|
| [3259] | 306 | (g)G5-osx-2GHz         88       1.5/1.5/100%    3.2/1.7/185%      6.6/3.5/188%    [-O -g] | 
|---|
|  | 307 | (g)G5-osx-2GHz         88       0.4/1/40%       0.9/1.0/90%       2/1.2/169%      [-tune=G5 -g] | 
|---|
| [3254] | 308 | (5) 3.3/2/164%    (6) 4.3/2.6/165% | 
|---|
|  | 309 | (h)G4-osx-1.25GHz      25       3/3/95%                                           [-O2 -g] | 
|---|
|  | 310 |  | 
|---|
|  | 311 | (i)core-osx-1.83GHz               [-O2 -g] | 
|---|
|  | 312 |  | 
|---|
|  | 313 | (j)xeon-osx | 
|---|
|  | 314 |  | 
|---|
|  | 315 |  | 
|---|
|  | 316 | (p)ibm-aix-regatta   130 | 
|---|
|  | 317 |  | 
|---|
|  | 318 | (q)ibm-aix-meso      150        0.6/1/58%       1/1/91%           1.7/1.2/132%    [-O3] | 
|---|
|  | 319 | (5) 2.4/1.2/193%   (6) 4.25/1.6/265% | 
|---|
|  | 320 |  | 
|---|
|  | 321 | ----------------------------------------------------------------------------------- | 
|---|
|  | 322 |  | 
|---|
| [3266] | 323 | B.1.x/ old version of zthr (before 23/05/07) | 
|---|
| [3254] | 324 | It performed 2 multiplications by a constant followed by a matrix product! | 
|---|
|  | 325 | arr = c1*a1*c2*a2   ( ~ 3 10^6 double op.) | 
|---|
| [3241] | 326 | (1) time cpupower 2     # compiled with -O3  (/ -O -g) | 
|---|
| [3158] | 327 | (2) time zthr arr 1 1000   1 thread | 
|---|
|  | 328 | (3) time zthr arr 2 1000   2 threads | 
|---|
|  | 329 | (4) time zthr arr 4 1000   4 threads | 
|---|
|  | 330 | (5) time zthr arr 6 1000   6 threads | 
|---|
|  | 331 | (6) time zthr arr 8 1000   8 threads | 
|---|
|  | 332 |  | 
|---|
|  | 333 | ----------------------------------------------------------------------------------- | 
|---|
|  | 334 | (1)MFLOPS  (2)CPU/Elap/% (2)IndPerf  (3)CPU/Elap/% (3)IndPerf | 
|---|
|  | 335 | ----------------------------------------------------------------------------------- | 
|---|
|  | 336 | (a)xeon-lx-2.4GHz     1167        5.15/5.2/99%              11.4/5.8/196% | 
|---|
|  | 337 | (4)36.6/9.28/394% | 
|---|
|  | 338 | -O3 -g                    4.9/5./99% | 
|---|
|  | 339 | (b)xeon-lx-2.8GHz      920        2.3/2.3/100%              6.2/3.1/198% | 
|---|
|  | 340 | (4)26/6.6/396% | 
|---|
|  | 341 | (c)amd-lx              690        3.6/3.6/99%               6.8/4/171% | 
|---|
|  | 342 | (4)13.5/7/193% | 
|---|
|  | 343 | (5)20.3/10.23/198% | 
|---|
| [3187] | 344 | (cc)amd2-lx            675        2/2/99%                   4.15/2.1/197% | 
|---|
|  | 345 | (4)8.25/4.15/198% | 
|---|
|  | 346 | (5)13.6/4.6/292% | 
|---|
|  | 347 | (6)19.8/6.5/300% | 
|---|
| [3158] | 348 |  | 
|---|
| [3241] | 349 | (d)osf-asc             420        6.3s/6.5s/99%            16.9/8.8/192% | 
|---|
|  | 350 | (4)29.9/15.7/191% | 
|---|
|  | 351 | (e)osf-xp1000          648        5.1/5.3/96.6%            11.4/11.4/99% | 
|---|
| [3158] | 352 | (4)25.2/25.5/99% | 
|---|
| [3241] | 353 | (f)superosf            842        2.87/2.88/99.6%          6.25/4.1/153% | 
|---|
| [3158] | 354 | (4)11.6/3.06/379% | 
|---|
|  | 355 |  | 
|---|
| [3251] | 356 | (h)G4-osx-1.25GHz       92        44s/48s/91%              86.7/99.8/92%  [-g] | 
|---|
| [3244] | 357 | 380        12.2/12.9/95%            24/25.3/95%    [-O2 -g] | 
|---|
| [3259] | 358 | (g)G5-osx-2GHz        1151        20s/20s/99%              40s/23s/170% | 
|---|
| [3158] | 359 | (4) 80.8/45/180% | 
|---|
| [3251] | 360 | -O -g                          4.5/4.9/91%              9.3/4.7/197% | 
|---|
|  | 361 | (4) 18.3/9.4/197% | 
|---|
| [3159] | 362 | -tune=G5                       3.35/3.8/88%             7.1/3.6/196% | 
|---|
| [3159] | 365 | (4) 14/7.5/187% | 
|---|
| [3244] | 366 | (i)core-osx-1.83GHz    855        11.5/11.5/100%           23/11.6/192%   [-g] | 
|---|
| [3158] | 367 | (4) 46/23/199% | 
|---|
| [3244] | 368 | -O2                 3.85/3.89/99%            7.7/3.9/198%   [-O2 -g] | 
|---|
| [3159] | 369 | (4) 15.4/7.77/198% | 
|---|
|  | 370 |  | 
|---|
| [3158] | 371 | (j)xeon-osx           2600        2.5/2.5/100%             5.1/2.6/199% | 
|---|
|  | 372 | (4) 11.5/3.2/362% | 
|---|
|  | 373 | (5) 17.4/4.77/365% | 
|---|
|  | 374 |  | 
|---|
| [3241] | 375 | (p)ibm-aix-regatta  1750/730      6.8/6.9/98%              13.1/6.75/195% | 
|---|
| [3158] | 376 | (4) 26.3/11.7/225% | 
|---|
| [3241] | 377 | (q)ibm-aix-meso     3600/1250     3.6/3.75/96%             7.35/3.7/197% | 
|---|
|  | 378 | (4) 12.46/4.2/298% | 
|---|
|  | 379 | (5) 219/6.7/280% | 
|---|
|  | 380 | (6) 24/4.5/530% | 
|---|
| [3158] | 381 |  | 
|---|
| [3241] | 382 |  | 
|---|
|  | 383 | (s)sgi-magique         460        60/60/99% | 
|---|
| [3158] | 384 | ----------------------------------------------------------------------------------- | 
|---|
|  | 385 |  | 
|---|
|  | 386 |  | 
|---|
|  | 387 | B.2/ Matrix multiplication mtx = mtx1 * mtx2 | 
|---|
| [3244] | 388 | ~ 2  10^9 double op. / thread | 
|---|
| [3241] | 389 | (1) time cpupower 2  (-O3 / -O -g) | 
|---|
| [3158] | 390 | (2) time zthr mtx 1 1000   1 thread | 
|---|
|  | 391 | (3) time zthr mtx 2 1000   2 threads | 
|---|
|  | 392 | (4) time zthr mtx 4 1000   4 threads | 
|---|
|  | 393 | (5) time zthr mtx 6 1000   6 threads | 
|---|
|  | 394 | (6) time zthr mtx 8 1000   8 threads | 
|---|
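
A dense n x n matrix product needs about 2 n^3 floating-point operations (n^3 multiplications and nearly as many additions), which gives the ~ 2 10^9 figure for n = 1000 if the zthr size argument is the matrix dimension (an assumption). Since the work is quoted per thread, an aggregate rate can be estimated from the elapsed time, as in this Python sketch with illustrative function names:

```python
def matmul_flops(n):
    """Approximate operation count for a dense n x n matrix product: 2 * n**3."""
    return 2 * n ** 3


def aggregate_mflops(n, n_threads, elapsed_s):
    """Total MFLOPS, assuming every thread multiplies its own pair of n x n matrices."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return n_threads * matmul_flops(n) / elapsed_s / 1e6


if __name__ == "__main__":
    print(matmul_flops(1000))  # 2000000000
    # Elapsed times for machine (bb4) in the B.2 table: 4.7 s (1 thread), 16.8 s (8 threads)
    print(aggregate_mflops(1000, 1, 4.7))
    print(aggregate_mflops(1000, 8, 16.8))
```

Comparing the aggregate rate at N threads with the single-thread rate gives a rough measure of how well the machine scales when every core is loaded.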
|  | 395 |  | 
|---|
|  | 396 | ----------------------------------------------------------------------------------- | 
|---|
|  | 397 | (1)MFLOPS  (2)CPU/Elap/% (2)IndPerf  (3)CPU/Elap/% (3)IndPerf | 
|---|
|  | 398 | ----------------------------------------------------------------------------------- | 
|---|
|  | 399 | (a)xeon-lx-2.4GHz     1167        6.5/6.5/100%               17.4/8.8/198% | 
|---|
|  | 400 | (4) 80.5/20.3/397% | 
|---|
|  | 401 | (5) 114.5/29.6/387% | 
|---|
|  | 402 | (6) 160/40.3/387% | 
|---|
|  | 403 |  | 
|---|
|  | 404 | (b)xeon-lx-2.8GHz      920        3.4/3.4/100%               12/6.1/199% | 
|---|
|  | 405 | (4) 55.8/14/400% | 
|---|
| [3270] | 406 | (5) 79.5/20.3/392% | 
|---|
|  | 407 | (6) 102/25.8/396% | 
|---|
| [3273] | 408 | (bb)nxeon-grid49       660        4.3/4.3/100%               9.3/4.7/200% | 
|---|
|  | 409 | (4) 28.8/7.3/390% | 
|---|
|  | 410 | (5) 41/10.4/393% | 
|---|
|  | 411 | (6) 57.5/14.45/397% | 
|---|
|  | 412 | (bb4)xeon-4c-grid50    660        4.7/4.7/100%               10.7/5.15/199% | 
|---|
|  | 413 | (4) 27.8/7.1/391% | 
|---|
|  | 414 | (5) 57.4/10.6/540% | 
|---|
|  | 415 | (6) 129/16.8/776% | 
|---|
| [3158] | 416 | (c)amd-lx              690        6.98/6.98/100%             14.1/8.15/173% | 
|---|
|  | 417 | (4) 27.7/14.23/194% | 
|---|
|  | 418 | (5) 41.4/21.07/196% | 
|---|
|  | 419 | (6) 55.4/27.9/198.7% | 
|---|
| [3187] | 420 | (cc)amd2-lx            675        4.1/4.1/100%               9.55/4.8/198% | 
|---|
|  | 421 | (4) 20/10.27/195% | 
|---|
|  | 422 | (5) 32.8/11.16/294% | 
|---|
|  | 423 | (6) 42.75/13.8/309% | 
|---|
| [3158] | 424 |  | 
|---|
| [3187] | 425 |  | 
|---|
| [3241] | 426 | (d)osf-asc             420        13.5s/13.7s/98%            32/16.5/194% | 
|---|
|  | 427 | (4) 67.5/34.4/196% | 
|---|
|  | 428 | (e)osf-xp1000          648        13/14.1/92%                27.1/27.4/99% | 
|---|
| [3158] | 429 | (4) 54/54.7/99.6% | 
|---|
|  | 430 | (5) 80.6/81/99.6% | 
|---|
|  | 431 | (6) 107.8/108.3/99.5% | 
|---|
| [3270] | 432 | (e')osf-cool                      13/13.22/98%               26/26.1/99% | 
|---|
|  | 433 | (4) 51.8/51.9/99% | 
|---|
| [3241] | 434 | (f)superosf            842        6.1/7.24/84%               12.35/6.29/196% | 
|---|
| [3158] | 435 | (4) 24.3/6.31/385% | 
|---|
|  | 436 | (5) 36.5/10.9/335% | 
|---|
|  | 437 | (6) 50.1/18.15/276% | 
|---|
|  | 438 |  | 
|---|
| [3259] | 439 | (g)G5-osx-2GHz        1151        23/23.7/97%                46.5/27.5/170% | 
|---|
| [3158] | 440 | (4) 93.4/49.4/189% | 
|---|
| [3251] | 441 | -O -g                           6.2/6.2/100%                14.2/7.2/197% | 
|---|
|  | 442 | (4) 28.3/14.36/197% | 
|---|
| [3159] | 443 | -tune=G5                        5.7/5.8/98%                13.3/6.8/197% | 
|---|
|  | 444 | (4) 26.8/13.56/197% | 
|---|
|  | 445 | (6) 53.8/27.25/197% | 
|---|
| [3251] | 446 | (h)G4-osx-1.25GHz      333        23.5/24.5/96%                              [-O2] | 
|---|
| [3158] | 447 | (i)core-osx-1.83GHz    855        12.6/12.7/100%             25.8/13.4/194% | 
|---|
|  | 448 | (4) 51.6/26/199% | 
|---|
| [3159] | 449 | -O2                   4.25/4.5/94%               10.6/5.36/198% | 
|---|
|  | 450 | (4) 20.87/10.68/198% | 
|---|
|  | 451 | -O2 2 parallel jobs     2 x 5/5.4/92% | 
|---|
| [3158] | 452 | (j)xeon-osx           2600        2.8/2.8/99%                9.3/4.66/199% | 
|---|
|  | 453 | (4) 31.4/8.6/364% | 
|---|
|  | 454 | (5) 47.1/12.96/364% | 
|---|
|  | 455 | (6) 62.8/17.38/362% | 
|---|
|  | 456 |  | 
|---|
| [3241] | 457 | (p)ibm-aix-regatta  1750/730      9.5/9.7/98%                18.3/16.0/114% | 
|---|
| [3158] | 458 | (4) 38.3/24.7/155% | 
|---|
| [3241] | 459 | (q)ibm-aix-meso     3600/1250     2.3/2.3/99%                5.1/2.64/194%   (compiled with -O3) | 
|---|
|  | 460 | (4) 11.4/4.16/272% | 
|---|
|  | 461 | (5) 20.2/5.85/344% | 
|---|
|  | 462 | (6) 29.9/6.74/442% | 
|---|
| [3179] | 463 |  | 
|---|
| [3266] | 464 | (s)sgi-magique         400        44/44.3/99%                96.5/55/176% | 
|---|
| [3179] | 465 |  | 
|---|
| [3158] | 466 | ----------------------------------------------------------------------------------- | 
|---|
|  | 467 |  | 
|---|
| [3254] | 468 |  | 
|---|
|  | 469 | B.4/ Operations on double arrays - measured with spar | 
|---|
|  | 470 | csh> time spar 2 1 2000 2000 | 
|---|
|  | 471 | (1) cpupower 2  MFLOPS | 
|---|
|  | 472 | (2) MFLOPS (double) spar | 
|---|
|  | 473 | (3) time spar 2 5 1000 2000 CPU/Elap/% | 
|---|
|  | 474 | ----------------------------------------------------------------------------------- | 
|---|
|  | 475 | (1)MFLOPS      (2)CPU / %         (3)CPU/Elap/% | 
|---|
|  | 476 | ----------------------------------------------------------------------------------- | 
|---|
|  | 477 | (a)xeon-lx-2.4GHz      53       ~ 20-35 MFLOPS , 90%     20/20.2/99%       [-g -O] | 
|---|
|  | 478 |  | 
|---|
|  | 479 |  | 
|---|
|  | 480 | (b)xeon-lx-2.8GHz      65 | 
|---|
|  | 481 |  | 
|---|
|  | 482 | (c)amd-lx              95       ~ 20-40 MFLOPS , 99%     17.2/17.2/100%    [-g -O] | 
|---|
|  | 483 |  | 
|---|
|  | 484 |  | 
|---|
|  | 485 | (d)osf-asc | 
|---|
|  | 486 |  | 
|---|
|  | 487 | (e)osf-xp1000          32       ~ 15-25 MFLOPS , 90%     37.6/41.2/91%      [-g -O] | 
|---|
|  | 488 | (f)superosf | 
|---|
|  | 489 |  | 
|---|
| [3259] | 490 | (g)G5-osx-2GHz         88       ~ 10-25 MFLOPS , 99%     45/45/100%         [-g -O] or [-g -O2] | 
|---|
| [3254] | 491 | (h)G4-osx-1.25GHz      25       ~ 8-16  MFLOPS , 92%     45.5/52/90%        [-g -O2] | 
|---|
|  | 492 |  | 
|---|
|  | 493 | (i)core-osx-1.83GHz | 
|---|
|  | 494 |  | 
|---|
|  | 495 | (j)xeon-osx | 
|---|
|  | 496 |  | 
|---|
|  | 497 |  | 
|---|
|  | 498 | (p)ibm-aix-regatta   130 | 
|---|
|  | 499 |  | 
|---|
|  | 500 | (q)ibm-aix-meso      150        ~ 80-100 MFLOPS , 90%   5./23/22%     [-O3] | 
|---|
|  | 501 |  | 
|---|
|  | 502 |  | 
|---|
|  | 503 |  | 
|---|
|  | 504 | (s)sgi-magique         460 | 
|---|
|  | 505 | ----------------------------------------------------------------------------------- | 
|---|
|  | 506 |  | 
|---|
| [3258] | 507 | B.5/  Computation/comparison with JET/tjet | 
|---|
|  | 508 | csh> time tjet 10 2000 2000   OR tjet 10 2000 1000 | 
|---|
|  | 509 | (1) TCPU EltAccess C/pointers | 
|---|
|  | 510 | (2) TCPU m1*c1+m2*c2+m3*c3 C/pointers | 
|---|
|  | 511 | (3) TCPU EltAccess  SOPHYA | 
|---|
| [3259] | 512 | (4) TCPU m1*c1+m2*c2+m3*c3 SOPHYA  / Methods (MultCst, AddArr ...) | 
|---|
| [3258] | 513 | (5) TCPU m1*c1+m2*c2+m3*c3 SOPHYA JET | 
|---|
| [3254] | 514 |  | 
|---|
| [3258] | 515 | ----------------------------------------------------------------------------------- | 
|---|
|  | 516 | (1)        (2)         (3)         (4)           (5) | 
|---|
|  | 517 | ----------------------------------------------------------------------------------- | 
|---|
|  | 518 | (b)xeon-lx-2.8GHz-icc       0.87       0.63        1.55      2.7/1.6         0.57 | 
|---|
|  | 519 | (c)amd-lx                   0.94       0.79        1.85      3.4/2.1         0.76 | 
|---|
|  | 520 |  | 
|---|
| [3259] | 521 | (e')osf-cool                2.85       2.45        3.1       6.5/5.5         4.1 | 
|---|
|  | 522 |  | 
|---|
|  | 523 | (g)G5-osx-2GHz              1.5        0.61        2.1       4.1/1.6         0.6   (-g -O2) | 
|---|
|  | 524 | -tune=G5 -fast          1.1        0.62        1.3       4.1/1.6         0.58 | 
|---|
|  | 525 | (h)G4-osx-1.25GHz           3.86       2.2         5         9.4/6.2         3     (-g -O2) | 
|---|
| [3258] | 526 | (i)core-osx-1.83GHz         1.1        0.49        1.6       2.8/1.7         0.68 | 
|---|
|  | 527 |  | 
|---|
|  | 528 | (q)ibm-aix-meso             0.43       0.27        0.52      1.12/0.75       0.35 | 
|---|
| [3266] | 529 |  | 
|---|
|  | 530 | (s)sgi-magique              2.45       1.9         5.65      7.45/6.3        2.8  (-O -g3) | 
|---|
| [3258] | 531 | ----------------------------------------------------------------------------------- | 
|---|
|  | 532 |  | 
|---|
|  | 533 |  | 
|---|
| [3187] | 534 | C/ FFT computation (FFTW, FFTPack) | 
|---|
|  | 535 | ------------------------------- | 
|---|
| [3158] | 536 |  | 
|---|
|  | 537 | (1) time cpupower 2 | 
|---|
|  | 538 | (2) time tfft 2000000 W d 0 0 (with FFTW) | 
|---|
|  | 539 | (3) time tfft 2000000 P d 0 0 (with FFTPack_Sophya) | 
|---|
|  | 540 |  | 
|---|
|  | 541 | IndPerf=1000/TCPU | 
|---|
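As a reading aid for the table that follows, here is a small Python sketch of how an IndPerf value can be derived from a `CPU/Elap/%` entry. It assumes that TCPU is the CPU time in seconds, i.e. the first field of the entry; the function names are illustrative only.

```python
def parse_timing(entry):
    """Split a 'CPU/Elap/%' entry such as '5.5/5.6/97%' into three floats."""
    cpu, elapsed, percent = entry.split("/")
    return float(cpu), float(elapsed), float(percent.rstrip("%"))


def ind_perf(tcpu):
    """Performance index as defined above: IndPerf = 1000 / TCPU (TCPU in seconds, assumed)."""
    return 1000.0 / tcpu


cpu, elapsed, percent = parse_timing("7.4/7.4/100%")
print(round(ind_perf(cpu)))
```

With this definition a CPU time of 7.4 s gives an index of about 135 and 5.5 s gives about 182, consistent with the rounded values 135 and 180 reported for (a)xeon-lx-2.4GHz.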
|  | 542 |  | 
|---|
|  | 543 |  | 
|---|
|  | 544 | ----------------------------------------------------------------------------------- | 
|---|
|  | 545 | (1)MFLOPS     (2)CPU/Elap/%   (2)IndPerf   (3)CPU/Elap/%  (3)IndPerf | 
|---|
|  | 546 | ----------------------------------------------------------------------------------- | 
|---|
|  | 547 | (a)xeon-lx-2.4GHz     1167        5.5/5.6/97%        180         7.4/7.4/100%     135 | 
|---|
|  | 548 | -O3 -g                                                    6.8/7.1/96%      147 | 
|---|
| [3270] | 549 |  | 
|---|
|  | 550 | (b)xeon-lx-2.8GHz      920        3.6/3.7/98%                    3.7/3.8/99% | 
|---|
|  | 551 | ~2x    6.5/8.8/73% | 
|---|
|  | 552 | ~4x    8/14/55%   (15 sec elapsed) | 
|---|
| [3273] | 553 | (bb)nxeon-grid49       660                                       3.3/3.3/99% | 
|---|
|  | 554 | ~2x    3.3/3.4/100%  (4 sec elapsed) | 
|---|
|  | 555 | ~4x    4.6/4.6/99%   (5 sec elapsed) | 
|---|
|  | 556 | ~8x    4.5/4.5/50%   (9 sec elapsed) | 
|---|
|  | 557 | (bb4)xeon-4c-grid50    660                                       3.6/3.6/99% | 
|---|
|  | 558 | ~2x    3.6/3.6/100%  (4 sec elapsed) | 
|---|
|  | 559 | ~4x    4.4/4.4/99%   (5 sec elapsed) | 
|---|
|  | 560 | ~8x    7.1/7.1/99%   (7  sec elapsed) | 
|---|
|  | 561 | ~16x   7.1/14.4/50%   (15 sec elapsed) | 
|---|
| [3270] | 562 |  | 
|---|
| [3241] | 563 | (c)amd-lx              690        3.2/3.75/86%                   4.2/4.2/100%     238 | 
|---|
| [3159] | 564 | ~2x  4.7/4.7/99% | 
|---|
| [3187] | 565 | (cc)amd2-lx            675        2.8/2.8/99%                    3.56/3.58/99% | 
|---|
| [3158] | 566 |  | 
|---|
| [3241] | 567 | (d)osf-asc             420        13.3/13.9/95.7%     75        12.2/17.6/70%     82 | 
|---|
|  | 568 | (e)osf-xp1000          648        9.9/10.2/97%       101        9.3/9.46/98.5%    107 | 
|---|
| [3270] | 569 | (e')cool                          9.2/10.1/91%                  9.1/9.3/98% | 
|---|
| [3241] | 570 | (f)superosf            842        6./6.22/96.7%      166        5.1/5.18/97.4%    196 | 
|---|
| [3158] | 571 |  | 
|---|
| [3259] | 572 | (g)G5-osx-2GHz        1151        8.5/8.7/97.5%      120        11.5/11.6/99%     87 | 
|---|
| [3251] | 573 | -O -g                         5.1/5.2/98%        190        5.4/5.5/99%       180 | 
|---|
|  | 574 | -tune=G5                      5/5.1/98%          200        5.5/5.6/97%       180 | 
|---|
|  | 575 | (h)G4-osx-1.25GHz       92        15.2/15.9/96%       66        23.8/34.1/70%     42    [-g] | 
|---|
| [3244] | 576 | 380        8.8/10.3/86%                  14.7/20.9/65%           [-O2 -g] | 
|---|
| [3158] | 577 | (i)core-osx-1.83GHz    855        4.6/4.75/97%       217        7/7.08/99%        142 | 
|---|
| [3159] | 578 | -O2                      3.2/3.2/99%        312        3.6/3.6/99%       277 | 
|---|
|  | 579 | -O2 2 jobs //              2 x 3.9/4.3/90%                 2x  4.7/5/91%       200 | 
|---|
| [3158] | 580 | (j)xeon-osx           2600                                      2.6/2.6/98%       384 | 
|---|
| [3159] | 581 | 2 jobs //                                           ~2x  3.6/4.1/87%       250 | 
|---|
|  | 582 | 4 jobs // | 
|---|
| [3158] | 583 |  | 
|---|
| [3241] | 584 | (p)ibm-aix-regatta 1750/730       6.25/18.9/33%      160        5.25/15.7/33%     190 | 
|---|
|  | 585 | (q)ibm-aix-meso    3600/1250      3.95/4.3/91%       250        3.82/4./94%       260 | 
|---|
|  | 586 | 2 jobs //                                           ~2x  3.88/4.2/92%      250 | 
|---|
| [3270] | 587 | 2.8/4.76/59%                  2.1/4.45/47%     [-O3] | 
|---|
|  | 588 | ~2x   2.5/4.7/54% | 
|---|
|  | 589 | ~4x   2.5/4.8/55% | 
|---|
| [3158] | 590 |  | 
|---|
| [3241] | 591 |  | 
|---|
|  | 592 | (s)sgi-magique         460        22/22/98%           42         24.5/25/99%       40 | 
|---|
| [3158] | 593 | ----------------------------------------------------------------------------------- | 
|---|
| [3187] | 594 |  | 
|---|
|  | 595 |  | 
|---|
| [3241] | 596 | D/ Matrix inversion with LAPACK | 
|---|
|  | 597 | ------------------------------- | 
|---|
| [3187] | 598 |  | 
|---|
| [3241] | 599 | lpk inverse 1000,1000 0 | 
|---|
|  | 600 | ---> computation time for inversion with LAPACK | 
|---|
|  | 601 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 602 | CPU/Elap/%       (1) | 
|---|
|  | 603 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 604 | (a)xeon-lx-2.4GHz          5.6/~100% | 
|---|
| [3270] | 605 | (b)xeon-lx-2.8GHz          5.34/~90% | 
|---|
| [3241] | 606 | (c)amd-lx                  5.5/5.5/99% | 
|---|
| [3187] | 607 |  | 
|---|
| [3241] | 608 | (d)osf-asc | 
|---|
|  | 609 | (e')cool                   2.8/2.9/95% | 
|---|
|  | 610 | (f)superosf | 
|---|
|  | 611 |  | 
|---|
| [3244] | 612 | (h)G4-osx-1.25GHz          2.3/~100%   [-O2 -g] | 
|---|
| [3259] | 613 | (g)G5-osx-2GHz             0.8/~100% | 
|---|
| [3251] | 614 | -O -g                  0.86/~100% | 
|---|
| [3245] | 615 | (i)core-osx-1.83GHz        1.93/~100% | 
|---|
| [3241] | 616 | -O2 | 
|---|
|  | 617 | (p)ibm-aix-regatta | 
|---|
|  | 618 | (q)ibm-aix-meso            0.55/~100% | 
|---|
|  | 619 |  | 
|---|
| [3266] | 620 | (s)sgi-magique             5.3/~90% | 
|---|
| [3241] | 621 | -------------------------------------------------------------------------------------------- | 
|---|
|  | 622 |  | 
|---|
|  | 623 |  | 
|---|
|  | 624 | K/ Efficiency of lock (mutex) handling with threads and arrays | 
|---|
| [3187] | 625 | ---------------------------------------------------------------------- | 
|---|
| [3241] | 626 | (32 threads operating on 2000 vectors, ~ 64000 lock/unlock/wait/broadcast) | 
|---|
|  | 627 |  | 
|---|
|  | 628 | (1) time zthr syncp 32 2000 4 | 
|---|
|  | 629 | (2) time zthr sync 32 2000 4 | 
|---|
|  | 630 | (3) time zthr syncp 4 15000 130 | 
|---|
|  | 631 | (4) time zthr sync  4 15000 130 | 
|---|
|  | 632 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 633 | CPU/Elap/%       (1)             (2)              (3)              (4) | 
|---|
|  | 634 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 635 | (a)xeon-lx-2.4GHz         23.5/14/168%    4.3/1.2/365%     7.9/5.5/142%      4/2.15/190% | 
|---|
|  | 636 | Before ThSafeOp     17/178% | 
|---|
| [3187] | 637 | with -O3 -g | 
|---|
|  | 638 | (b)xeon-lx-2.8GHz (2) | 
|---|
| [3273] | 639 | (bb4)xeon-4c-grid50       1/1/96%         1.6/1.2/133%     4.9/4.05/120%     6/5/122% | 
|---|
| [3241] | 640 | (c)amd-lx                 0.6/1/63%        0.6/1/60%       3.5/3.5/102%      2.6/2.7/98% | 
|---|
| [3187] | 641 |  | 
|---|
| [3241] | 642 | (d)osf-asc               4.5/3.4/132%     3.35/2/170%     15.8/10.5/150%     13/8/163% | 
|---|
|  | 643 | 5.4/100%(NoThSafe) | 
|---|
|  | 644 | (e')cool                 1.3/1.37/95%     1.35/1.5/89%     5.3/5.3/99%       5.2/5.2/99% | 
|---|
| [3187] | 645 | (f)superosf (1) | 
|---|
| [3251] | 646 |  | 
|---|
|  | 647 |  | 
|---|
| [3259] | 648 | (g)G5-osx-2GHz (2)     2.6/130% (NoThSafe) | 
|---|
| [3251] | 649 | -O -g                2.6/1.7/150%      6.8/3.7/187%    4/2.75/142%       4.7/3/155% | 
|---|
|  | 650 | (h)G4-osx-1.25GHz (1)    40.5/42.6/95%    42.2/43.5/97%      [-g] | 
|---|
| [3244] | 651 | 3.9/4.3/89%      3.5/4/89%       3.8/4/95%         4.3/4.6/93% | 
|---|
| [3187] | 652 |  | 
|---|
| [3245] | 653 | (i)core-osx-1.83GHz      7.7/7.1/108%     7.8/6.7/116%    30.2/29.6/102%    30.3/29.2/104%   [-g] | 
|---|
|  | 654 | -O2        2.7/1.8/152%     6/3.16/190%     3.4/2.4/142%      3.2/2.5/164%     [-O2 -g] | 
|---|
| [3241] | 655 | (j)xeon-osx | 
|---|
|  | 656 | Before ThSafeOp     2.55/143% | 
|---|
| [3187] | 657 |  | 
|---|
| [3241] | 658 | (p)ibm-aix-regatta        4.7/111% | 
|---|
|  | 659 | (q)ibm-aix-meso           7.5/2.8/300%     17/3.8/450%    8.2/3.05/270%      4.85/2.43/200% | 
|---|
|  | 660 |  | 
|---|
|  | 661 | -------------------------------------------------------------------------------------------- | 
|---|
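The workload of section K (many threads taking a lock per vector) can be illustrated with a minimal Python sketch. This is not the `zthr` program: it assumes each thread locks every vector exactly once, it represents each vector by a single counter, and it leaves out the wait/broadcast part of the test.

```python
import threading


def run_lock_test(n_threads, n_vectors):
    """Have n_threads each lock and update every one of n_vectors once.

    Returns the total number of lock acquisitions and the final vector values.
    """
    vectors = [0] * n_vectors                      # stand-in for the real vectors
    locks = [threading.Lock() for _ in range(n_vectors)]
    counts = [0] * n_threads                       # one slot per thread, no sharing

    def worker(tid):
        for i in range(n_vectors):
            with locks[i]:                         # lock / unlock around the update
                vectors[i] += 1
                counts[tid] += 1

    threads = [threading.Thread(target=worker, args=(t,)) for t in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(counts), vectors
```

Under these assumptions, `run_lock_test(32, 2000)` performs 32 x 2000 = 64000 lock/unlock pairs, which matches the order of magnitude quoted in the section header.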
|  | 662 |  | 
|---|
|  | 663 |  | 
|---|
|  | 664 |  | 
|---|
|  | 665 | L/ I/O and PPF | 
|---|
|  | 666 | ----------------- | 
|---|
|  | 667 | Writing/reading n=10^7 rows of int + 6 double, total ~ 500 MB | 
|---|
|  | 668 | (1) time tstdtable w xx.ppf swap 10000000 1024 0 | 
|---|
|  | 669 | (2) time tstdtable r xx.ppf swap 10000000 1024 0 | 
|---|
|  | 670 | (3) time tstdtable w xx.ppf swap 50000000 1024 0 | 
|---|
|  | 671 | (4) time tstdtable r xx.ppf swap 50000000 1024 0 | 
|---|
|  | 672 |  | 
|---|
|  | 673 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 674 | CPU/Elap/%       (1)              (2)              (3)                (4) | 
|---|
|  | 675 | ------------------------------------------------------------------------------------------- | 
|---|
|  | 676 | (a)xeon-lx-2.4GHz          17/26/63%       5.5/5.6/94% | 
|---|
| [3270] | 677 |  | 
|---|
|  | 678 | (b)xeon-lx-2.8GHz          7/18.5/40%      2.7/2.8/100%                                   7000000 | 
|---|
|  | 679 |  | 
|---|
| [3241] | 680 | (c)amd-lx                  5.9/6./97%      3.4/3.4/100%      30/32/93%      24/165/13% ? | 
|---|
|  | 681 |  | 
|---|
|  | 682 | (d)osf-asc | 
|---|
|  | 683 | (e')cool                   15/30/50%       13/13/99% | 
|---|
|  | 684 | (f)superosf | 
|---|
|  | 685 |  | 
|---|
| [3259] | 686 | (g)G5-osx-2GHz             14/14.2/98%     6/6.14/99% | 
|---|
| [3251] | 687 |  | 
|---|
|  | 688 | (h)G4-osx-1.25GHz          26/29.7/87%     15.7/38.6/41%               [-O2 -g] | 
|---|
| [3245] | 689 | (i)core-osx-1.83GHz        37/37.8/98%     22/48.4/45%                                 [-g] | 
|---|
|  | 690 | -O2          10.5/17.4/55%   11.2/41.2/27%                               [-O2 -g] | 
|---|
| [3241] | 691 | (p)ibm-aix-regatta | 
|---|
|  | 692 | (q)ibm-aix-meso           5.5/16.8/38%    5.7/13.2/43%      32.7/85/39%    29/60/49% | 
|---|
| [3270] | 693 | 6/11/55%        4/9/44% | 
|---|
|  | 694 | 6.3/10.6/60%    4/6/64% | 
|---|
|  | 695 | 2 reads //                      ~2x  4/6/60% Elapsed 7 sec | 
|---|
|  | 696 | 1 write + 2 read //    6.3/10.4/60%  ~2 3.8/6.4/59%  Elapsed < 11 sec | 
|---|
|  | 697 | 1 write + 3 read //                                 19 sec | 
|---|
|  | 698 | 4 read //                                           6 sec | 
|---|
| [3241] | 699 | -------------------------------------------------------------------------------------------- | 
|---|
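The "~ 500 MB" figure quoted for section L can be checked with a short Python sketch. It assumes a 4-byte int and 8-byte doubles packed without padding, and it ignores any PPF header or tag overhead, so it is an estimate of the payload only.

```python
import struct

# One row = 1 int + 6 doubles; "<" requests standard sizes and no padding (an assumption).
ROW_FORMAT = "<i6d"


def table_size_bytes(n_rows, row_format=ROW_FORMAT):
    """Estimated payload size in bytes for n_rows rows of the given layout."""
    return n_rows * struct.calcsize(row_format)


print(table_size_bytes(10**7) / 1e6, "MB")
```

Each row takes 52 bytes under these assumptions, so 10^7 rows come to 520 MB, and the 5x10^7-row runs of tests (3) and (4) to about 2.6 GB.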
| [3258] | 700 |  | 
|---|
|  | 701 |  | 
|---|
|  | 702 |  | 
|---|