Changeset 3241 in Sophya
- Timestamp: May 7, 2007, 1:54:01 PM
- File: 1 edited
trunk/SophyaLib/Manual/perfmachine.txt
r3187 r3241
    8     8    (b) ccali : dual-processor dual-core Xeon@2.8 GHz Linux (xeon-lx-2.8GHz) , icc 8.0 or 9.0
    9     9    (c) sgsda: AMD dual-processor AMD opteron 248 @ 2.2 GHz (amd-lx-
   10        - (cc) grid-saclay A:MD opteron 275 dual-processor dual-core @ 2.2 GHz (amd275-lx)
         10  + (cc) grid-saclay: AMD opteron 275 dual-processor dual-core @ 2.2 GHz (amd275-lx)
   11    11
   12    12    (d) asc: dual-processor alpha (@ ~1 GHz) server DS20 OSF (osf1) , cxx 6.5 (osf-asc)
   13    13        (new asc 420 MFLOPS, less powerful than the old asc 800 MFLOPS)
   14    14    (e) xp1000-dapnia: alpha xp1000 @ ~ 600 MHz ? OSF1 , cxx ? (osf-xp1000)
         15  + (e') cool: alpha xp1000 @ ~ 667 MHz ? OSF1 5.1 , cxx 6.3
   15    16    (f) superosf-dapnia: multi-proc alphaServer ES80 6 procs EV7 @ 1 GHz (super-osf)
   16    17
    …     …
   21    22
   22    23    (p) IBM-AIX regatta , xlC , IBM eServer pSeries 655 , 8 proc power4 @ 1.1 GHz
   23    24    (q) IBM-AIX meso , AIX 5.3, xlC V8 , IBM Power5 , 8 proc dual-core P575 @ 1.9 GHz
         25  +
         26  + (s) SGI-IRIX64 magique, CC
   24    27
   25    28    NOTES :
    …     …
   31    34
   32    35    SPECint2000 (3) / SPECfp2000 (2) data (http//www.spec.org)
   33        - (1) MFLOPS -> cpupower 2
         36  + (1) MFLOPS -> cpupower 2 (x/y : -O -g / -O3)
   34    37    ----------------------------------------------------------------------
   35    38                          MFLOPS(1)  SPECfp  SPECint
    …     …
   40    43
   41    44    (d)osf-xp1000         648        500     400
   42        - (e)superosf           842        1100    700
         45  + (f)superosf           842        1100    700
   43    46
   44    47    (i)core-osx-1.83GHz   855        1400    1500
   45    48    (j)xeon-osx           2600       2900    -
   46    49
   47        - (p)ibm-aix            700        1050    700
         50  + (p)ibm-aix-regatta    730/1750   1050    700
         51  + (p)ibm-aix-meso       1250/3600
   48    52    ----------------------------------------------------------------------
   49    53
    …     …
   66    70        with -O3 -g       (2)  1300 s  77  760 s  172%
   67    71    (b)xeon-lx-2.8GHz     (2)  755 s  132  540 s  140%
   68        - (c)amd-lx
   69        -
   70        - (d)osf-xp1000         (1)  533 s  187  660 s  80%
   71        - (e)superosf           (1)  895 s  112  910 s  98%
   72        - (f)osf-asc            (1)  1920 s  52  2340 s  83% (??)
         72  + (c)amd-lx             (2)  336 s  297  175 s  192%
         73  +
         74  + (d)osf-asc            (1)  1920 s  52  2340 s  83% (??)
         75  + (e)osf-xp1000         (1)  533 s  187  660 s  80%
         76  + (f)superosf           (1)  895 s  112  910 s  98%
   73    77
   74    78    (f)G4-osx-1.25GHz     (1)  660 s  151  710 s  93%
    …     …
   92    96
   93    97    B.1/ arr = c1*a1+c2*a2
   94        - (1) time cpupower 2 # compiled with -O3
         98  + (1) time cpupower 2 # compiled with -O3 (/ -O -g)
   95    99    (2) time zthr arr 1 1000 1 thread
   96   100    (3) time zthr arr 2 1000 2 thread
    …     …
  115   119        (6)19.8/6.5/300%
  116   120
  117        - (d)osf-xp1000         648  5.1/5.3/96.6%  11.4/11.4/99%
        121  + (d)osf-asc            420  6.3s/6.5s/99%  16.9/8.8/192%
        122  +     (4)29.9/15.7/191%
        123  + (e)osf-xp1000         648  5.1/5.3/96.6%  11.4/11.4/99%
  118   124        (4)25.2/25.5/99%
  119        - (e)superosf           842  2.87/2.88/99.6%  6.25/4.1/153%
        125  + (f)superosf           842  2.87/2.88/99.6%  6.25/4.1/153%
  120   126        (4)11.6/3.06/379%
  121        - (f)osf-asc            420  6.3s/6.5s/99%  16.9/8.8/192%
  122        -     (4)29.9/15.7/191%
  123   127
  124   128    (f)G4-osx-1.25GHz     333  44s/48s/91%  86.7/99.8/92%
    …     …
  136   140        (5) 17.4/4.77/365%
  137   141
  138        - (p)ibm-aix-regatta    700  6.8/6.9/98%  13.1/6.75/195%
        142  + (p)ibm-aix-regatta    1750/730  6.8/6.9/98%  13.1/6.75/195%
  139   143        (4) 26.3/11.7/225%
  140        -
  141        - (q)sgi-magique        460  60/60/99%
        144  + (q)ibm-aix-meso       3600/1250  3.6/3.75/96%  7.35/3.7/197%
        145  +     (4) 12.46/4.2/298%
        146  +     (5) 219/6.7/280%
        147  +     (6) 24/4.5/530%
        148  +
        149  +
        150  + (s)sgi-magique        460  60/60/99%
  142   151    -----------------------------------------------------------------------------------
  143   152
    …     …
  145   154    B.2/ Matrix multiplication mtx = mtx1 * mtx2
  146   155
  147        - (1) time cpupower 2
        156  + (1) time cpupower 2 (-O3 / -O -g)
  148   157    (2) time zthr mtx 1 1000 1 thread
  149   158    (3) time zthr mtx 2 1000 2 thread
    …     …
  172   181
  173   182
  174        - (d)osf-xp1000         648  13/14.1/92%  27.1/27.4/99%
        183  + (d)osf-asc            420  13.5s/13.7s/98%  32/16.5/194%
        184  +     (4) 67.5/34.4/196%
        185  + (e)osf-xp1000         648  13/14.1/92%  27.1/27.4/99%
  175   186        (4) 54/54.7/99.6%
  176   187        (5) 80.6/81/99.6%
  177   188        (6) 107.8/108.3/99.5%
  178        - (e)superosf           842  6.1/7.24/84%  12.35/6.29/196%
        189  + (f)superosf           842  6.1/7.24/84%  12.35/6.29/196%
  179   190        (4) 24.3/6.31/385%
  180   191        (5) 36.5/10.9/335%
  181   192        (6) 50.1/18.15/276%
  182        - (f)osf-asc            420  13.5s/13.7s/98%  32/16.5/194%
  183        -     (4) 67.5/34.4/196%
  184   193
  185   194    (f)G4-osx-1.25GHz     333
    …     …
  199   208        (6) 62.8/17.38/362%
  200   209
  201        - (p)ibm-aix-regatta    700  9.5/9.7/98%  18.3/16.0/114%
        210  + (p)ibm-aix-regatta    1750/730  9.5/9.7/98%  18.3/16.0/114%
  202   211        (4) 38.3/24.7/155%
  203        -
  204        - (q)sgi-magique        460  49/49/99%  101/56/181%
        212  + (p)ibm-aix-meso       3600/1250  2.3/2.3/99%  5.1/2.64/194% (compiled with -O3)
        213  +     (4) 11.4/4.16/272%
        214  +     (5) 20.2/5.85/344%
        215  +     (6) 29.9/6.74/442%
        216  +
        217  + (s)sgi-magique        460  49/49/99%  101/56/181%
  205   218
  206   219    -----------------------------------------------------------------------------------
    …     …
  222   235        -O3 -g            6.8/7.1/96%  147
  223   236    (b)xeon-lx-2.8GHz     920
  224        - (c)amd-lx             690
        237  + (c)amd-lx             690  3.2/3.75/86%  4.2/4.2/100%  238
  225   238        ~2x               4.7/4.7/99%
  226   239    (cc)amd2-lx           675  2.8/2.8/99%  3.56/3.58/99%
  227   240
  228        - (d)osf-xp1000         648  9.9/10.2/97%  101  9.3/9.46/98.5%  107
  229        - (e)superosf           842  6./6.22/96.7%  166  5.1/5.18/97.4%  196
  230        - (f)osf-asc            420  13.3/13.9/95.7%  75  12.2/17.6/70%  82
        241  + (d)osf-asc            420  13.3/13.9/95.7%  75  12.2/17.6/70%  82
        242  + (e)osf-xp1000         648  9.9/10.2/97%  101  9.3/9.46/98.5%  107
        243  + (e')cool              9.1/9.3/98%
        244  + (f)superosf           842  6./6.22/96.7%  166  5.1/5.18/97.4%  196
  231   245
  232   246    (f)G4-osx-1.25GHz     333  15.2/15.9/96%  66  23.8/34.1/70%  42
    …     …
  240   254        4 jobs //
  241   255
  242        - (p)ibm-aix-regatta    700  6.25/18.9/33%  160  5.25/15.7/33%  190
  243        -
  244        - (q)sgi-magique        460  22/22/98%  42  24.5/25/99%  40
        256  + (p)ibm-aix-regatta    1750/730  6.25/18.9/33%  160  5.25/15.7/33%  190
        257  + (q)ibm-aix-meso       3600/1250  3.95/4.3/91%  250  3.82/4./94%  260
        258  +     2 jobs // ~2x     3.88/4.2/92%  250
        259  +
        260  +
        261  + (s)sgi-magique        460  22/22/98%  42  24.5/25/99%  40
  245   262    -----------------------------------------------------------------------------------
  246   263
  247   264
  248        - D/ Efficiency of lock (mutex) handling with threads
  249        - -----------------------------------------------------------
  250        - (32 threads - operating on 2000 vectors ~ 64000 lock/unlock/wait/broadcast)
  251        -
  252        - csh> time zthr syncp 32 2000 4
  253        - (1) time cpupower 2
  254        -
  255        - ----------------------------------------------------------------------
  256        -                       (1)MFLOPS  CPU(s)  IndPerf  TCPU/Elapsed %
  257        - ----------------------------------------------------------------------
  258        - (a)xeon-lx-2.4GHz     1167  17.8  178%
  259        -     with -O3 -g
  260        - (b)xeon-lx-2.8GHz     (2)
  261        - (c)amd-lx             690  0.4  31%
  262        -
  263        - (d)osf-xp1000         (1)
  264        - (e)superosf           (1)
  265        - (f)osf-asc            (1)  420  5.4  100%
  266        -
  267        - (f)G4-osx-1.25GHz     (1)  333  64  96%
  268        - (h)G5-osx-1GHz        (2)  1150  2.6  130%
        265  + D/ Matrix inversion with lapack
        266  + -------------------------------
        267  +
        268  + lpk inverse 1000,1000 0
        269  + ---> computation time for inversion with lapack
        270  + -------------------------------------------------------------------------------------------
        271  + CPU/Elap/%            (1)
        272  + -------------------------------------------------------------------------------------------
        273  + (a)xeon-lx-2.4GHz     5.6/~100%
        274  + (b)xeon-lx-2.8GHz
        275  + (c)amd-lx             5.5/5.5/99%
        276  +
        277  + (d)osf-asc
        278  + (e')cool              2.8/2.9/95%
        279  + (f)superosf
        280  +
        281  + (f)G4-osx-1.25GHz
        282  + (h)G5-osx-1GHz        0.8/~100%
  269   283        -tune=G5
  270   284    (i)core-osx-1.83GHz
  271   285        -O2
  272        - (j)xeon-osx           2600  2.55  143%
  273        -
  274        - (p)ibm-aix            700  4.7  111%
  275        - ----------------------------------------------------------------------
        286  + (p)ibm-aix-regatta
        287  + (q)ibm-aix-meso       0.55/~100%
        288  +
        289  + --------------------------------------------------------------------------------------------
        290  +
        291  +
        292  + K/ Efficiency of lock (mutex) handling with threads and arrays
        293  + ----------------------------------------------------------------------
        294  + (32 threads - operating on 2000 vectors ~ 64000 lock/unlock/wait/broadcast)
        295  +
        296  + (1) time zthr syncp 32 2000 4
        297  + (2) time zthr sync 32 2000 4
        298  + (1) time zthr syncp 4 15000 130
        299  + (2) time zthr sync 4 15000 130
        300  + -------------------------------------------------------------------------------------------
        301  + CPU/Elap/%            (1)  (2)  (3)  (4)
        302  + -------------------------------------------------------------------------------------------
        303  + (a)xeon-lx-2.4GHz     23.5/14/168%  4.3/1.2/365%  7.9/5.5/142%  4/2.15/190%
        304  +     Before ThSafeOp   17/178%
        305  +     with -O3 -g
        306  + (b)xeon-lx-2.8GHz     (2)
        307  + (c)amd-lx             0.6/1/63%  0.6/1/60%  3.5/3.5/102%  2.6/2.7/98%
        308  +
        309  + (d)osf-asc            4.5/3.4/132%  3.35/2/170%  15.8/10.5/150%  13/8/163%
        310  +     5.4/100%(NoThSafe)
        311  + (e')cool              1.3/1.37/95%  1.35/1.5/89%  5.3/5.3/99%  5.2/5.2/99%
        312  + (e)superosf           (1)
        313  + (
        314  + (f)G4-osx-1.25GHz     (1)  40.5/42.6/95%  42.2/43.5/97%
        315  +
        316  + (h)G5-osx-1GHz        (2)  2.6/130% (NoThSafe)
        317  +     -tune=G5
        318  + (i)core-osx-1.83GHz
        319  +     -O2
        320  + (j)xeon-osx
        321  +     Before ThSafeOp   2.55/143%
        322  +
        323  + (p)ibm-aix-regatta    4.7/111%
        324  + (q)ibm-aix-meso       7.5/2.8/300%  17/3.8/450%  8.2/3.05/270%  4.85/2.43/200%
        325  +
        326  + --------------------------------------------------------------------------------------------
        327  +
        328  +
        329  +
        330  + L/ I/O and PPF
        331  + -----------------
        332  + Writing/reading n=10^7 rows of int+6double, total ~ 500 MB
        333  + (1) time tstdtable w xx.ppf swap 10000000 1024 0
        334  + (2) time tstdtable r xx.ppf swap 10000000 1024 0
        335  + (3) time tstdtable w xx.ppf swap 50000000 1024 0
        336  + (4) time tstdtable r xx.ppf swap 50000000 1024 0
        337  +
        338  + -------------------------------------------------------------------------------------------
        339  + CPU/Elap/%            (1)  (2)  (3)  (4)
        340  + -------------------------------------------------------------------------------------------
        341  + (a)xeon-lx-2.4GHz     17/26/63%  5.5/5.6/94%
        342  + (b)xeon-lx-2.8GHz
        343  + (c)amd-lx             5.9/6./97%  3.4/3.4/100%  30/32/93%  24/165/13% ?
        344  +
        345  + (d)osf-asc
        346  + (e')cool              15/30/50%  13/13/99%
        347  + (f)superosf
        348  +
        349  + (f)G4-osx-1.25GHz
        350  + (h)G5-osx-1GHz
        351  +     -tune=G5
        352  + (i)core-osx-1.83GHz
        353  +     -O2
        354  + (p)ibm-aix-regatta
        355  + (q)ibm-aix-meso       5.5/16.8/38%  5.7/13.2/43%  32.7/85/39%  29/60/49%
        356  +
        357  + --------------------------------------------------------------------------------------------
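The CPU/Elap/% entries in the tables appear to follow the output of the csh `time` builtin: CPU seconds, elapsed seconds, and their ratio as a percentage, so values well above 100% suggest that several threads ran in parallel. Below is a minimal Python sketch of that arithmetic, assuming this reading of the columns. The helper names `parse_triplet` and `cpu_percent` are hypothetical and are not part of Sophya.

```python
def parse_triplet(field: str) -> tuple[float, float, float]:
    """Split an entry such as '16.9/8.8/192%' into (cpu_s, elapsed_s, percent).

    Assumption: entries have exactly three '/'-separated parts, with an
    optional trailing 's' on the times and a trailing '%' on the ratio.
    """
    cpu, elapsed, pct = field.split("/")
    return float(cpu.rstrip("s")), float(elapsed.rstrip("s")), float(pct.rstrip("%"))


def cpu_percent(cpu_s: float, elapsed_s: float) -> float:
    """CPU time as a percentage of elapsed (wall-clock) time."""
    if elapsed_s <= 0:
        raise ValueError("elapsed time must be positive")
    return 100.0 * cpu_s / elapsed_s


if __name__ == "__main__":
    # Entry taken from the osf-asc row of table B.1 (2-thread run).
    cpu, elapsed, reported = parse_triplet("16.9/8.8/192%")
    print(f"recomputed {cpu_percent(cpu, elapsed):.0f}%, reported {reported:.0f}%")
```

Several reported percentages differ slightly from the recomputed ratio, presumably because the times in the tables were rounded after the percentage was measured.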