source: Sophya/trunk/Eval/JET/results.txt@ 2368

Last change on this file since 2368 was 2368, checked in by ansari, 22 years ago
Result files added - Reza 23/04/2003
File size: 4.3 KB

Performance of C++ array classes and expression templates
---------------------------------------------------------

<<<< Comparison/results of 23/04/2003 >>>>

A/ Overall performance, in particular element access
B/ Contribution of ET (Expression Templates) (JET)
C/ Comparison with Fortran (f77) code and BLAS
D/ Comparison with SOPHYA arrays (TMatrix<r_8>)

E/ List of operations
   1- (denoted ElAcc) Filling an array through element access, of the form
        Matrix mx(nrow, ncol)
        mx(i,j) = expression (i, j, ...)

      ElAcc C++: overloaded operator (i,j)
      ElAcc Fortran: native Fortran arrays

   2- Operation of the form (denoted CMAdd), with c1, c2, c3 three constants
        mx = mx1*c1 + mx2*c2 + mx3*c3

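Operation 2 can be written either as one fused loop or as a sequence of whole-array passes. The sketch below shows both shapes on flat arrays. The helper names (cmadd_fused, scale_to, add_scaled, cmadd_passes) are hypothetical and do not reproduce the BLAS, SimpleMatrix or SOPHYA interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef std::vector<double> Vec;

// Fused form: a single loop, each output element written once.
inline void cmadd_fused(Vec& mx, const Vec& mx1, double c1,
                        const Vec& mx2, double c2,
                        const Vec& mx3, double c3) {
    assert(mx.size() == mx1.size() && mx.size() == mx2.size() &&
           mx.size() == mx3.size());
    for (std::size_t i = 0; i < mx.size(); ++i)
        mx[i] = mx1[i] * c1 + mx2[i] * c2 + mx3[i] * c3;
}

// Hypothetical whole-array primitive: out = in * c.
inline void scale_to(Vec& out, const Vec& in, double c) {
    assert(out.size() == in.size());
    for (std::size_t i = 0; i < out.size(); ++i) out[i] = in[i] * c;
}

// Hypothetical whole-array primitive: out += in * c.
inline void add_scaled(Vec& out, const Vec& in, double c) {
    assert(out.size() == in.size());
    for (std::size_t i = 0; i < out.size(); ++i) out[i] += in[i] * c;
}

// Multi-pass form: three sweeps over the output array instead of one.
inline void cmadd_passes(Vec& mx, const Vec& mx1, double c1,
                         const Vec& mx2, double c2,
                         const Vec& mx3, double c3) {
    scale_to(mx, mx1, c1);
    add_scaled(mx, mx2, c2);
    add_scaled(mx, mx3, c3);
}
```

Both functions compute the same values; they differ only in how many times the output array is traversed.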

* Loop 50 times over 1000x500 arrays *
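
A minimal sketch of how such a repeated-loop measurement could be timed from inside a program, assuming std::clock() as the CPU-time source. The helper cpu_seconds and the workload fill are hypothetical; they are not taken from lx_tjet or lx_fmtx, and the fill expression is arbitrary.

```cpp
#include <cassert>
#include <cstddef>
#include <ctime>
#include <vector>

// Hypothetical helper: calls work(k) for k = 0 .. nloop-1 and returns
// the CPU time consumed by those calls, in seconds.
template <typename Work>
double cpu_seconds(int nloop, Work work) {
    assert(nloop > 0);
    const std::clock_t start = std::clock();
    for (int k = 0; k < nloop; ++k) work(k);
    return static_cast<double>(std::clock() - start) / CLOCKS_PER_SEC;
}

// Illustrative workload: fill an nrow x ncol array stored contiguously
// (row-major layout is an assumption of this sketch).
inline void fill(std::vector<double>& mx, std::size_t nrow,
                 std::size_t ncol, int k) {
    assert(mx.size() == nrow * ncol);
    for (std::size_t i = 0; i < nrow; ++i)
        for (std::size_t j = 0; j < ncol; ++j)
            mx[i * ncol + j] = 2.0 * i + 3.0 * j + k;
}
```

A call such as cpu_seconds(50, [&](int k) { fill(mx, 1000, 500, k); }) mirrors the 50 x (1000x500) configuration. The shell time command, by contrast, measures the whole process, including allocation and start-up.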


Linux:
======
Fortran program compiled with g77 -O3
C++ program compiled with -O3, SOPHYA with the usual flags (-g -O ?)
>> eros3> uname -a
>> Linux eros3 2.4.18 #10 SMP Mon Dec 16 12:45:16 CET 2002 i686 unknown
>> Intel(R) Xeon(TM) CPU 2.40GHz

Programs: lx_fmtx , lx_tjet / Commands:
csh> time lx_fmtx
csh> time lx_tjet 50 1000 500


Tru64/OSF:
==========
Optimized BLAS from Compaq (/DEC -> HP): -lcxml
Programs compiled with -O3              -> osfO3_fmtx    osfO3_tjet
                       -arch host -fast -> ascfast_fmtx  ascfast_tjet
Fortran compiled with  -O5              -> osfO5_fmtx

>> asc.lal.in2p3.fr> uname -a
>> OSF1 asc.lal.in2p3.fr V5.1 2650 alpha
>> ES47 Chip Alpha EV7 @ 1 GHz


====================================================
CPU time in seconds
====================================================


<ElAcc>:
ElAcc_1 : Fortran, native 2-D array access
          C/C++, pointer double * p = new double[size]; access p[i]
ElAcc_2 : class SimpleMatrix<T> / operator overloading
ElAcc_3 : class SOPHYA::TMatrix<T> / operator overloading

------------------------------------------------------------
                 ElAcc_1     ElAcc_2     ElAcc_3
------------------------------------------------------------
lx_fmtx:          1.59         -           -
lx_tjet:          0.92        0.84        0.9
............................................................
osfO3_fmtx:       3.31         -           -
osfO5_fmtx:       0.46         -           -
ascfast_fmtx:     3.21         -           -
............................................................
osfO3_tjet:       1.0         1.06        1.03
ascfast_tjet:     0.66        0.73        0.70
------------------------------------------------------------
( - : no value for this program )
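
The two C++ access styles compared in the ElAcc columns can be sketched as follows. DemoMatrix is a hypothetical stand-in written for this illustration; it is not the SimpleMatrix<T> or SOPHYA::TMatrix<T> source, and its row-major layout and fill expression are assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical matrix class with element access through operator()(i,j).
template <typename T>
class DemoMatrix {
public:
    DemoMatrix(std::size_t nrow, std::size_t ncol)
        : nrow_(nrow), ncol_(ncol), data_(nrow * ncol) {}

    // Bounds are checked by assert only; compiling with -DNDEBUG removes the check.
    T& operator()(std::size_t i, std::size_t j) {
        assert(i < nrow_ && j < ncol_);
        return data_[i * ncol_ + j];
    }
    const T& operator()(std::size_t i, std::size_t j) const {
        assert(i < nrow_ && j < ncol_);
        return data_[i * ncol_ + j];
    }
    std::size_t nrow() const { return nrow_; }
    std::size_t ncol() const { return ncol_; }

private:
    std::size_t nrow_, ncol_;
    std::vector<T> data_;
};

// ElAcc_1 style: raw pointer with an explicit flat index.
inline void fill_raw(double* p, std::size_t nrow, std::size_t ncol) {
    for (std::size_t i = 0; i < nrow; ++i)
        for (std::size_t j = 0; j < ncol; ++j)
            p[i * ncol + j] = 10.0 * i + j;
}

// ElAcc_2 / ElAcc_3 style: the same fill through the overloaded operator.
template <typename M>
void fill_op(M& mx) {
    for (std::size_t i = 0; i < mx.nrow(); ++i)
        for (std::size_t j = 0; j < mx.ncol(); ++j)
            mx(i, j) = 10.0 * i + j;
}
```

When operator() is inlined, both fills reduce to the same index arithmetic, which is what the comparison between the columns probes.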


<CMAdd>:
CMAdd_1 : Fortran, loop + native 2-D array access
          C/C++, pointer double * p = new double[size]; loop p[i] = q[i] ....
CMAdd_2 : Fortran / BLAS calls (copy / CstMult / VecAdd)
CMAdd_3 : C++/SimpleMatrix<T>::MultCst() / AddElt()
CMAdd_4 : SOPHYA::TMatrix<T>::MultCst() / AddElt()
CMAdd_5 : C++/JET : SimpleMatrix<T>:: operator overloading with expression templates
CMAdd_6 : SOPHYA::TMatrix<T>:: operator overloading
CMAdd_5, CMAdd_6 : mx = mx1*c1 + mx2*c2 + mx3*c3

--------------------------------------------------------------------------
               CMAdd_1  CMAdd_2  CMAdd_3  CMAdd_4  CMAdd_5  CMAdd_6
--------------------------------------------------------------------------
lx_fmtx:        2.58     1.11      -        -        -        -
lx_tjet:        0.62      -       2.26     0.52     2.22     4.04
..........................................................................
osfO3_fmtx:     4.31     0.63      -        -        -        -
osfO5_fmtx:     0.26     0.71      -        -        -        -
ascfast_fmtx:   4.13     0.63      -        -        -        -
..........................................................................
osfO3_tjet:     0.65      -       1.48     1.36     3.08     3.53
ascfast_tjet:   1.06      -       1.91     1.84     3.81     3.83
--------------------------------------------------------------------------
( - : no value for this program )
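
For readers unfamiliar with the technique named in CMAdd_5, here is a minimal sketch of expression templates applied to mx = mx1*c1 + mx2*c2 + mx3*c3. It is not the JET or SOPHYA implementation: every class name (Expr, Scaled, Sum, Array) is hypothetical, and only the two operators needed for this expression are provided.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// CRTP base: tags a type as an array expression.
template <typename E>
struct Expr {
    const E& self() const { return static_cast<const E&>(*this); }
};

// Lazy node for (expression * constant); nothing is computed here.
template <typename E>
class Scaled : public Expr<Scaled<E> > {
    const E& e_;
    double c_;
public:
    Scaled(const E& e, double c) : e_(e), c_(c) {}
    double operator[](std::size_t i) const { return e_[i] * c_; }
    std::size_t size() const { return e_.size(); }
};

// Lazy node for (expression + expression).
template <typename L, typename R>
class Sum : public Expr<Sum<L, R> > {
    const L& l_;
    const R& r_;
public:
    Sum(const L& l, const R& r) : l_(l), r_(r) { assert(l.size() == r.size()); }
    double operator[](std::size_t i) const { return l_[i] + r_[i]; }
    std::size_t size() const { return l_.size(); }
};

class Array : public Expr<Array> {
    std::vector<double> d_;
public:
    explicit Array(std::size_t n, double v = 0.0) : d_(n, v) {}
    double operator[](std::size_t i) const { return d_[i]; }
    double& operator[](std::size_t i) { return d_[i]; }
    std::size_t size() const { return d_.size(); }

    // Assigning an expression runs one loop; no temporary array is allocated.
    template <typename E>
    Array& operator=(const Expr<E>& e) {
        const E& x = e.self();
        assert(x.size() == size());
        for (std::size_t i = 0; i < size(); ++i) d_[i] = x[i];
        return *this;
    }
};

template <typename E>
Scaled<E> operator*(const Expr<E>& e, double c) {
    return Scaled<E>(e.self(), c);
}

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
    return Sum<L, R>(l.self(), r.self());
}
```

The nodes hold references to their operands, so an expression should be assigned within the statement that builds it (mx = mx1*c1 + mx2*c2 + mx3*c3;) rather than stored for later use.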


Notes:
1/ The f77 -O5 performance figures must be rechecked: the Fortran (/f90)
   optimizer applies aggressive optimizations in some cases and removes
   the very loops we want to test. Depending on the optimization level,
   the DEC/Compaq (/HP) Fortran compiler manages to drive the computation
   time towards zero in some cases, independently of the array size!

2/ The -arch host -fast optimization (fast executable code tuned to the
   architecture of the host machine's processor) fails to improve
   performance. This may be attributable to the SOPHYA library, which is
   compiled with -g -O for a generic Alpha processor.
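
Regarding note 1, the usual precaution against an optimizer discarding a benchmark loop is to make the computed values observable. The sketch below shows the idea in C++ (the problem was observed with Fortran; the remedy is generic). keep_alive and cmadd_checked are hypothetical helpers, not part of the original test programs, and the checksum pass adds a read of the array to the measured work.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sink: a volatile write must be kept by the compiler.
inline void keep_alive(double value) {
    static volatile double sink;
    sink = value;
}

// Runs a CMAdd-style loop nloop times over arrays of n elements and
// returns a checksum that depends on every element of every pass.
inline double cmadd_checked(std::size_t n, int nloop) {
    assert(n > 0 && nloop > 0);
    std::vector<double> mx(n), mx1(n, 1.0), mx2(n, 2.0), mx3(n, 3.0);
    double checksum = 0.0;
    for (int k = 0; k < nloop; ++k) {
        const double c1 = k + 1.0;  // varies per pass, so passes are not identical
        for (std::size_t i = 0; i < n; ++i)
            mx[i] = mx1[i] * c1 + mx2[i] * 2.0 + mx3[i] * 3.0;
        for (std::size_t i = 0; i < n; ++i)
            checksum += mx[i];      // consume the result of this pass
        keep_alive(checksum);       // make the checksum observable
    }
    return checksum;
}
```

Printing or comparing the returned checksum also gives a quick sanity check that the timed loop actually ran for the requested array size.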
