source: Sophya/trunk/Eval/JET/results.txt@ 2368

Last change on this file since 2368 was 2368, checked in by ansari, 22 years ago
Results files added - Reza 23/04/2003
File size: 4.3 KB

Performance of C++ array classes and expression templates
---------------------------------------------------------

<<<< Comparison/results of 23/04/2003 >>>>

A/ Overall performance / in particular element access
B/ Contribution of ET (Expression Templates) (JET)
C/ Comparison with Fortran (f77) code and BLAS
D/ Comparison with SOPHYA arrays (TMatrix<r_8>)

E/ List of operations
1- (denoted ElAcc) Filling an array through element access of the form
   Matrix mx(nrow, ncol)
   mx(i,j) = expression (i, j, ...)

   ElAcc C++: overloaded operator (i,j)
   ElAcc fortran: native Fortran arrays

2- Operation of the form (denoted CMAdd), with c1, c2, c3 three constants
   mx = mx1*c1 + mx2*c2 + mx3*c3
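As an illustration of the two operations listed above, here is a minimal C++ sketch. DemoMatrix, FillElAcc and LinCombCMAdd are hypothetical names introduced for this example; they are not the SimpleMatrix<T> or SOPHYA::TMatrix<T> code that was benchmarked, and the fill expression is an arbitrary placeholder.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the benchmarked matrix classes
// (not the actual SimpleMatrix<T> or SOPHYA::TMatrix<T>).
class DemoMatrix {
public:
  DemoMatrix(std::size_t nrow, std::size_t ncol)
      : nrow_(nrow), ncol_(ncol), data_(nrow * ncol, 0.0) {}

  // Element access through an overloaded operator (i,j), row-major storage.
  double& operator()(std::size_t i, std::size_t j) { return data_[i * ncol_ + j]; }
  double operator()(std::size_t i, std::size_t j) const { return data_[i * ncol_ + j]; }

  std::size_t NRows() const { return nrow_; }
  std::size_t NCols() const { return ncol_; }

private:
  std::size_t nrow_, ncol_;
  std::vector<double> data_;
};

// Operation 1 (ElAcc): mx(i,j) = expression(i, j, ...).
// The expression below is a placeholder chosen for the example.
void FillElAcc(DemoMatrix& mx, double scale) {
  for (std::size_t i = 0; i < mx.NRows(); ++i)
    for (std::size_t j = 0; j < mx.NCols(); ++j)
      mx(i, j) = scale * static_cast<double>(i) + static_cast<double>(j);
}

// Operation 2 (CMAdd): mx = mx1*c1 + mx2*c2 + mx3*c3, as one explicit loop.
// All four matrices are assumed to have the same dimensions.
void LinCombCMAdd(DemoMatrix& mx, const DemoMatrix& mx1, double c1,
                  const DemoMatrix& mx2, double c2,
                  const DemoMatrix& mx3, double c3) {
  for (std::size_t i = 0; i < mx.NRows(); ++i)
    for (std::size_t j = 0; j < mx.NCols(); ++j)
      mx(i, j) = mx1(i, j) * c1 + mx2(i, j) * c2 + mx3(i, j) * c3;
}
```

The explicit double loop makes a single pass over the data and creates no temporary matrices, so it is a natural point of comparison for class-based variants.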


* Loop 50 times over 1000x500 arrays *


Linux:
======
Fortran program, compiled with g77 -O3
C++ program, compiled with -O3; SOPHYA with the usual flags (-g -O ?)
>> eros3> uname -a
>> Linux eros3 2.4.18 #10 SMP Mon Dec 16 12:45:16 CET 2002 i686 unknown
>> Intel(R) Xeon(TM) CPU 2.40GHz

Programs: lx_fmtx , lx_tjet / Commands:
csh> time lx_fmtx
csh> time lx_tjet 50 1000 500
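The csh "time" builtin reports CPU time for the whole process. To time individual operations inside a program, one option is std::clock. The sketch below is an assumption about how such a measurement could be written, not a description of how lx_tjet actually measures its timings. CpuSeconds and FillArray are hypothetical names. Reading the arguments 50 1000 500 as loop count, rows and columns is suggested by the "Loop 50 times over 1000x500 arrays" line but is not confirmed by the file.

```cpp
#include <cstddef>
#include <ctime>
#include <vector>

// Runs body() nloop times and returns the CPU time consumed, in seconds.
// std::clock measures processor time used by the process, not wall time.
template <typename Body>
double CpuSeconds(int nloop, Body body) {
  const std::clock_t start = std::clock();
  for (int k = 0; k < nloop; ++k) body();
  const std::clock_t stop = std::clock();
  return static_cast<double>(stop - start) / CLOCKS_PER_SEC;
}

// Placeholder workload: fills a row-major nrow x ncol array through a
// plain index, in the spirit of the pointer-based access variant.
void FillArray(std::vector<double>& p, std::size_t nrow, std::size_t ncol) {
  for (std::size_t i = 0; i < nrow; ++i)
    for (std::size_t j = 0; j < ncol; ++j)
      p[i * ncol + j] = static_cast<double>(i) + 0.5 * static_cast<double>(j);
}
```

A call such as CpuSeconds(50, [&] { FillArray(v, 1000, 500); }) would then mirror the 50 x (1000x500) configuration used here.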
	38
	39
	40	Tru64/OSF:
	41	==========
	42	BLAS optimise de Compaq(/DEC -> HP) -lcxml
	43	Programmes compile avec -O3 -> osfO3_fmtx osfO3_tjet
	44	-arch host -fat -> ascfast_fmtx ascfast_tjet
	45	fortran avec -O5 -> osfO5_fmtx
	46
	47	>> asc.lal.in2p3.fr> uname -a
	48	>> OSF1 asc.lal.in2p3.fr V5.1 2650 alpha
	49	>> ES47 Chip Alpha EV7 @ 1 GHz
	50
	51
	52	====================================================
	53	Temps CPU en secondes
	54	====================================================


<ElAcc>:
ElAcc_1 : Fortran, native access to 2-D arrays
          C/C++, pointer double * p = new double[size]; access p[i]
ElAcc_2 : Class SimpleMatrix<T> / operator overloading
ElAcc_3 : Class SOPHYA::TMatrix<T> / operator overloading

------------------------------------------------------------
                  ElAcc_1     ElAcc_2     ElAcc_3
------------------------------------------------------------
lx_fmtx:          1.59
lx_tjet:          0.92        0.84        0.9
............................................................
osfO3_fmtx:       3.31
osfO5_fmtx:       0.46
ascfast_fmtx:     3.21
............................................................
osfO3_tjet:       1.0         1.06        1.03
ascfast_tjet:     0.66        0.73        0.70
------------------------------------------------------------


<CMAdd>
CMAdd_1 : Fortran, loop + native access to 2-D arrays
          C/C++, pointer double * p = new double[size]; loop p[i] = q[i] ....
CMAdd_2 : Fortran / BLAS calls (copy / CstMult / VecAdd)
CMAdd_3 : C++/SimpleMatrix<T>::MultCst() / AddElt()
CMAdd_4 : SOPHYA::TMatrix<T>::MultCst() / AddElt()
CMAdd_5 : C++/JET : SimpleMatrix<T>:: operator overload with Exp. Templates
CMAdd_6 : SOPHYA::TMatrix<T>::operator overload
CMAdd_5, CMAdd_6 : mx = mx1*c1 + mx2*c2 + mx3*c3
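To make the expression-template idea behind the CMAdd_5 variant concrete, here is a minimal sketch. It is not the JET code: Expr, Scaled, Sum and EtMatrix are hypothetical names, and the design (CRTP base, nodes holding const references) is only one common way to build such templates. The point it illustrates is that mx = mx1*c1 + mx2*c2 + mx3*c3 builds a lightweight expression object and is evaluated in a single loop, without temporary matrices.

```cpp
#include <cstddef>
#include <vector>

// CRTP base: lets the operators below accept any expression node.
template <typename E>
struct Expr {
  const E& self() const { return static_cast<const E&>(*this); }
};

// Node for (expression * constant). Operands are held by reference, so the
// whole expression must be consumed within the statement that builds it.
template <typename E>
class Scaled : public Expr<Scaled<E> > {
public:
  Scaled(const E& e, double c) : e_(e), c_(c) {}
  double operator[](std::size_t i) const { return e_[i] * c_; }
private:
  const E& e_;
  double c_;
};

// Node for (expression + expression).
template <typename L, typename R>
class Sum : public Expr<Sum<L, R> > {
public:
  Sum(const L& l, const R& r) : l_(l), r_(r) {}
  double operator[](std::size_t i) const { return l_[i] + r_[i]; }
private:
  const L& l_;
  const R& r_;
};

// Hypothetical matrix class with flat row-major storage.
class EtMatrix : public Expr<EtMatrix> {
public:
  EtMatrix(std::size_t nrow, std::size_t ncol, double v = 0.0)
      : ncol_(ncol), data_(nrow * ncol, v) {}
  double operator[](std::size_t i) const { return data_[i]; }
  double operator()(std::size_t i, std::size_t j) const { return data_[i * ncol_ + j]; }

  // Assignment from any expression: one pass, no temporary matrices.
  // Operand sizes are assumed to match; no check is done in this sketch.
  template <typename E>
  EtMatrix& operator=(const Expr<E>& expr) {
    const E& e = expr.self();
    for (std::size_t i = 0; i < data_.size(); ++i) data_[i] = e[i];
    return *this;
  }
private:
  std::size_t ncol_;
  std::vector<double> data_;
};

template <typename E>
Scaled<E> operator*(const Expr<E>& e, double c) {
  return Scaled<E>(e.self(), c);
}

template <typename L, typename R>
Sum<L, R> operator+(const Expr<L>& l, const Expr<R>& r) {
  return Sum<L, R>(l.self(), r.self());
}
```

By contrast, an operator overload that returns a full matrix from each * and + allocates and traverses several temporaries for the same statement.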

--------------------------------------------------------------------------
CMAdd_1 CMAdd_2 CMAdd_3 CMAdd_4 CMAdd_5 CMAdd_6
--------------------------------------------------------------------------
lx_fmtx: 2.58 1.11
lx_tjet: 0.62 2.26 0.52 2.22 4.04
..........................................................................
osfO3_fmtx: 4.31 0.63
osfO5_fmtx: 0.26 0.71
ascfast_fmtx: 4.13 0.63
..........................................................................
osfO3_tjet: 0.65 1.48 1.36 3.08 3.53
ascfast_tjet: 1.06 1.91 1.84 3.81 3.83
--------------------------------------------------------------------------


Notes:
1/ The f77 -O5 performance figures must be rechecked: the Fortran (/f90)
optimizer performs aggressive optimizations in some cases, removing the
very loops we want to test. Indeed, depending on the optimization level,
the DEC/Compaq (/HP) Fortran compiler manages to drive the computation
time toward zero in some cases, independently of the array size!
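One common precaution against this kind of loop elimination is to make every pass contribute to a value the program actually uses. The C++ sketch below illustrates the idea. It is an assumption added here, not something the benchmark programs are known to do, and it does not guarantee that a given optimizer keeps the loop. KernelWithChecksum is a hypothetical name.

```cpp
#include <cstddef>
#include <vector>

// Runs a CMAdd-style loop nloop times over arrays of n elements (n > 0)
// and returns a checksum. The caller should print or otherwise use the
// returned value, so the computed results are observable.
double KernelWithChecksum(int nloop, std::size_t n,
                          double c1, double c2, double c3) {
  std::vector<double> mx(n, 0.0), mx1(n, 1.0), mx2(n, 2.0), mx3(n, 3.0);
  double checksum = 0.0;
  for (int k = 0; k < nloop; ++k) {
    const std::size_t idx = static_cast<std::size_t>(k) % n;
    // Change one input element per pass so the passes are not identical.
    mx1[idx] += 1.0;
    for (std::size_t i = 0; i < n; ++i)
      mx[i] = mx1[i] * c1 + mx2[i] * c2 + mx3[i] * c3;
    // Consume part of the result of this pass.
    checksum += mx[idx];
  }
  return checksum;
}
```

Comparing timings at several array sizes is a further sanity check: a time that does not grow with the array size suggests the loop was optimized away.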

2/ The -arch host -fast optimization (fast executable code tuned to the
architecture of the host machine's processor) fails to improve
performance. This may be attributable to the SOPHYA library, compiled
with -g -O for a generic Alpha processor.
