NOTE: This patch has been committed. The version below is
informational only (whitespace differences have been removed).
ChangeLog addition:
2006-01-09 Didier Verna <didier(a)lrde.epita.fr>
* 2.0/src/dd/cl/tsi.cl: Fix size and nsteps parameters.
* 2.0/txt/bench.didier: New.
GSC source patch:
Diff command: svn diff --diff-cmd /usr/bin/diff -x "-u -t -b -B -w"
Files affected: 2.0/src/dd/cl/tsi.cl 2.0/txt/bench.didier
Index: 2.0/txt/bench.didier
===================================================================
--- 2.0/txt/bench.didier (revision 0)
+++ 2.0/txt/bench.didier (revision 0)
@@ -0,0 +1,331 @@
+ BENCHMARK RESULTS
+
+
+* Architecture
+
+Debian unstable.
+
+** uname -a
+
+Linux uzeb 2.4.27-2-686-smp #1 SMP Wed Nov 30 21:47:06 JST 2005 i686 GNU/Linux
+
+Note the SMP flag: the CPU has hyperthreading turned on; the OS sees two
+virtual processors.
+
+** cat /proc/cpuinfo
+
+processor : 0
+vendor_id : GenuineIntel
+cpu family : 15
+model : 3
+model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
+stepping : 4
+cpu MHz : 2992.789
+cache size : 1024 KB
+fdiv_bug : no
+hlt_bug : no
+f00f_bug : no
+coma_bug : no
+fpu : yes
+fpu_exception : yes
+cpuid level : 5
+wp : yes
+flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl cid
+bogomips : 5976.88
+
+processor : 1
+vendor_id : GenuineIntel
+cpu family : 15
+model : 3
+model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
+stepping : 4
+cpu MHz : 2992.789
+cache size : 1024 KB
+fdiv_bug : no
+hlt_bug : no
+f00f_bug : no
+coma_bug : no
+fpu : yes
+fpu_exception : yes
+cpuid level : 5
+wp : yes
+flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe pni monitor ds_cpl cid
+bogomips : 5976.88
+
+
+
+* Benchmarks
+
+** Environment
+
+*** C
+
+gcc (GCC) 4.0.3 20051201 (prerelease) (Debian 4.0.2-5)
+Copyright (C) 2005 Free Software Foundation, Inc.
+This is free software; see the source for copying conditions. There is NO
+warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
+
+gcc -O3 -DNDEBUG
+
+*** Java
+
+java version "1.5.0_06"
+Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)
+Java HotSpot(TM) Client VM (build 1.5.0_06-b05, mixed mode, sharing)
+
+javac: same version
+
+*** Eiffel
+
+SmartEiffel The GNU Eiffel Compiler, Eiffel tools and libraries
+Release 1.1 Release (Monday June 16th 2003) [Charlemagne]
+Copyright (C), 1994-2003 - INRIA - LORIA - UHP - Nancy 2 - FRANCE
+D.COLNET, S.COLLIN, O.ZENDRA, P.RIBET, C.ADRIAN - SmartEiffel(a)loria.fr
+http://SmartEiffel.loria.fr
+
+compile_to_c -boost / then C compilation
+
+*** Common Lisp
+
+XEmacs / Slime / CMU-CL
+
+CMU Common Lisp CVS 19c 19c-release + minimal debian patches (19C), running on uzeb
+With core: /usr/lib/cmucl/lisp.core
+Dumped on: Mon, 2005-12-12 10:05:58+01:00 on uzeb
+
+Loaded subsystems:
+ Python 1.1, target Intel x86
+ CLOS based on Gerd's PCL 2004/04/14 03:32:47
+
+
+** Dedicated versions
+
+*** C
+
+Linear / 800x800 / 200 steps:
+SB: 0.31
+MB: 0.33
+
+Randomized / 800x800 / 200 steps:
+SB: 10.82
+MB: 8.86
+
+
+*** Java
+
+Linear / 800x800 / 200 steps:
+SB: 0.35
+MB: 0.57
+
+Randomized / 800x800 / 200 steps:
+SB: 11.00
+MB: 9.10
+
+
+*** Eiffel
+
+Linear / 800x800 / 200 steps:
+SB: 0.40
+MB: 1.06
+
+Randomized / 800x800 / 200 steps:
+SB: 11.04
+MB: 9.00
+
+*** Common Lisp
+
+;;; Optimized benches for 200 step(s):
+;; Linear / Untyped / Multibuffer:
+; Evaluation took:
+; 2.48 seconds of real time
+; 2.49 seconds of user run time
+; 0.0 seconds of system run time
+; 7,429,347,345 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Untyped / Singlebuffer / AREF:
+; Evaluation took:
+; 1.32 seconds of real time
+; 1.29 seconds of user run time
+; 0.02 seconds of system run time
+; 3,949,483,710 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Untyped / Singlebuffer / SVREF:
+; Evaluation took:
+; 1.12 seconds of real time
+; 1.11 seconds of user run time
+; 0.0 seconds of system run time
+; 3,345,343,913 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+;;; Optimized benches for 200 step(s):
+;; Randomized / Untyped / Multibuffer:
+; Evaluation took:
+; 42.61 seconds of real time
+; 42.55 seconds of user run time
+; 0.0 seconds of system run time
+; 127,510,882,845 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Untyped / Singlebuffer / AREF:
+; Evaluation took:
+; 38.69 seconds of real time
+; 38.66 seconds of user run time
+; 0.01 seconds of system run time
+; 115,775,643,570 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Untyped / Singlebuffer / SVREF:
+; Evaluation took:
+; 38.76 seconds of real time
+; 38.63 seconds of user run time
+; 0.11 seconds of system run time
+; 116,005,607,213 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+;;; Optimized benches for 200 step(s):
+;; Linear / Typed / Multibuffer:
+; Evaluation took:
+; 2.82 seconds of real time
+; 2.83 seconds of user run time
+; 0.0 seconds of system run time
+; 8,467,002,300 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Typed / Singlebuffer / AREF:
+; Evaluation took:
+; 0.55 seconds of real time
+; 0.55 seconds of user run time
+; 0.0 seconds of system run time
+; 1,636,237,845 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Typed / Singlebuffer / SVREF:
+; Evaluation took:
+; 0.55 seconds of real time
+; 0.52 seconds of user run time
+; 0.03 seconds of system run time
+; 1,651,320,795 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+;;; Optimized benches for 200 step(s):
+;; Randomized / Typed / Multibuffer:
+; Evaluation took:
+; 28.21 seconds of real time
+; 28.18 seconds of user run time
+; 0.03 seconds of system run time
+; 84,420,916,523 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Typed / Singlebuffer / AREF:
+; Evaluation took:
+; 18.4 seconds of real time
+; 18.37 seconds of user run time
+; 0.01 seconds of system run time
+; 55,038,771,285 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Typed / Singlebuffer / SVREF:
+; Evaluation took:
+; 19.32 seconds of real time
+; 19.16 seconds of user run time
+; 0.04 seconds of system run time
+; 57,815,967,854 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+;;; Optimized benches for 200 step(s):
+;; Linear / Typed / Sized / Multibuffer:
+; Evaluation took:
+; 1.09 seconds of real time
+; 1.09 seconds of user run time
+; 0.01 seconds of system run time
+; 3,289,851,878 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Typed / Sized / Singlebuffer / AREF:
+; Evaluation took:
+; 0.54 seconds of real time
+; 0.53 seconds of user run time
+; 0.01 seconds of system run time
+; 1,613,471,783 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Linear / Typed / Sized / Singlebuffer / SVREF:
+; Evaluation took:
+; 0.55 seconds of real time
+; 0.55 seconds of user run time
+; 0.0 seconds of system run time
+; 1,659,704,692 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+;;; Optimized benches for 200 step(s):
+;; Randomized / Typed / Sized / Multibuffer:
+; Evaluation took:
+; 22.03 seconds of real time
+; 22.02 seconds of user run time
+; 0.01 seconds of system run time
+; 65,924,021,535 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Typed / Sized / Singlebuffer / AREF:
+; Evaluation took:
+; 19.19 seconds of real time
+; 19.16 seconds of user run time
+; 0.02 seconds of system run time
+; 57,427,619,017 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+;; Randomized / Typed / Sized / Singlebuffer / SVREF:
+; Evaluation took:
+; 19.22 seconds of real time
+; 19.17 seconds of user run time
+; 0.08 seconds of system run time
+; 57,512,349,082 CPU cycles
+; 0 page faults and
+; 0 bytes consed.
+;
+
+
+*** Summary
+
+              Linear (SB - MB)   Randomized (SB - MB)
+C             0.31 - 0.33        10.82 -  8.86
+Java          0.35 - 0.57        11.00 -  9.10
+Eiffel        0.40 - 1.06        11.04 -  9.00
+Common Lisp   0.55 - 1.10        18.38 - 22.03
+
+Note for Common Lisp: the best linear MB version is the sized one, and sizing makes a big
+difference there (1.09 vs. 2.82 typed). The best randomized MB version is also the sized one.
+
+Single buffer versions: randomized time ~ 30 * linear time
+Multi buffer versions: randomized time ~ 20 * linear time
+
+
+
+
+Local Variables:
+mode: outline
+End:
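
Reviewer note on the summary in 2.0/txt/bench.didier: the two closing rules of
thumb (randomized ~ 30 * linear for single buffer, ~ 20 * linear for multi
buffer) can be checked against the summary figures. The sketch below is only an
arithmetic check, not part of the patch; the names SUMMARY, slowdown and
mean_slowdown are hypothetical, and the timings are copied from the summary
table in (SB, MB) order.

```python
# Timings in seconds copied from the Summary table of bench.didier.
# Each pair is (single buffer, multi buffer). Names here are illustrative only.
SUMMARY = {
    "C":           {"linear": (0.31, 0.33), "randomized": (10.82, 8.86)},
    "Java":        {"linear": (0.35, 0.57), "randomized": (11.00, 9.10)},
    "Eiffel":      {"linear": (0.40, 1.06), "randomized": (11.04, 9.00)},
    "Common Lisp": {"linear": (0.55, 1.10), "randomized": (18.38, 22.03)},
}


def slowdown(language):
    """Return (SB ratio, MB ratio) of randomized time over linear time."""
    lin_sb, lin_mb = SUMMARY[language]["linear"]
    rnd_sb, rnd_mb = SUMMARY[language]["randomized"]
    return rnd_sb / lin_sb, rnd_mb / lin_mb


def mean_slowdown():
    """Average the per-language ratios; returns (mean SB ratio, mean MB ratio)."""
    ratios = [slowdown(language) for language in SUMMARY]
    count = len(ratios)
    mean_sb = sum(sb for sb, _ in ratios) / count
    mean_mb = sum(mb for _, mb in ratios) / count
    return mean_sb, mean_mb


if __name__ == "__main__":
    for language in SUMMARY:
        sb, mb = slowdown(language)
        print(f"{language:12s} SB x{sb:5.1f}   MB x{mb:5.1f}")
    mean_sb, mean_mb = mean_slowdown()
    print(f"{'mean':12s} SB x{mean_sb:5.1f}   MB x{mean_mb:5.1f}")
```

The averages land close to the stated factors, but the multi-buffer ratio varies
a lot per language (roughly 8.5 for Eiffel up to roughly 27 for C), so the
"~ 20" figure is best read as an average rather than a per-language constant.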
Index: 2.0/src/dd/cl/tsi.cl
===================================================================
--- 2.0/src/dd/cl/tsi.cl (revision 46)
+++ 2.0/src/dd/cl/tsi.cl (working copy)
@@ -17,10 +17,10 @@
"Prime number to randomize memory access.")
(eval-when (:compile-toplevel :load-toplevel :execute)
- (defvar *size* 1024
+ (defvar *size* 800
"Dimension for (square) images."))
-(defvar *nsteps* 100
+(defvar *nsteps* 200
"Number of times to repeat the algorithm.")
--
Didier Verna, didier(a)lrde.epita.fr,
http://www.lrde.epita.fr/~didier
EPITA / LRDE, 14-16 rue Voltaire Tel.+33 (1) 44 08 01 85
94276 Le Kremlin-Bicêtre, France Fax.+33 (1) 53 14 59 22 didier(a)xemacs.org