<HTML>
<HEAD>
<TITLE>HPL Results</TITLE>
</HEAD>

<BODY
BGCOLOR = "WHITE"
BACKGROUND = "WHITE"
TEXT = "#000000"
VLINK = "#000099"
ALINK = "#947153"
LINK = "#0000ff">

<TABLE HSPACE=0 VSPACE=0 WIDTH=100% BORDER=0 CELLSPACING=1 CELLPADDING=0>
<TR><TD ALIGN=LEFT VALIGN=LEFT>
<IMG SRC = "aprunner.gif" BORDER=0 HEIGHT=160 WIDTH=220>
</TD>
<TD ALIGN=LEFT VALIGN=LEFT>
<H2>HPL Performance Results</H2>

<STRONG>
The performance achieved by this software package on a few machine
configurations is shown below. These results are provided for
illustrative purposes only. By the time you read this, those systems
will have changed and may no longer exist, and the state those machines
were in when the measurements were taken certainly cannot be reproduced
exactly. To obtain accurate figures
on your system, it is absolutely necessary to
<A HREF = "software.html">download the software</A> and run it there.
</STRONG>
</TD>
</TR></TABLE>
<HR NOSHADE>

<TABLE HSPACE=0 VSPACE=0 WIDTH=100% BORDER=0 CELLSPACING=1 CELLPADDING=0><TR>
<TD><UL>
<LI><A HREF = "results.html#AMD_K7000">Athlon 4-nodes cluster</A>
</UL></TD><TD><UL>
<LI><A HREF = "results.html#I550p3000">Intel PIII 8-duals cluster</A>
</UL></TD><TD><UL>
<LI><A HREF = "results.html#compaq000">Compaq 64 nodes AlphaServer SC</A>
</UL></TD>
</TR></TABLE>
<HR NOSHADE>

<H3><A NAME="AMD_K7000">4 AMD Athlon K7 500 MHz (256 MB) - (2x) 100 Mb/s
Switched - 2 NICs per node (channel bonding)</A></H3>

<CENTER>
<TABLE BORDER>
<TR><TD>OS</TD><TD>Linux RedHat (Kernel )</TD></TR>
<TR><TD>C compiler</TD><TD>gcc (egcs-2.91.66 release)</TD></TR>
<TR><TD>C flags</TD><TD>-fomit-frame-pointer -funroll-loops</TD></TR>
<TR><TD>MPI</TD><TD>MPIch</TD></TR>
<TR><TD>BLAS</TD><TD>ATLAS (Version beta)</TD></TR>
<TR><TD>Comments</TD><TD>09 / 00</TD></TR>
</TABLE><P>

<TABLE>
<TR>
<TH ALIGN=CENTER> GRID</TH>
<TH ALIGN=CENTER> 2000</TH>
<TH ALIGN=CENTER> 5000</TH>
<TH ALIGN=CENTER> 8000</TH>
<TH ALIGN=CENTER>10000</TH>
</TR>
<TR>
<TH ALIGN=CENTER>1 x 4</TH>
<TD ALIGN=CENTER> 1.28</TD>
<TD ALIGN=CENTER> 1.73</TD>
<TD ALIGN=CENTER> 1.89</TD>
<TD ALIGN=CENTER> 1.95</TD>
</TR>
<TR>
<TH ALIGN=CENTER>2 x 2</TH>
<TD ALIGN=CENTER> 1.17</TD>
<TD ALIGN=CENTER> 1.68</TD>
<TD ALIGN=CENTER> 1.88</TD>
<TD ALIGN=CENTER> 1.93</TD>
</TR>
<TR>
<TH ALIGN=CENTER>4 x 1</TH>
<TD ALIGN=CENTER> 0.81</TD>
<TD ALIGN=CENTER> 1.43</TD>
<TD ALIGN=CENTER> 1.70</TD>
<TD ALIGN=CENTER> 1.80</TD>
</TR>
<CAPTION ALIGN=BOTTOM>Performance (Gflops) w.r.t. problem size on 4 nodes.</CAPTION>
</TABLE><P>
</CENTER>

<HR NOSHADE>
<H3><A NAME="I550p3000">8 Duals Intel PIII 550 MHz (512 MB) - Myrinet</A></H3>

<CENTER>
<TABLE BORDER>
<TR><TD>OS</TD><TD>Linux RedHat (Kernel )</TD></TR>
<TR><TD>C compiler</TD><TD>gcc (egcs-2.91.66 release)</TD></TR>
<TR><TD>C flags</TD><TD>-fomit-frame-pointer -funroll-loops</TD></TR>
<TR><TD>MPI</TD><TD>MPI (Version )</TD></TR>
<TR><TD>BLAS</TD><TD>ATLAS (Version beta)</TD></TR>
<TR><TD>Comments</TD>
<TD><A HREF="http://icl.cs.utk.edu">UTK / ICL</A> - Torc cluster - 09 / 00</TD>
</TR>
</TABLE><P>

<TABLE BORDER>
<TR>
<TH ALIGN=CENTER> GRID</TH>
<TH ALIGN=CENTER> 2000</TH>
<TH ALIGN=CENTER> 5000</TH>
<TH ALIGN=CENTER> 8000</TH>
<TH ALIGN=CENTER>10000</TH>
<TH ALIGN=CENTER>15000</TH>
<TH ALIGN=CENTER>20000</TH>
</TR>
<TR>
<TH ALIGN=CENTER>2 x 4</TH>
<TD ALIGN=CENTER> 1.76</TD>
<TD ALIGN=CENTER> 2.32</TD>
<TD ALIGN=CENTER> 2.51</TD>
<TD ALIGN=CENTER> 2.58</TD>
<TD ALIGN=CENTER> 2.72</TD>
<TD ALIGN=CENTER> 2.73</TD>
</TR>
<TR>
<TH ALIGN=CENTER>4 x 4</TH>
<TD ALIGN=CENTER> 2.27</TD>
<TD ALIGN=CENTER> 3.94</TD>
<TD ALIGN=CENTER> 4.46</TD>
<TD ALIGN=CENTER> 4.68</TD>
<TD ALIGN=CENTER> 5.00</TD>
<TD ALIGN=CENTER> 5.16</TD>
</TR>
<CAPTION ALIGN=BOTTOM>Performance (Gflops) w.r.t. problem size on 8- and 16-processor grids.</CAPTION>
</TABLE><P>
</CENTER>

<HR NOSHADE>
<H3><A NAME="compaq000">Compaq 64 nodes (4 ev67 667 MHz processors per node)
AlphaServer SC</A></H3>

<CENTER>
<TABLE BORDER>
<TR><TD>OS</TD><TD>Tru64 5</TD></TR>
<TR><TD>C compiler</TD><TD>cc 6.1</TD></TR>
<TR><TD>C flags</TD><TD>-arch -tune -std</TD></TR>
<TR><TD>MPI</TD><TD>-lmpi</TD></TR>
<TR><TD>BLAS</TD><TD>CXML</TD></TR>
<TR><TD>Comments</TD>
<TD><A HREF = "http://www.ccs.ornl.gov/ccs">ORNL / CCS</A>
- falcon - 09 / 00</TD></TR>
</TABLE><P>
</CENTER>

In the table below, each row corresponds to a given number of cpus (or
processors) and nodes. The first row, for example, is denoted by 1 / 1,
i.e., 1 cpu / 1 node. Rmax is given in Gflops, and the value of Nmax
corresponds to 351 MB per cpu for all machine configurations.<BR><BR>

<CENTER>
<TABLE BORDER>
<TR>
<TH ALIGN=CENTER>CPUS / NODES</TH>
<TH ALIGN=CENTER>GRID</TH>
<TH ALIGN=CENTER>N1/2</TH>
<TH ALIGN=CENTER>Nmax</TH>
<TH ALIGN=CENTER>Rmax (Gflops)</TH>
<TH ALIGN=CENTER>Parallel Efficiency</TH>
</TR>
<TR>
<TH ALIGN=CENTER>1 / 1</TH>
<TH ALIGN=CENTER>1 x 1</TH>
<TD ALIGN=CENTER>150</TD>
<TD ALIGN=CENTER>6625</TD>
<TD ALIGN=CENTER>1.136</TD>
<TD ALIGN=CENTER>1.000</TD>
</TR>
<TR>
<TH ALIGN=CENTER>4 / 1</TH>
<TH ALIGN=CENTER>2 x 2</TH>
<TD ALIGN=CENTER>800</TD>
<TD ALIGN=CENTER>13250</TD>
<TD ALIGN=CENTER>4.360</TD>
<TD ALIGN=CENTER>0.960</TD>
</TR>
<TR>
<TH ALIGN=CENTER>16 / 4</TH>
<TH ALIGN=CENTER>4 x 4</TH>
<TD ALIGN=CENTER>2300</TD>
<TD ALIGN=CENTER>26500</TD>
<TD ALIGN=CENTER>17.00</TD>
<TD ALIGN=CENTER>0.935</TD>
</TR>
<TR>
<TH ALIGN=CENTER>64 / 16</TH>
<TH ALIGN=CENTER>8 x 8</TH>
<TD ALIGN=CENTER>5700</TD>
<TD ALIGN=CENTER>53000</TD>
<TD ALIGN=CENTER>67.50</TD>
<TD ALIGN=CENTER>0.928</TD>
</TR>
<TR>
<TH ALIGN=CENTER>256 / 64</TH>
<TH ALIGN=CENTER>16 x 16</TH>
<TD ALIGN=CENTER>14000</TD>
<TD ALIGN=CENTER>106000</TD>
<TD ALIGN=CENTER>263.6</TD>
<TD ALIGN=CENTER>0.906</TD>
</TR>
</TABLE><P>
</CENTER>
For the Rmax values shown in the table, the parallel efficiency per cpu has been
computed using the performance achieved by HPL on 1 cpu. This is fair:
since the CXML matrix multiply was measured at 1.24 Gflops
for matrix operands on one cpu, it would have been difficult for a
Linpack implementation to achieve much more than
1.136 Gflops on the same cpu. For a constant load (as in the table, 351 MB
per cpu for Nmax), HPL scales almost linearly, as it should.

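<BR><BR>
The arithmetic behind the efficiency column and the 351 MB figure can be
checked with a short script. The Python sketch below is illustrative only
and is not part of HPL: the function names are hypothetical, the rows are
copied from the table above, and it assumes 8-byte double precision matrix
entries and 1 MB = 10^6 bytes. It recomputes the parallel efficiency as
Rmax divided by (number of cpus times the 1-cpu Rmax) and the memory per
cpu needed to hold an Nmax x Nmax matrix.
<PRE>
```python
# Rows of the AlphaServer SC table: (cpus, Nmax, Rmax in Gflops).
ROWS = [
    (1, 6625, 1.136),
    (4, 13250, 4.360),
    (16, 26500, 17.00),
    (64, 53000, 67.50),
    (256, 106000, 263.6),
]

BYTES_PER_DOUBLE = 8  # assumption: matrix entries are 64-bit floating point


def parallel_efficiency(cpus, rmax, rmax_one_cpu):
    """Measured Rmax divided by the ideal cpus * (single-cpu Rmax)."""
    return rmax / (cpus * rmax_one_cpu)


def megabytes_per_cpu(n, cpus):
    """Storage for an n-by-n matrix of doubles, split evenly over the cpus.

    Assumption: 1 MB is taken as 10**6 bytes.
    """
    return n * n * BYTES_PER_DOUBLE / cpus / 1e6


def summarize(rows):
    """Return (cpus, efficiency to 3 decimals, whole MB per cpu) per row."""
    rmax_one_cpu = rows[0][2]  # the 1-cpu row is the baseline
    return [
        (
            cpus,
            round(parallel_efficiency(cpus, rmax, rmax_one_cpu), 3),
            round(megabytes_per_cpu(n, cpus)),
        )
        for cpus, n, rmax in rows
    ]


if __name__ == "__main__":
    for cpus, eff, mb in summarize(ROWS):
        print(f"{cpus:4d} cpus  efficiency {eff:.3f}  about {mb} MB per cpu")
```
</PRE>
Under these assumptions every row works out to about 351 MB per cpu, and
the recomputed efficiencies agree with the table to three decimals.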
<BR><BR>
The authors acknowledge the use of the Oak Ridge National Laboratory
Compaq computer, funded by the Department of Energy's
Office of Science and Energy Efficiency programs.<BR><BR>

<HR NOSHADE>
<CENTER>
<A HREF = "index.html"> [Home]</A>
<A HREF = "copyright.html"> [Copyright and Licensing Terms]</A>
<A HREF = "algorithm.html"> [Algorithm]</A>
<A HREF = "scalability.html"> [Scalability]</A>
<A HREF = "results.html"> [Performance Results]</A>
<A HREF = "documentation.html"> [Documentation]</A>
<A HREF = "software.html"> [Software]</A>
<A HREF = "faqs.html"> [FAQs]</A>
<A HREF = "tuning.html"> [Tuning]</A>
<A HREF = "errata.html"> [Errata-Bugs]</A>
<A HREF = "references.html"> [References]</A>
<A HREF = "links.html"> [Related Links]</A><BR>
</CENTER>
<HR NOSHADE>
</BODY>
</HTML>