<HTML>
<HEAD>
<TITLE>HPL Frequently Asked Questions</TITLE>
</HEAD>

<BODY
   BGCOLOR    = "WHITE"
   BACKGROUND = "WHITE"
   TEXT       = "#000000"
   VLINK      = "#000099"
   ALINK      = "#947153"
   LINK       = "#0000ff">

<H2>HPL Frequently Asked Questions</H2>

<UL>
<LI><A HREF="faqs.html#pbsize">What problem size N should I run?</A>
<LI><A HREF="faqs.html#blsize">What block size NB should I use?</A>
<LI><A HREF="faqs.html#grid">What process grid ratio P x Q should I use?</A>
<LI><A HREF="faqs.html#1node">What about the one processor case?</A>
<LI><A HREF="faqs.html#options">Why so many options in HPL.dat?</A>
<LI><A HREF="faqs.html#outperf">Can HPL be outperformed?</A>
</UL>
<HR NOSHADE>

<H3><A NAME="pbsize">What problem size N should I run?</A></H3>

To find the best performance of your system, aim for the largest
problem size that fits in memory. The amount of memory used by HPL is
essentially the size of the coefficient matrix. For example, if you
have 4 nodes with 256 MB of memory on each, this corresponds to 1 GB
total, i.e., about 134 million double precision (8 bytes) elements.
The square root of that number is 11585. One definitely needs to leave
some memory for the OS as well as for other things, so a problem size
of 10000 is likely to fit. As a rule of thumb, 80% of the total amount
of memory is a good guess. If the problem size you pick is too large,
swapping will occur and the performance will drop. If multiple
processes are spawned on each node (say you have 2 processors per
node), what counts is the amount of memory available to each
process.<BR><BR>
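The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the rule of thumb, not part of HPL: the function name suggest_problem_size and its parameters are hypothetical, and 1 GB is taken to mean 2^30 bytes.

```python
import math

def suggest_problem_size(nodes, bytes_per_node, fraction=0.8):
    """Estimate an HPL problem size N from the total memory.

    Hypothetical helper: 'fraction' is the share of the total memory
    given to the coefficient matrix (0.8 is the rule of thumb above).
    """
    total_bytes = nodes * bytes_per_node
    # The matrix holds N * N double precision elements of 8 bytes each.
    elements = int(fraction * total_bytes / 8)
    return math.isqrt(elements)

if __name__ == "__main__":
    mem_per_node = 256 * 2**20  # 256 MB per node
    # Upper bound if all the memory were available to the matrix.
    print(suggest_problem_size(4, mem_per_node, fraction=1.0))
    # Rule-of-thumb estimate using 80% of the memory.
    print(suggest_problem_size(4, mem_per_node))
```

For the 4-node example this gives 11585 as the upper bound and 10362 with the 80% rule, which is consistent with a problem size of 10000 being likely to fit.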
<HR NOSHADE>

<H3><A NAME="blsize">What block size NB should I use?</A></H3>

HPL uses the block size NB for the data distribution as well as for
the computational granularity. From a data distribution point of
view, the smaller NB, the better the load balance. You definitely
want to stay away from very large values of NB. From a computation
point of view, too small a value of NB may limit the computational
performance by a large factor because almost no data reuse will occur
in the highest level of the memory hierarchy. The number of messages
will also increase. Efficient matrix-multiply routines are often
internally blocked. Small multiples of this blocking factor are
likely to be good block sizes for HPL. The bottom line is that "good"
block sizes are almost always in the [32 .. 256] interval. The best
values depend on the computation / communication performance ratio of
your system. To a much lesser extent, the problem size matters as
well. Say, for example, you empirically found that 44 was a good block
size with respect to performance. Then 88 or 132 are likely to give
slightly better results for large problem sizes because of a slightly
higher flop rate.<BR><BR>
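As a rough illustration of the "small multiples" advice, the following Python sketch lists candidate values of NB to try. The function name candidate_block_sizes and the cap max_multiple are assumptions made for this example; only the [32 .. 256] interval and the 44 / 88 / 132 example come from the text above.

```python
def candidate_block_sizes(base, lo=32, hi=256, max_multiple=4):
    """List small multiples of a blocking factor that fall in [lo, hi].

    Hypothetical helper: 'base' is the internal blocking factor of the
    matrix-multiply routine (or a block size found to work well), and
    'max_multiple' is an arbitrary cap on how many multiples to try.
    """
    return [k * base
            for k in range(1, max_multiple + 1)
            if lo <= k * base <= hi]

if __name__ == "__main__":
    # Starting from 44, the candidates include 88 and 132.
    print(candidate_block_sizes(44))
```

Each candidate still has to be timed on the target system; the list only narrows down which values are worth benchmarking.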
<HR NOSHADE>

<H3><A NAME="grid">What process grid ratio P x Q should I use?</A></H3>

This depends on the physical interconnection network you have.
Assuming a mesh or a switch, HPL "likes" a 1:k ratio with k in [1..3].
In other words, P and Q should be approximately equal, with Q
slightly larger than P. Examples: 2 x 2, 2 x 4, 2 x 5, 3 x 4, 4 x 4,
4 x 6, 5 x 6, 4 x 8 ... If you are running on a simple Ethernet
network, there is only one wire through which all the messages are
exchanged. On such a network, the performance and scalability of HPL
are strongly limited, and very flat process grids are likely to be the
best choices: 1 x 4, 1 x 8, 2 x 4 ...<BR><BR>
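The grid guidance above can be sketched as a small Python enumeration. The function names process_grids and mesh_grids are hypothetical, and reading the 1:k rule as Q <= 3 * P is an interpretation made for this example rather than something HPL enforces.

```python
import math

def process_grids(nprocs):
    """Return all (P, Q) with P * Q == nprocs and P <= Q, squarest first."""
    pairs = [(p, nprocs // p)
             for p in range(1, math.isqrt(nprocs) + 1)
             if nprocs % p == 0]
    return sorted(pairs, key=lambda pq: pq[1] - pq[0])

def mesh_grids(nprocs, max_ratio=3):
    """Keep grids whose Q / P ratio is at most max_ratio (mesh or switch)."""
    return [(p, q) for p, q in process_grids(nprocs) if q <= max_ratio * p]

if __name__ == "__main__":
    # Mesh or switch: nearly square grids with Q slightly larger than P.
    print(mesh_grids(24))
    # Simple Ethernet: try the flattest grids first.
    print(list(reversed(process_grids(8))))
```

For 24 processes this keeps 4 x 6 and 3 x 8, and for 8 processes on Ethernet the flattest-first order is 1 x 8 followed by 2 x 4.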
<HR NOSHADE>

<H3><A NAME="1node">What about the one processor case?</A></H3>

HPL has been designed to perform well for large problem sizes on
hundreds of nodes and more. The software also works on one node, and
for large problem sizes one can usually achieve pretty good
performance on a single processor as well. For small problem sizes,
however, the overhead due to message-passing, local indexing and so on
can be significant.<BR><BR>
<HR NOSHADE>

<H3><A NAME="options">Why so many options in HPL.dat?</A></H3>

There are quite a few reasons. First, these options are useful to
determine what matters and what does not on your system. Second, HPL
is often used in the context of early evaluation of new systems. In
such a case, not everything is working quite right, and it is
convenient to be able to vary these parameters without recompiling.
Finally, every system has its own peculiarities, and one will likely
want to determine the best set of parameters empirically. In any
case, one can always follow the advice provided in the
<A HREF = "tuning.html">tuning section</A> of this document and not
worry about the complexity of the input file.<BR><BR>
<HR NOSHADE>

<H3><A NAME="outperf">Can HPL be outperformed?</A></H3>

Certainly. There is always room for performance improvements.
Specific knowledge about a particular system is always a source of
performance gains. Even from a generic point of view, better
algorithms or more efficient formulations of the classic ones are
potential winners.<BR><BR>

<HR NOSHADE>
<CENTER>
<A HREF = "index.html"> [Home]</A>
<A HREF = "copyright.html"> [Copyright and Licensing Terms]</A>
<A HREF = "algorithm.html"> [Algorithm]</A>
<A HREF = "scalability.html"> [Scalability]</A>
<A HREF = "results.html"> [Performance Results]</A>
<A HREF = "documentation.html"> [Documentation]</A>
<A HREF = "software.html"> [Software]</A>
<A HREF = "faqs.html"> [FAQs]</A>
<A HREF = "tuning.html"> [Tuning]</A>
<A HREF = "errata.html"> [Errata-Bugs]</A>
<A HREF = "references.html"> [References]</A>
<A HREF = "links.html"> [Related Links]</A><BR>
</CENTER>
<HR NOSHADE>
</BODY>
</HTML>