<HTML>
<HEAD>
<TITLE>HPL Frequently Asked Questions</TITLE>
</HEAD>

<BODY 
BGCOLOR     = "WHITE"
BACKGROUND  = "WHITE"
TEXT        = "#000000"
VLINK       = "#000099"
ALINK       = "#947153"
LINK        = "#0000ff">

<H2>HPL Frequently Asked Questions</H2>

<UL>
<LI><A HREF="faqs.html#pbsize">What problem size N should I run?</A>
<LI><A HREF="faqs.html#blsize">What block size NB should I use?</A>
<LI><A HREF="faqs.html#grid">What process grid ratio P x Q should I use?</A>
<LI><A HREF="faqs.html#1node">What about the one processor case?</A>
<LI><A HREF="faqs.html#options">Why so many options in HPL.dat?</A>
<LI><A HREF="faqs.html#outperf">Can HPL be outperformed?</A>
</UL>
<HR NOSHADE>

<H3><A NAME="pbsize">What problem size N should I run?</A></H3>

To find the best performance of your system, aim for the largest
problem size that fits in memory. The amount of memory used by HPL is
essentially the size of the coefficient matrix. For example, if you
have 4 nodes with 256 MB of memory each, this corresponds to 1 GB in
total, i.e., about 134 M double-precision (8-byte) elements. The
square root of that number is about 11585. Some memory must be left
for the OS and for other things, so a problem size of 10000 is likely
to fit. As a rule of thumb, a matrix that occupies 80% of the total
amount of memory is a good guess. If the problem size you pick is too
large, swapping will occur and the performance will drop. If multiple
processes are spawned on each node (say you have 2 processors per
node), what counts is the amount of memory available to each
process.<BR><BR>
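The arithmetic in this answer can be sketched in a few lines of Python.
This is only an illustration: the function name estimate_n and its
parameters are hypothetical and not part of HPL, and the sketch assumes
that the coefficient matrix is the only significant consumer of memory.

```python
import math

def estimate_n(nodes, bytes_per_node, fraction=0.8):
    """Estimate a problem size N whose matrix fills `fraction` of memory.

    Hypothetical helper, not part of HPL. Assumes 8-byte elements.
    """
    total_bytes = nodes * bytes_per_node
    elements = fraction * total_bytes / 8   # double precision: 8 bytes each
    return math.isqrt(int(elements))        # side of the largest square matrix

# 4 nodes with 256 MB (2**28 bytes) each
print(estimate_n(4, 2**28, fraction=1.0))   # all of memory: 11585
print(estimate_n(4, 2**28))                 # 80% rule of thumb: 10362
```

When several processes share a node, pass the memory available to the
processes on that node rather than the installed total.<BR><BR>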
<HR NOSHADE>

<H3><A NAME="blsize">What block size NB should I use?</A></H3>

HPL uses the block size NB for the data distribution as well as for
the computational granularity. From a data distribution point of
view, the smaller NB, the better the load balance. You definitely
want to stay away from very large values of NB. From a computation
point of view, too small a value of NB may limit the computational
performance by a large factor because almost no data reuse will occur
in the highest level of the memory hierarchy. The number of messages
will also increase. Efficient matrix-multiply routines are often
internally blocked. Small multiples of this blocking factor are
likely to be good block sizes for HPL. The bottom line is that "good"
block sizes are almost always in the [32 .. 256] interval. The best
values depend on the computation / communication performance ratio of
your system. To a much lesser extent, the problem size matters as
well. Say, for example, you empirically found that 44 was a good block
size with respect to performance. 88 or 132 are likely to give
slightly better results for large problem sizes because of a slightly
higher flop rate.<BR><BR>
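As a small illustration of the "small multiples" advice, the sketch
below lists the multiples of a base blocking factor that fall in the
[32 .. 256] interval. The function name candidate_block_sizes and its
parameters are hypothetical and not part of HPL; the candidates still
have to be timed on the target system.

```python
def candidate_block_sizes(base, low=32, high=256):
    """Return the multiples of `base` that lie in [low, high].

    Hypothetical helper, not part of HPL.
    """
    first = -(-low // base) * base          # smallest multiple of base that is at least low
    return list(range(first, high + 1, base))

print(candidate_block_sizes(44))            # [44, 88, 132, 176, 220]
```

Each value in the list can then be tried as NB in HPL.dat.<BR><BR>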
<HR NOSHADE>

<H3><A NAME="grid">What process grid ratio P x Q should I use?</A></H3>

This depends on the physical interconnection network you have.
Assuming a mesh or a switch, HPL "likes" a 1:k ratio with k in [1..3].
In other words, P and Q should be approximately equal, with Q
slightly larger than P. Examples: 2 x 2, 2 x 4, 2 x 5, 3 x 4, 4 x 4,
4 x 6, 5 x 6, 4 x 8 ... If you are running on a simple Ethernet
network, there is only one wire through which all the messages are
exchanged. On such a network, the performance and scalability of HPL
are strongly limited, and very flat process grids are likely to be the
best choices: 1 x 4, 1 x 8, 2 x 4 ...<BR><BR>
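To see which grids are available for a given process count, one can
enumerate its factor pairs. The sketch below is only an illustration:
the function name process_grids is hypothetical and not part of HPL,
and it simply lists every P x Q with P no larger than Q, from the most
square grid to the flattest.

```python
import math

def process_grids(nprocs):
    """List all (P, Q) with P * Q == nprocs and P no larger than Q.

    Hypothetical helper, not part of HPL. Grids are ordered from the
    most square to the flattest (1 x nprocs).
    """
    grids = []
    for p in range(1, math.isqrt(nprocs) + 1):
        if nprocs % p == 0:                 # p divides nprocs exactly
            grids.append((p, nprocs // p))
    return grids[::-1]                      # most square grid first

print(process_grids(8))    # [(2, 4), (1, 8)]
print(process_grids(24))   # [(4, 6), (3, 8), (2, 12), (1, 24)]
```

On a mesh or a switch, the entries near the front of the list match the
1:k guideline; on a simple Ethernet network, the flatter entries near
the end are the ones to try.<BR><BR>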
<HR NOSHADE>

<H3><A NAME="1node">What about the one processor case?</A></H3>

HPL has been designed to perform well for large problem sizes on
hundreds of nodes and more. The software works on one node, and for
large problem sizes one can usually achieve pretty good performance
on a single processor as well. For small problem sizes, however, the
overhead due to message passing, local indexing, and so on can be
significant.<BR><BR>
<HR NOSHADE>

<H3><A NAME="options">Why so many options in HPL.dat?</A></H3>

There are quite a few reasons. First off, these options are useful to
determine what matters and what does not on your system. Second, HPL
is often used in the context of early evaluation of new systems. In
such a case, not everything is working quite right, and it is
convenient to be able to vary these parameters without recompiling.
Finally, every system has its own peculiarities, and one will likely
want to determine the best set of parameters empirically. In any
case, one can always follow the advice provided in the
<A HREF = "tuning.html">tuning section</A> of this document and not
worry about the complexity of the input file.<BR><BR>
<HR NOSHADE>

<H3><A NAME="outperf">Can HPL be outperformed?</A></H3>

Certainly. There is always room for performance improvements.
Specific knowledge about a particular system is always a source of
performance gains. Even from a generic point of view, better
algorithms or more efficient formulations of the classic ones are
potential winners.<BR><BR>

<HR NOSHADE>
<CENTER>
<A HREF = "index.html">            [Home]</A>
<A HREF = "copyright.html">        [Copyright and Licensing Terms]</A>
<A HREF = "algorithm.html">        [Algorithm]</A>
<A HREF = "scalability.html">      [Scalability]</A>
<A HREF = "results.html">          [Performance Results]</A>
<A HREF = "documentation.html">    [Documentation]</A>
<A HREF = "software.html">         [Software]</A>
<A HREF = "faqs.html">             [FAQs]</A>
<A HREF = "tuning.html">           [Tuning]</A>
<A HREF = "errata.html">           [Errata-Bugs]</A>
<A HREF = "references.html">       [References]</A>
<A HREF = "links.html">            [Related Links]</A><BR>
</CENTER>
<HR NOSHADE>
</BODY>
</HTML>