<HTML>
<HEAD>
<TITLE>HPL_pdfact HPL 2.0 Library Functions September 10, 2008</TITLE>
</HEAD>

<BODY BGCOLOR="WHITE" TEXT = "#000000" LINK = "#0000ff" VLINK = "#000099"
      ALINK = "#ffff00">

<H1>Name</H1>
<B>HPL_pdfact</B> recursive panel factorization.

<H1>Synopsis</H1>
<CODE>#include "hpl.h"</CODE><BR><BR>
<CODE>void</CODE>
<CODE>HPL_pdfact(</CODE>
<CODE>HPL_T_panel *</CODE>
<CODE>PANEL</CODE>
<CODE>);</CODE>

<H1>Description</H1>
<B>HPL_pdfact</B>
recursively factorizes a 1-dimensional panel of columns.
The RPFACT function pointer specifies the recursive algorithm to be
used: Crout, left-looking, or right-looking. NBMIN sets the recursion
stopping criterion in terms of the number of columns in the panel,
and NDIV specifies the number of subpanels each panel is divided
into; a value of 2 is the usual choice. Finally, PFACT is a function
pointer specifying the non-recursive algorithm applied to panels of
at most NBMIN columns; it too can be Crout, left-looking, or
right-looking. Empirical tests seem to indicate that values of 4 or 8
for NBMIN give the best results.
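
The roles of NBMIN and NDIV can be illustrated with a small,
self-contained sketch. It is not HPL's code: the names split_panel,
leaf_fn, leaf_stats and record_leaf are hypothetical, the leaf callback
merely stands in for the PFACT routine, and the rule used to pick the
subpanel width (a multiple of NBMIN, at most NDIV subpanels per level)
is an assumption made for the example.

```c
#include <assert.h>

/* Hypothetical leaf callback type: stands in for the PFACT routine. */
typedef void (*leaf_fn)(int first_col, int ncols, void *ctx);

/*
 * Recursively split the column range [first_col, first_col + ncols)
 * until a range holds at most nbmin columns, then hand it to `leaf`.
 * Assumption: each level is cut into at most ndiv chunks whose width
 * is a multiple of nbmin (the last chunk takes the remainder).
 */
static void split_panel(int first_col, int ncols, int nbmin, int ndiv,
                        leaf_fn leaf, void *ctx)
{
    assert(nbmin >= 1 && ndiv >= 2 && ncols >= 0);

    if (ncols <= nbmin) {          /* recursion stops: PFACT-sized panel */
        if (ncols > 0)
            leaf(first_col, ncols, ctx);
        return;
    }

    /* Chunk width: ceil(ceil(ncols / nbmin) / ndiv) * nbmin. */
    int blocks = (ncols + nbmin - 1) / nbmin;
    int width  = ((blocks + ndiv - 1) / ndiv) * nbmin;

    for (int done = 0; done < ncols; done += width) {
        int w = (ncols - done < width) ? ncols - done : width;
        split_panel(first_col + done, w, nbmin, ndiv, leaf, ctx);
    }
}

/* Example leaf: count the leaves and record their widths. */
struct leaf_stats { int count; int max_width; int total_cols; };

static void record_leaf(int first_col, int ncols, void *ctx)
{
    struct leaf_stats *s = ctx;
    (void)first_col;               /* unused in this example */
    s->count++;
    s->total_cols += ncols;
    if (ncols > s->max_width)
        s->max_width = ncols;
}
```

Calling split_panel(0, 64, 4, 2, record_leaf, &stats) on a zeroed
leaf_stats visits the 64 columns as leaves of at most 4 columns each,
which is the granularity at which the non-recursive routine would run.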

Bi-directional exchange is used to perform the combined swap and
broadcast operations at once for one column in the panel. This results
in a lower number of slightly larger messages than usual. On P
processes and assuming bi-directional links, the running time of this
function can be approximated by (when N is equal to N0):

<PRE>
   N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
   N0^2 * ( M - N0/3 ) * gam2-3
</PRE>

where M is the local number of rows of the panel, lat and bdwth are
the latency and bandwidth of the network for double precision real
words, and gam2-3 is an estimate of the time per floating point
operation at the Level 2 and Level 3 BLAS rate of execution. The
recursive algorithm does indeed come close to Level 3 BLAS performance
in the panel factorization. On a large number of modern machines,
however, this operation is latency bound, meaning that its cost can be
estimated by the latency portion alone, N0 * log_2(P) * lat.
Mono-directional links double this communication cost.
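
The log_2( P ) factor can be made concrete with a toy simulation of a
pairwise exchange. This is a sketch only, under the assumptions that P
is a power of two and that the exchange follows a recursive-doubling
pattern; the names pivot, exchange_max and MAX_PROCS are hypothetical,
and only the pivot search is modelled, not the row data that the real
routine moves along with it.

```c
#include <assert.h>

#define MAX_PROCS 64

/* One candidate pivot: its value and the simulated process that owns it. */
struct pivot { double value; int owner; };

static double abs_val(double x) { return x < 0.0 ? -x : x; }

/*
 * Simulate a pairwise (bi-directional) exchange among p processes,
 * p a power of two. In each round, process r trades its current best
 * candidate with partner r XOR dist and both keep the larger one, so
 * every process holds the global pivot after log_2(p) rounds.
 * Returns the number of rounds performed.
 */
static int exchange_max(struct pivot cand[], int p)
{
    assert(p >= 1 && p <= MAX_PROCS && (p & (p - 1)) == 0);

    int rounds = 0;
    for (int dist = 1; dist < p; dist <<= 1, rounds++) {
        struct pivot next[MAX_PROCS];
        for (int r = 0; r < p; r++) {
            struct pivot mine = cand[r], theirs = cand[r ^ dist];
            /* Keep the entry of largest magnitude; ties go to the lower owner. */
            int take_theirs =
                abs_val(theirs.value) > abs_val(mine.value) ||
                (abs_val(theirs.value) == abs_val(mine.value) &&
                 theirs.owner < mine.owner);
            next[r] = take_theirs ? theirs : mine;
        }
        for (int r = 0; r < p; r++)
            cand[r] = next[r];
    }
    return rounds;
}
```

With 8 simulated processes the loop runs 3 rounds, and afterwards every
entry of cand holds the same pivot, so no separate broadcast step is
needed in this model.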
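
The running-time estimate given in the Description can be evaluated
with a few helper functions. The sketch below is illustrative and not
part of HPL: all function names are hypothetical, log_2( P ) is rounded
up to a whole number of exchange steps (which matches the formula when
P is a power of two), and the doubling for mono-directional links is
applied to the whole communication term, which is one reading of the
text.

```c
#include <assert.h>

/* Smallest k with 2^k >= p; equals log_2(p) when p is a power of two. */
static int ceil_log2(int p)
{
    assert(p >= 1);
    int k = 0;
    for (long long reach = 1; reach < p; reach *= 2)
        k++;
    return k;
}

/* Communication term: N0 * log_2(P) * ( lat + ( 2*N0 + 4 ) / bdwth ).
 * Assumption: the whole term doubles on mono-directional links. */
static double pdfact_comm_time(double n0, int p, double lat, double bdwth,
                               int bidirectional)
{
    double t = n0 * ceil_log2(p) * (lat + (2.0 * n0 + 4.0) / bdwth);
    return bidirectional ? t : 2.0 * t;
}

/* Computation term: N0^2 * ( M - N0/3 ) * gam2-3,
 * with gam23 expressed as time per floating point operation. */
static double pdfact_comp_time(double n0, double m, double gam23)
{
    return n0 * n0 * (m - n0 / 3.0) * gam23;
}

/* Full estimate: communication plus computation. */
static double pdfact_time(double n0, double m, int p, double lat,
                          double bdwth, double gam23, int bidirectional)
{
    return pdfact_comm_time(n0, p, lat, bdwth, bidirectional) +
           pdfact_comp_time(n0, m, gam23);
}

/* Latency-bound estimate: N0 * log_2(P) * lat. */
static double pdfact_latency_bound(double n0, int p, double lat)
{
    return n0 * ceil_log2(p) * lat;
}
```

Comparing pdfact_latency_bound with pdfact_time for measured values of
lat, bdwth and gam2-3 gives a quick indication of whether a given
machine is in the latency-bound regime.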
<H1>Arguments</H1>
<PRE>
PANEL   (local input/output)          HPL_T_panel *
        On entry,  PANEL  points to the data structure containing the
        panel information.
</PRE>

<H1>See Also</H1>
<A HREF="HPL_dlocmax.html">HPL_dlocmax</A>,
<A HREF="HPL_dlocswpN.html">HPL_dlocswpN</A>,
<A HREF="HPL_dlocswpT.html">HPL_dlocswpT</A>,
<A HREF="HPL_pdmxswp.html">HPL_pdmxswp</A>,
<A HREF="HPL_pdpancrN.html">HPL_pdpancrN</A>,
<A HREF="HPL_pdpancrT.html">HPL_pdpancrT</A>,
<A HREF="HPL_pdpanllN.html">HPL_pdpanllN</A>,
<A HREF="HPL_pdpanllT.html">HPL_pdpanllT</A>,
<A HREF="HPL_pdpanrlN.html">HPL_pdpanrlN</A>,
<A HREF="HPL_pdpanrlT.html">HPL_pdpanrlT</A>,
<A HREF="HPL_pdrpancrN.html">HPL_pdrpancrN</A>,
<A HREF="HPL_pdrpancrT.html">HPL_pdrpancrT</A>,
<A HREF="HPL_pdrpanllN.html">HPL_pdrpanllN</A>,
<A HREF="HPL_pdrpanllT.html">HPL_pdrpanllT</A>,
<A HREF="HPL_pdrpanrlN.html">HPL_pdrpanrlN</A>,
<A HREF="HPL_pdrpanrlT.html">HPL_pdrpanrlT</A>.

</BODY>
</HTML>