.TH HPL_pdrpanllN 3 "September 10, 2008" "HPL 2.0" "HPL Library Functions"
.SH NAME
HPL_pdrpanllN \- Left-looking recursive panel factorization.
.SH SYNOPSIS
\fB\&#include "hpl.h"\fR

\fB\&void\fR
\fB\&HPL_pdrpanllN(\fR
\fB\&HPL_T_panel *\fR
\fI\&PANEL\fR,
\fB\&const int\fR
\fI\&M\fR,
\fB\&const int\fR
\fI\&N\fR,
\fB\&const int\fR
\fI\&ICOFF\fR,
\fB\&double *\fR
\fI\&WORK\fR
\fB\&);\fR
.SH DESCRIPTION
\fB\&HPL_pdrpanllN\fR
recursively factorizes a panel of columns using the
recursive Left-looking variant of the one-dimensional algorithm. The
lower triangular N0-by-N0 upper block of the panel is stored in
no-transpose form (i.e. just like the input matrix itself).

Bi-directional exchange is used to perform the swap and broadcast
operations at once for one column in the panel. This results in
fewer, slightly larger messages than usual. On P processes
and assuming bi-directional links, the running time of this function
can be approximated by (when N is equal to N0):

   N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
   N0^2 * ( M - N0/3 ) * gam2-3

where M is the local number of rows of the panel, lat and bdwth are
the latency and bandwidth of the network for double precision real
words, and gam2-3 is an estimate of the Level 2 and Level 3 BLAS
rate of execution. The recursive algorithm indeed makes it possible
to achieve nearly Level 3 BLAS performance in the panel factorization.
On many modern machines, however, this operation is latency
bound, meaning that its cost can be estimated by the latency
portion alone, N0 * log_2(P) * lat. Mono-directional links will double
this communication cost.
.SH ARGUMENTS
.TP 8
PANEL   (local input/output)    HPL_T_panel *
On entry,  PANEL  points to the data structure containing the
panel information.
.TP 8
M       (local input)           const int
On entry,  M specifies the local number of rows of sub(A).
.TP 8
N       (local input)           const int
On entry,  N specifies the local number of columns of sub(A).
.TP 8
ICOFF   (global input)          const int
On entry, ICOFF specifies the row and column offset of sub(A)
in A.
.TP 8
WORK    (local workspace)       double *
On entry, WORK is a work array of size at least 2*(4+2*N0).
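.PP
The minimum WORK length stated above can be computed as in the following
sketch. Both function names are hypothetical and are not part of HPL. The
sketch only shows how a buffer of the documented minimum size could be
obtained, and makes no claim about how HPL itself allocates its workspace.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper (not part of HPL): minimum WORK length in doubles,
 * following the documented bound 2*(4+2*N0). */
size_t panel_work_len(int n0)
{
    return 2u * (4u + 2u * (size_t)n0);
}

/* Hypothetical helper: allocates a buffer of the minimum documented size.
 * Returns NULL if n0 is negative or the allocation fails.
 * The caller releases the buffer with free(). */
double *panel_work_alloc(int n0)
{
    if (n0 < 0)
        return NULL;
    return malloc(panel_work_len(n0) * sizeof(double));
}
```

For N0 = 64, panel_work_len() returns 264, so the buffer holds 264 doubles.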
.SH SEE ALSO
.BR HPL_dlocmax \ (3),
.BR HPL_dlocswpN \ (3),
.BR HPL_dlocswpT \ (3),
.BR HPL_pdmxswp \ (3),
.BR HPL_pdpancrN \ (3),
.BR HPL_pdpancrT \ (3),
.BR HPL_pdpanllN \ (3),
.BR HPL_pdpanllT \ (3),
.BR HPL_pdpanrlN \ (3),
.BR HPL_pdpanrlT \ (3),
.BR HPL_pdrpancrN \ (3),
.BR HPL_pdrpancrT \ (3),
.BR HPL_pdrpanllT \ (3),
.BR HPL_pdrpanrlN \ (3),
.BR HPL_pdrpanrlT \ (3),
.BR HPL_pdfact \ (3).