<HTML>
<HEAD>
<TITLE>HPL_pdrpanllN HPL 2.0 Library Functions September 10, 2008</TITLE>
</HEAD>

<BODY BGCOLOR="WHITE" TEXT = "#000000" LINK = "#0000ff" VLINK = "#000099"
      ALINK = "#ffff00">

<H1>Name</H1>
<B>HPL_pdrpanllN</B> Left-looking recursive panel factorization.

<H1>Synopsis</H1>
<CODE>#include "hpl.h"</CODE><BR><BR>
<CODE>void</CODE>
<CODE>HPL_pdrpanllN(</CODE>
<CODE>HPL_T_panel *</CODE>
<CODE>PANEL</CODE>,
<CODE>const int</CODE>
<CODE>M</CODE>,
<CODE>const int</CODE>
<CODE>N</CODE>,
<CODE>const int</CODE>
<CODE>ICOFF</CODE>,
<CODE>double *</CODE>
<CODE>WORK</CODE>
<CODE>);</CODE>

<H1>Description</H1>
<B>HPL_pdrpanllN</B>
recursively factorizes a panel of columns using the recursive
left-looking variant of the one-dimensional algorithm.  The lower
triangular N0-by-N0 upper block of the panel is stored in
no-transpose form (i.e. just like the input matrix itself).

Bi-directional exchange is used to perform the swap::broadcast
operations at once for one column in the panel.  This results in a
lower number of slightly larger messages than usual.  On P processes
and assuming bi-directional links, the running time of this function
can be approximated by (when N is equal to N0):

   N0 * log_2( P ) * ( lat + ( 2*N0 + 4 ) / bdwth ) +
   N0^2 * ( M - N0/3 ) * gam2-3

where M is the local number of rows of the panel, lat and bdwth are
the latency and bandwidth of the network for double precision real
words, and gam2-3 is an estimate of the Level 2 and Level 3 BLAS
rate of execution.  The recursive algorithm makes it possible to come
close to Level 3 BLAS performance in the panel factorization.  On a
large number of modern machines, however, this operation is latency
bound, meaning that its cost can be estimated by the latency portion
alone, N0 * log_2(P) * lat.  Mono-directional links will double this
communication cost.

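The running-time estimate given in the Description can be evaluated
numerically.  The following is a minimal C sketch and not part of HPL:
the names model_params, panel_comm_time, panel_comp_time and
panel_latency_bound are hypothetical, gam2-3 is assumed to be expressed
as time per floating-point operation so that both terms come out in the
same time unit, and log_2(P) is supplied by the caller (for example
computed with log2() from math.h).

```c
#include <assert.h>

/* Hypothetical container for the model parameters; not an HPL type. */
typedef struct {
    double lat;    /* network latency per message (time units) */
    double bdwth;  /* bandwidth in double precision words per time unit */
    double gam23;  /* assumed: time per flop at the Level 2/3 BLAS rate */
} model_params;

/* Communication term: N0 * log_2(P) * (lat + (2*N0 + 4) / bdwth).
 * log2p is log_2(P), supplied by the caller.  The result is doubled
 * for mono-directional links, following the Description. */
static double panel_comm_time(int n0, double log2p,
                              const model_params *mp, int bidirectional)
{
    assert(n0 >= 0 && log2p >= 0.0 && mp->bdwth > 0.0);
    double t = n0 * log2p * (mp->lat + (2.0 * n0 + 4.0) / mp->bdwth);
    return bidirectional ? t : 2.0 * t;
}

/* Computation term: N0^2 * (M - N0/3) * gam2-3, with M the local
 * number of rows of the panel. */
static double panel_comp_time(int m, int n0, const model_params *mp)
{
    assert(m >= 0 && n0 >= 0);
    return (double)n0 * n0 * (m - n0 / 3.0) * mp->gam23;
}

/* Latency-only estimate for latency-bound machines:
 * N0 * log_2(P) * lat. */
static double panel_latency_bound(int n0, double log2p,
                                  const model_params *mp)
{
    assert(n0 >= 0 && log2p >= 0.0);
    return n0 * log2p * mp->lat;
}
```

The total estimate is the sum of panel_comm_time and panel_comp_time.
Comparing panel_latency_bound with that sum gives a rough indication of
how close a given configuration is to the latency-bound regime.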
<H1>Arguments</H1>
<PRE>
PANEL   (local input/output)          HPL_T_panel *
        On entry,  PANEL  points to the data structure containing the
        panel information.
</PRE>
<PRE>
M       (local input)                 const int
        On entry,  M specifies the local number of rows of sub(A).
</PRE>
<PRE>
N       (local input)                 const int
        On entry,  N specifies the local number of columns of sub(A).
</PRE>
<PRE>
ICOFF   (global input)                const int
        On entry, ICOFF specifies the row and column offset of sub(A)
        in A.
</PRE>
<PRE>
WORK    (local workspace)             double *
        On entry, WORK  is a work array of size at least 2*(4+2*N0).
</PRE>

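The minimum size stated for WORK, 2*(4+2*N0), is a count of double
precision elements.  The sketch below only illustrates that arithmetic.
The helper names panel_work_len and panel_work_alloc are hypothetical
and not part of HPL, and how a caller actually obtains the buffer is
outside what this page specifies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper: minimum number of doubles required in WORK
 * for a panel of width n0, i.e. 2*(4 + 2*N0). */
static size_t panel_work_len(int n0)
{
    assert(n0 >= 0);
    return 2u * (4u + 2u * (size_t)n0);
}

/* Hypothetical helper: allocate a buffer of at least the minimum
 * size.  Returns NULL on allocation failure; the caller frees it. */
static double *panel_work_alloc(int n0)
{
    return malloc(panel_work_len(n0) * sizeof(double));
}
```

For N0 = 64 this gives 264 doubles.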
<H1>See Also</H1>
<A HREF="HPL_dlocmax.html">HPL_dlocmax</A>,
<A HREF="HPL_dlocswpN.html">HPL_dlocswpN</A>,
<A HREF="HPL_dlocswpT.html">HPL_dlocswpT</A>,
<A HREF="HPL_pdmxswp.html">HPL_pdmxswp</A>,
<A HREF="HPL_pdpancrN.html">HPL_pdpancrN</A>,
<A HREF="HPL_pdpancrT.html">HPL_pdpancrT</A>,
<A HREF="HPL_pdpanllN.html">HPL_pdpanllN</A>,
<A HREF="HPL_pdpanllT.html">HPL_pdpanllT</A>,
<A HREF="HPL_pdpanrlN.html">HPL_pdpanrlN</A>,
<A HREF="HPL_pdpanrlT.html">HPL_pdpanrlT</A>,
<A HREF="HPL_pdrpancrN.html">HPL_pdrpancrN</A>,
<A HREF="HPL_pdrpancrT.html">HPL_pdrpancrT</A>,
<A HREF="HPL_pdrpanllT.html">HPL_pdrpanllT</A>,
<A HREF="HPL_pdrpanrlN.html">HPL_pdrpanrlN</A>,
<A HREF="HPL_pdrpanrlT.html">HPL_pdrpanrlT</A>,
<A HREF="HPL_pdfact.html">HPL_pdfact</A>.

</BODY>
</HTML>