root / www / HPL_plindx0.html
Historique | Voir | Annoter | Télécharger (7,34 ko)
1 |
<HTML>
|
---|---|
2 |
<HEAD>
|
3 |
<TITLE>HPL_plindx0 HPL 2.0 Library Functions September 10, 2008</TITLE> |
4 |
</HEAD>
|
5 |
|
6 |
<BODY BGCOLOR="WHITE" TEXT = "#000000" LINK = "#0000ff" VLINK = "#000099" |
7 |
ALINK = "#ffff00"> |
8 |
|
9 |
<H1>Name</H1> |
10 |
<B>HPL_plindx0</B> Compute local swapping index arrays. |
11 |
|
12 |
<H1>Synopsis</H1> |
13 |
<CODE>#include "hpl.h"</CODE><BR><BR> |
14 |
<CODE>void</CODE> |
15 |
<CODE>HPL_plindx0(</CODE> |
16 |
<CODE>HPL_T_panel *</CODE> |
17 |
<CODE>PANEL</CODE>, |
18 |
<CODE>const int</CODE> |
19 |
<CODE>K</CODE>, |
20 |
<CODE>int *</CODE> |
21 |
<CODE>IPID</CODE>, |
22 |
<CODE>int *</CODE> |
23 |
<CODE>LINDXA</CODE>, |
24 |
<CODE>int *</CODE> |
25 |
<CODE>LINDXAU</CODE>, |
26 |
<CODE>int *</CODE> |
27 |
<CODE>LLEN</CODE> |
28 |
<CODE>);</CODE> |
29 |
|
30 |
<H1>Description</H1> |
31 |
<B>HPL_plindx0</B> |
32 |
computes two local arrays LINDXA and LINDXAU containing |
33 |
the local source and final destination position resulting from the |
34 |
application of row interchanges. |
35 |
|
36 |
On entry, the array IPID of length K is such that the row of global |
37 |
index IPID(i) should be mapped onto row of global index IPID(i+1). |
38 |
Let IA be the global index of the first row to be swapped. For k in |
39 |
[0..K/2), the row of global index IPID(2*k) should be mapped onto the |
40 |
row of global index IPID(2*k+1). The question then, is to determine |
41 |
which rows should ultimately be part of U. |
42 |
|
43 |
First, some rows of the process ICURROW may be swapped locally. One |
44 |
of this row belongs to U, the other one belongs to my local piece of |
45 |
A. The other rows of the current block are swapped with remote rows |
46 |
and are thus not part of U. These rows however should be sent along, |
47 |
and grabbed by the other processes as we progress in the exchange |
48 |
phase. |
49 |
|
50 |
So, assume that I am ICURROW and consider a row of index IPID(2*i) |
51 |
that I own. If I own IPID(2*i+1) as well and IPID(2*i+1) - IA is less |
52 |
than N, this row is locally swapped and should be copied into U at |
53 |
the position IPID(2*i+1) - IA. No row will be exchanged for this one. |
54 |
If IPID(2*i+1)-IA is greater than N, then the row IPID(2*i) should be |
55 |
locally copied into my local piece of A at the position corresponding |
56 |
to the row of global index IPID(2*i+1). |
57 |
|
58 |
If the process ICURROW does not own IPID(2*i+1), then row IPID(2*i) |
59 |
is to be swapped away and strictly speaking does not belong to U, but |
60 |
to A remotely. Since this process will however send this array U, |
61 |
this row is copied into U, exactly where the row IPID(2*i+1) should |
62 |
go. For this, we search IPID for k1, such that IPID(2*k1) is equal to |
63 |
IPID(2*i+1); and row IPID(2*i) is to be copied in U at the position |
64 |
IPID(2*k1+1)-IA. |
65 |
|
66 |
It is thus important to put the rows that go into U, i.e., such that |
67 |
IPID(2*i+1) - IA is less than N at the begining of the array IPID. By |
68 |
doing so, U is formed, and the local copy is performed in just one |
69 |
sweep. |
70 |
|
71 |
Two lists LINDXA and LINDXAU are built. LINDXA contains the local |
72 |
index of the rows I have that should be copied. LINDXAU contains the |
73 |
local destination information: if LINDXAU(k) >= 0, row LINDXA(k) of A
|
74 |
is to be copied in U at position LINDXAU(k). Otherwise, row LINDXA(k) |
75 |
of A should be locally copied into A(-LINDXAU(k),:). In the process |
76 |
ICURROW, the initial packing algorithm proceeds as follows. |
77 |
|
78 |
for all entries in IPID, |
79 |
if IPID(2*i) is in ICURROW, |
80 |
if IPID(2*i+1) is in ICURROW, |
81 |
if( IPID(2*i+1) - IA < N )
|
82 |
save corresponding local position |
83 |
of this row (LINDXA); |
84 |
save local position (LINDXAU) in U |
85 |
where this row goes; |
86 |
[copy row IPID(2*i) in U at position |
87 |
IPID(2*i+1)-IA; ]; |
88 |
else |
89 |
save corresponding local position of |
90 |
this row (LINDXA); |
91 |
save local position (-LINDXAU) in A |
92 |
where this row goes; |
93 |
[copy row IPID(2*i) in my piece of A |
94 |
at IPID(2*i+1);] |
95 |
end if |
96 |
else |
97 |
find k1 such that IPID(2*k1) = IPID(2*i+1); |
98 |
copy row IPID(2*i) in U at position |
99 |
IPID(2*k1+1)-IA; |
100 |
save corresponding local position of this |
101 |
row (LINDXA); |
102 |
save local position (LINDXAU) in U where |
103 |
this row goes; |
104 |
end if |
105 |
end if |
106 |
end for |
107 |
|
108 |
Second, if I am not the current row process ICURROW, all source rows |
109 |
in IPID that I own are part of U. Indeed, they are swapped with one |
110 |
row of the current block of rows, and the main factorization |
111 |
algorithm proceeds one row after each other. The processes different |
112 |
from ICURROW, should exchange and accumulate those rows until they |
113 |
receive some data previously owned by the process ICURROW. |
114 |
|
115 |
In processes different from ICURROW, the initial packing algorithm |
116 |
proceeds as follows. Consider a row of global index IPID(2*i) that I |
117 |
own. When I will be receiving data previously owned by ICURROW, i.e., |
118 |
U, row IPID(2*i) should replace the row in U at pos. IPID(2*i+1)-IA, |
119 |
and this particular row of U should be first copied into my piece of |
120 |
A, at A(il,:), where il is the local row index corresponding to |
121 |
IPID(2*i). Now,initially, this row will be packed into workspace, say |
122 |
as the kth row of that work array. The following algorithm sets |
123 |
LINDXAU[k] to IPID(2*i+1)-IA, that is the position in U where the row |
124 |
should be copied. LINDXA(k) stores the local index in A where this |
125 |
row of U should be copied, i.e il. |
126 |
|
127 |
for all entries in IPID, |
128 |
if IPID(2*i) is not in ICURROW, |
129 |
copy row IPID(2*i) in work array; |
130 |
save corresponding local position |
131 |
of this row (LINDXA); |
132 |
save position (LINDXAU) in U where |
133 |
this row should be copied; |
134 |
end if |
135 |
end for |
136 |
|
137 |
Since we are at it, we also globally figure out how many rows every |
138 |
process has. That is necessary, because it would rather be cumbersome |
139 |
to figure it on the fly during the bi-directional exchange phase. |
140 |
This information is kept in the array LLEN of size NPROW. Also note |
141 |
that the arrays LINDXA and LINDXAU are of max length equal to 2*N. |
142 |
|
143 |
<H1>Arguments</H1> |
144 |
<PRE>
|
145 |
PANEL (local input/output) HPL_T_panel * |
146 |
On entry, PANEL points to the data structure containing the |
147 |
panel information. |
148 |
</PRE>
|
149 |
<PRE>
|
150 |
K (global input) const int |
151 |
On entry, K specifies the number of entries in IPID. K is at |
152 |
least 2*N, and at most 4*N. |
153 |
</PRE>
|
154 |
<PRE>
|
155 |
IPID (global input) int * |
156 |
On entry, IPID is an array of length K. The first K entries |
157 |
of that array contain the src and final destination resulting |
158 |
from the application of the interchanges. |
159 |
</PRE>
|
160 |
<PRE>
|
161 |
LINDXA (local output) int * |
162 |
On entry, LINDXA is an array of dimension 2*N. On exit, this |
163 |
array contains the local indexes of the rows of A I have that |
164 |
should be copied into U. |
165 |
</PRE>
|
166 |
<PRE>
|
167 |
LINDXAU (local output) int * |
168 |
On exit, LINDXAU is an array of dimension 2*N. On exit, this |
169 |
array contains the local destination information encoded as |
170 |
follows. If LINDXAU(k) >= 0, row LINDXA(k) of A is to be
|
171 |
copied in U at position LINDXAU(k). Otherwise, row LINDXA(k) |
172 |
of A should be locally copied into A(-LINDXAU(k),:). |
173 |
</PRE>
|
174 |
<PRE>
|
175 |
LLEN (global output) int * |
176 |
On entry, LLEN is an array of length NPROW. On exit, it |
177 |
contains how many rows every process has. |
178 |
</PRE>
|
179 |
|
180 |
<H1>See Also</H1> |
181 |
<A HREF="HPL_pdlaswp00N.html">HPL_pdlaswp00N</A>, |
182 |
<A HREF="HPL_pdlaswp00T.html">HPL_pdlaswp00T</A>, |
183 |
<A HREF="HPL_pdlaswp01N.html">HPL_pdlaswp01N</A>, |
184 |
<A HREF="HPL_pdlaswp01T.html">HPL_pdlaswp01T</A>. |
185 |
|
186 |
</BODY>
|
187 |
</HTML>
|