Statistiques
| Révision :

root / www / HPL_plindx0.html

Historique | Voir | Annoter | Télécharger (7,34 ko)

1
<HTML>
2
<HEAD>
3
<TITLE>HPL_plindx0 HPL 2.0 Library Functions September 10, 2008</TITLE> 
4
</HEAD>
5

    
6
<BODY BGCOLOR="WHITE" TEXT = "#000000" LINK = "#0000ff" VLINK = "#000099"
7
      ALINK = "#ffff00">
8

    
9
<H1>Name</H1>
10
<B>HPL_plindx0</B> Compute local swapping index arrays.
11

    
12
<H1>Synopsis</H1>
13
<CODE>#include "hpl.h"</CODE><BR><BR>
14
<CODE>void</CODE>
15
<CODE>HPL_plindx0(</CODE>
16
<CODE>HPL_T_panel *</CODE>
17
<CODE>PANEL</CODE>,
18
<CODE>const int</CODE>
19
<CODE>K</CODE>,
20
<CODE>int *</CODE>
21
<CODE>IPID</CODE>,
22
<CODE>int *</CODE>
23
<CODE>LINDXA</CODE>,
24
<CODE>int *</CODE>
25
<CODE>LINDXAU</CODE>,
26
<CODE>int *</CODE>
27
<CODE>LLEN</CODE>
28
<CODE>);</CODE>
29

    
30
<H1>Description</H1>
31
<B>HPL_plindx0</B>
32
computes two local arrays  LINDXA and  LINDXAU  containing
33
the  local  source and final destination position  resulting from the
34
application of row interchanges.
35
 
36
On entry, the array  IPID  of length K is such that the row of global
37
index  IPID(i)  should be mapped onto row of global index  IPID(i+1).
38
Let  IA  be the global index of the first row to be swapped. For k in
39
[0..K/2), the row of global index IPID(2*k) should be mapped onto the
40
row of global index  IPID(2*k+1).  The question then, is to determine
41
which rows should ultimately be part of U.
42
 
43
First, some rows of the process ICURROW  may be swapped locally.  One
44
of this row belongs to U, the other one belongs to my local  piece of
45
A.  The other  rows of the current block are swapped with remote rows
46
and are thus not part of U. These rows however should be sent  along,
47
and  grabbed by the other processes  as we  progress in the  exchange
48
phase.
49
 
50
So, assume that I am  ICURROW  and consider a row of index  IPID(2*i)
51
that I own. If I own IPID(2*i+1) as well and IPID(2*i+1) - IA is less
52
than N,  this row is locally swapped and should be copied into  U  at
53
the position IPID(2*i+1) - IA. No row will be exchanged for this one.
54
If IPID(2*i+1)-IA is greater than N, then the row IPID(2*i) should be
55
locally copied into my local piece of A at the position corresponding
56
to the row of global index IPID(2*i+1).
57
 
58
If the process  ICURROW does not own  IPID(2*i+1), then row IPID(2*i)
59
is to be swapped away and strictly speaking does not belong to U, but
60
to  A  remotely.  Since this  process will however send this array U,
61
this row is  copied into  U, exactly where the row IPID(2*i+1) should
62
go. For this, we search IPID for k1, such that IPID(2*k1) is equal to
63
IPID(2*i+1); and row  IPID(2*i) is to be copied in U  at the position
64
IPID(2*k1+1)-IA.
65
 
66
It is thus  important to put the rows that go into U, i.e., such that
67
IPID(2*i+1) - IA is less than N at the begining of the array IPID. By
68
doing so,  U  is formed, and the local copy  is performed in just one
69
sweep.
70
 
71
Two lists  LINDXA  and  LINDXAU are built.  LINDXA contains the local
72
index of the rows I have that should be copied. LINDXAU  contains the
73
local destination information: if LINDXAU(k) >= 0, row LINDXA(k) of A
74
is to be copied in U at position LINDXAU(k). Otherwise, row LINDXA(k)
75
of A should be locally copied into A(-LINDXAU(k),:).  In the  process
76
ICURROW, the initial packing algorithm proceeds as follows.
77
 
78
  for all entries in IPID,
79
     if IPID(2*i) is in ICURROW,
80
        if IPID(2*i+1) is in ICURROW,
81
           if( IPID(2*i+1) - IA < N )
82
            save corresponding local position
83
            of this row (LINDXA);
84
            save local position (LINDXAU) in U
85
            where this row goes;
86
            [copy row IPID(2*i) in U at position
87
            IPID(2*i+1)-IA; ];
88
           else
89
            save corresponding local position of
90
            this row (LINDXA);
91
            save local position (-LINDXAU) in A
92
            where this row goes;
93
            [copy row IPID(2*i) in my piece of A
94
            at IPID(2*i+1);]
95
           end if
96
        else
97
           find k1 such that IPID(2*k1) = IPID(2*i+1);
98
           copy row IPID(2*i) in U at position
99
           IPID(2*k1+1)-IA;
100
           save corresponding local position of this
101
           row (LINDXA);
102
           save local position (LINDXAU) in U where
103
           this row goes;
104
        end if
105
     end if
106
  end for
107
 
108
Second, if I am not the current row process  ICURROW, all source rows
109
in IPID that I own are part of U. Indeed,  they  are swapped with one
110
row  of  the  current  block  of rows,  and  the  main  factorization
111
algorithm proceeds one row after each other.  The processes different
112
from ICURROW,  should  exchange and accumulate  those rows until they
113
receive some data previously owned by the process ICURROW.
114
 
115
In processes different from  ICURROW,  the  initial packing algorithm
116
proceeds as follows.  Consider a row of global index IPID(2*i) that I
117
own. When I will be receiving data previously owned by ICURROW, i.e.,
118
U, row IPID(2*i) should  replace the row in U at pos. IPID(2*i+1)-IA,
119
and  this particular row of U should be first copied into my piece of
120
A, at A(il,:),  where  il is the  local row  index  corresponding  to
121
IPID(2*i). Now,initially, this row will be packed into workspace, say
122
as the kth row of  that  work array.  The  following  algorithm  sets
123
LINDXAU[k] to IPID(2*i+1)-IA, that is the position in U where the row
124
should be copied. LINDXA(k) stores the local index in  A  where  this
125
row of U should be copied, i.e il.
126
 
127
  for all entries in IPID,
128
     if IPID(2*i) is not in ICURROW,
129
        copy row IPID(2*i) in work array;
130
        save corresponding local position
131
        of this row (LINDXA);
132
        save position (LINDXAU) in U where
133
        this row should be copied;
134
     end if
135
  end for
136
 
137
Since we are at it, we also globally figure  out  how many rows every
138
process has. That is necessary, because it would rather be cumbersome
139
to  figure it on  the fly  during the  bi-directional exchange phase.
140
This information is kept in the array  LLEN  of size NPROW. Also note
141
that the arrays LINDXA and LINDXAU are of max length equal to 2*N.
142

    
143
<H1>Arguments</H1>
144
<PRE>
145
PANEL   (local input/output)          HPL_T_panel *
146
        On entry,  PANEL  points to the data structure containing the
147
        panel information.
148
</PRE>
149
<PRE>
150
K       (global input)                const int
151
        On entry, K specifies the number of entries in IPID.  K is at
152
        least 2*N, and at most 4*N.
153
</PRE>
154
<PRE>
155
IPID    (global input)                int *
156
        On entry,  IPID  is an array of length K. The first K entries
157
        of that array contain the src and final destination resulting
158
        from the application of the interchanges.
159
</PRE>
160
<PRE>
161
LINDXA  (local output)                int *
162
        On entry, LINDXA  is an array of dimension 2*N. On exit, this
163
        array contains the local indexes of the rows of A I have that
164
        should be copied into U.
165
</PRE>
166
<PRE>
167
LINDXAU (local output)                int *
168
        On exit, LINDXAU  is an array of dimension 2*N. On exit, this
169
        array contains  the local destination  information encoded as
170
        follows.  If LINDXAU(k) >= 0, row  LINDXA(k)  of A  is  to be
171
        copied in U at position LINDXAU(k).  Otherwise, row LINDXA(k)
172
        of A should be locally copied into A(-LINDXAU(k),:).
173
</PRE>
174
<PRE>
175
LLEN    (global output)               int *
176
        On entry,  LLEN  is  an array  of length  NPROW.  On exit, it
177
        contains how many rows every process has.
178
</PRE>
179

    
180
<H1>See Also</H1>
181
<A HREF="HPL_pdlaswp00N.html">HPL_pdlaswp00N</A>,
182
<A HREF="HPL_pdlaswp00T.html">HPL_pdlaswp00T</A>,
183
<A HREF="HPL_pdlaswp01N.html">HPL_pdlaswp01N</A>,
184
<A HREF="HPL_pdlaswp01T.html">HPL_pdlaswp01T</A>.
185

    
186
</BODY>
187
</HTML>