Statistiques
| Révision :

root / man / man3 / HPL_plindx0.3 @ 1

Historique | Voir | Annoter | Télécharger (6,78 ko)

1
.TH HPL_plindx0 3 "September 10, 2008" "HPL 2.0" "HPL Library Functions"
2
.SH NAME
3
HPL_plindx0 \- Compute local swapping index arrays.
4
.SH SYNOPSIS
5
\fB\&#include "hpl.h"\fR
6
 
7
\fB\&void\fR
8
\fB\&HPL_plindx0(\fR
9
\fB\&HPL_T_panel *\fR
10
\fI\&PANEL\fR,
11
\fB\&const int\fR
12
\fI\&K\fR,
13
\fB\&int *\fR
14
\fI\&IPID\fR,
15
\fB\&int *\fR
16
\fI\&LINDXA\fR,
17
\fB\&int *\fR
18
\fI\&LINDXAU\fR,
19
\fB\&int *\fR
20
\fI\&LLEN\fR
21
\fB\&);\fR
22
.SH DESCRIPTION
23
\fB\&HPL_plindx0\fR
24
computes two local arrays  LINDXA and  LINDXAU  containing
25
the  local  source and final destination position  resulting from the
26
application of row interchanges.
27
 
28
On entry, the array  IPID  of length K is such that the row of global
29
index  IPID(i)  should be mapped onto row of global index  IPID(i+1).
30
Let  IA  be the global index of the first row to be swapped. For k in
31
[0..K/2), the row of global index IPID(2*k) should be mapped onto the
32
row of global index  IPID(2*k+1).  The question then, is to determine
33
which rows should ultimately be part of U.
34
 
35
First, some rows of the process ICURROW  may be swapped locally.  One
36
of this row belongs to U, the other one belongs to my local  piece of
37
A.  The other  rows of the current block are swapped with remote rows
38
and are thus not part of U. These rows however should be sent  along,
39
and  grabbed by the other processes  as we  progress in the  exchange
40
phase.
41
 
42
So, assume that I am  ICURROW  and consider a row of index  IPID(2*i)
43
that I own. If I own IPID(2*i+1) as well and IPID(2*i+1) - IA is less
44
than N,  this row is locally swapped and should be copied into  U  at
45
the position IPID(2*i+1) - IA. No row will be exchanged for this one.
46
If IPID(2*i+1)-IA is greater than N, then the row IPID(2*i) should be
47
locally copied into my local piece of A at the position corresponding
48
to the row of global index IPID(2*i+1).
49
 
50
If the process  ICURROW does not own  IPID(2*i+1), then row IPID(2*i)
51
is to be swapped away and strictly speaking does not belong to U, but
52
to  A  remotely.  Since this  process will however send this array U,
53
this row is  copied into  U, exactly where the row IPID(2*i+1) should
54
go. For this, we search IPID for k1, such that IPID(2*k1) is equal to
55
IPID(2*i+1); and row  IPID(2*i) is to be copied in U  at the position
56
IPID(2*k1+1)-IA.
57
 
58
It is thus  important to put the rows that go into U, i.e., such that
59
IPID(2*i+1) - IA is less than N at the begining of the array IPID. By
60
doing so,  U  is formed, and the local copy  is performed in just one
61
sweep.
62
 
63
Two lists  LINDXA  and  LINDXAU are built.  LINDXA contains the local
64
index of the rows I have that should be copied. LINDXAU  contains the
65
local destination information: if LINDXAU(k) >= 0, row LINDXA(k) of A
66
is to be copied in U at position LINDXAU(k). Otherwise, row LINDXA(k)
67
of A should be locally copied into A(-LINDXAU(k),:).  In the  process
68
ICURROW, the initial packing algorithm proceeds as follows.
69
 
70
  for all entries in IPID,
71
     if IPID(2*i) is in ICURROW,
72
        if IPID(2*i+1) is in ICURROW,
73
           if( IPID(2*i+1) - IA < N )
74
            save corresponding local position
75
            of this row (LINDXA);
76
            save local position (LINDXAU) in U
77
            where this row goes;
78
            [copy row IPID(2*i) in U at position
79
            IPID(2*i+1)-IA; ];
80
           else
81
            save corresponding local position of
82
            this row (LINDXA);
83
            save local position (-LINDXAU) in A
84
            where this row goes;
85
            [copy row IPID(2*i) in my piece of A
86
            at IPID(2*i+1);]
87
           end if
88
        else
89
           find k1 such that IPID(2*k1) = IPID(2*i+1);
90
           copy row IPID(2*i) in U at position
91
           IPID(2*k1+1)-IA;
92
           save corresponding local position of this
93
           row (LINDXA);
94
           save local position (LINDXAU) in U where
95
           this row goes;
96
        end if
97
     end if
98
  end for
99
 
100
Second, if I am not the current row process  ICURROW, all source rows
101
in IPID that I own are part of U. Indeed,  they  are swapped with one
102
row  of  the  current  block  of rows,  and  the  main  factorization
103
algorithm proceeds one row after each other.  The processes different
104
from ICURROW,  should  exchange and accumulate  those rows until they
105
receive some data previously owned by the process ICURROW.
106
 
107
In processes different from  ICURROW,  the  initial packing algorithm
108
proceeds as follows.  Consider a row of global index IPID(2*i) that I
109
own. When I will be receiving data previously owned by ICURROW, i.e.,
110
U, row IPID(2*i) should  replace the row in U at pos. IPID(2*i+1)-IA,
111
and  this particular row of U should be first copied into my piece of
112
A, at A(il,:),  where  il is the  local row  index  corresponding  to
113
IPID(2*i). Now,initially, this row will be packed into workspace, say
114
as the kth row of  that  work array.  The  following  algorithm  sets
115
LINDXAU[k] to IPID(2*i+1)-IA, that is the position in U where the row
116
should be copied. LINDXA(k) stores the local index in  A  where  this
117
row of U should be copied, i.e il.
118
 
119
  for all entries in IPID,
120
     if IPID(2*i) is not in ICURROW,
121
        copy row IPID(2*i) in work array;
122
        save corresponding local position
123
        of this row (LINDXA);
124
        save position (LINDXAU) in U where
125
        this row should be copied;
126
     end if
127
  end for
128
 
129
Since we are at it, we also globally figure  out  how many rows every
130
process has. That is necessary, because it would rather be cumbersome
131
to  figure it on  the fly  during the  bi-directional exchange phase.
132
This information is kept in the array  LLEN  of size NPROW. Also note
133
that the arrays LINDXA and LINDXAU are of max length equal to 2*N.
134
.SH ARGUMENTS
135
.TP 8
136
PANEL   (local input/output)    HPL_T_panel *
137
On entry,  PANEL  points to the data structure containing the
138
panel information.
139
.TP 8
140
K       (global input)          const int
141
On entry, K specifies the number of entries in IPID.  K is at
142
least 2*N, and at most 4*N.
143
.TP 8
144
IPID    (global input)          int *
145
On entry,  IPID  is an array of length K. The first K entries
146
of that array contain the src and final destination resulting
147
from the application of the interchanges.
148
.TP 8
149
LINDXA  (local output)          int *
150
On entry, LINDXA  is an array of dimension 2*N. On exit, this
151
array contains the local indexes of the rows of A I have that
152
should be copied into U.
153
.TP 8
154
LINDXAU (local output)          int *
155
On exit, LINDXAU  is an array of dimension 2*N. On exit, this
156
array contains  the local destination  information encoded as
157
follows.  If LINDXAU(k) >= 0, row  LINDXA(k)  of A  is  to be
158
copied in U at position LINDXAU(k).  Otherwise, row LINDXA(k)
159
of A should be locally copied into A(-LINDXAU(k),:).
160
.TP 8
161
LLEN    (global output)         int *
162
On entry,  LLEN  is  an array  of length  NPROW.  On exit, it
163
contains how many rows every process has.
164
.SH SEE ALSO
165
.BR HPL_pdlaswp00N \ (3),
166
.BR HPL_pdlaswp00T \ (3),
167
.BR HPL_pdlaswp01N \ (3),
168
.BR HPL_pdlaswp01T \ (3).