<HTML>
<HEAD>
<TITLE>HPL References</TITLE>
</HEAD>

<BODY
BGCOLOR     = "WHITE"
BACKGROUND  = "WHITE"
TEXT        = "#000000"
VLINK       = "#000099"
ALINK       = "#947153"
LINK        = "#0000ff">

<H2>HPL References</H2>

<STRONG>
The list of references below contains some published material relevant
to this work. This list is provided for illustrative purposes and
should be regarded as an initial starting point for the interested
reader. It is by no means meant to be exhaustive.
</STRONG><BR><BR>

The references have been sorted into five categories and are listed
chronologically within each category. The five categories are:
<UL>
<LI><A HREF="references.html#Linpack_Benchmark">Linpack Benchmark</A>
<LI><A HREF="references.html#parallel_LUfact">Parallel LU Factorization</A>
<LI><A HREF="references.html#recursiv_LUfact">Recursive LU Factorization</A>
<LI><A HREF="references.html#parallel_matmul">Parallel Matrix Multiply</A>
<LI><A HREF="references.html#parallel_trsolv">Parallel Triangular Solve</A>
</UL>
<HR NOSHADE>

<H3><A NAME="Linpack_Benchmark">Linpack Benchmark</A></H3>

<UL>

<! - 1979 ----------------------------------------------------------- !>
<LI><I>LINPACK Users' Guide</I>, J. Dongarra, J. Bunch, C. Moler and
G. W. Stewart, SIAM, Philadelphia, PA, 1979.

<! - 1989 ----------------------------------------------------------- !>
<LI><I>Performance of Various Computers Using Standard Linear Equations
Software</I>, J. Dongarra, Technical Report CS-89-85, University of
Tennessee, 1989. (An updated version of this report can be found at
<A HREF="http://www.netlib.org/benchmark/performance.ps">
http://www.netlib.org/benchmark/performance.ps</A>).

<! - 1991 ----------------------------------------------------------- !>
<LI><I>Towards Peak Parallel LINPACK Performance on 400</I>,
R. Bisseling and L. Loyens, Supercomputer, Vol. 45, pp. 20-27, 1991.

<LI><I>Massively Parallel LINPACK Benchmark on the Intel Touchstone
DELTA and iPSC/860 Systems</I>, R. van de Geijn, 1991 Annual Users
Conference Proceedings. Intel Supercomputer Users Group, Dallas, TX,
1991.

<LI><I>The LINPACK Benchmark on the AP 1000</I>, R. Brent, Frontiers,
1992, pp. 128-135, McLean, VA, 1992.

<! - 1993 ----------------------------------------------------------- !>
<LI><I>Implementation of BLAS Level 3 and LINPACK Benchmark on the
AP1000</I>, R. Brent and P. Strazdins, Fujitsu Scientific and Technical
Journal, Vol. 5, No. 1, pp. 61-70, 1993.

<! - 1994 ----------------------------------------------------------- !>
<LI><I>LU Factorization and the LINPACK Benchmark on the Intel
Paragon</I>, D. Womble, D. Greenberg, D. Wheat and S. Riesen, Sandia
Technical Report, 1994.

<! - 1995 ----------------------------------------------------------- !>
<LI><I>Massively Parallel Distributed Computing: World's First 281
Gigaflop Supercomputer</I>, J. Bolen, A. Davis, B. Dazey, S. Gupta,
G. Henry, D. Robboy, G. Schiffler, D. Scott, M. Stallcup, A. Taraghi,
S. Wheat from Intel SSD, L. Fisk, G. Istrail, C. Jong, R. Riesen,
L. Shuler, from Sandia National Laboratories, Proceedings of the Intel
Supercomputer Users Group 1995.

<! - 1997 ----------------------------------------------------------- !>
<LI><I>High Performance Software on Intel Pentium Pro Processors or
Micro-Ops to TeraFLOPS</I>, B. Greer and G. Henry, Proceedings of the
SuperComputing 1997 Conference, ACM SIGARCH - IEEE Computer Society
Press - ISBN: 0-89791-985-8, San Jose, CA, 1997.

</UL>
<! ------------------------------------------------------------------ !>
<HR NOSHADE>

<H3><A NAME="parallel_LUfact">Parallel LU Factorization</A></H3>

<UL>

<! - 1986 ----------------------------------------------------------- !>
<LI><I>Communication Complexity of the Gaussian Elimination Algorithm
on Multiprocessors</I>, Y. Saad, Linear Algebra and Its Applications,
Vol. 77, pp. 315-340, 1986.

<! - 1988 ----------------------------------------------------------- !>
<LI><I>LU Factorization Algorithms on Distributed-Memory Multiprocessor
Architectures</I>, G. Geist and C. Romine, SIAM Journal on Scientific
and Statistical Computing, Vol. 9, pp. 639-649, 1988.

<! - 1989 ----------------------------------------------------------- !>
<LI><I>Parallel LU Decomposition on a Transputer Network</I>,
R. Bisseling and J. van der Vorst, Lecture Notes in Computer Science,
Springer-Verlag, Eds. G. van Zee and J. van der Vorst, Vol. 384,
pp. 61-77, 1989.

<! - 1990 ----------------------------------------------------------- !>
<LI><I>The Distributed Solution of Linear Systems Using the Torus-Wrap
Data Mapping</I>, C. Ashcraft, ECA-TR-147, Boeing Computer Services,
Seattle, WA, 1990.

<LI><I>Experiments with Multicomputer LU-Decomposition</I>, E. van de
Velde, Concurrency: Practice and Experience, Vol. 2, pp. 1-26, 1990.

<! - 1991 ----------------------------------------------------------- !>
<LI><I>A Taxonomy of Distributed Dense LU Factorization Methods</I>,
C. Ashcraft, ECA-TR-161, Boeing Computer Services, Seattle, WA, 1991.

<! - 1994 ----------------------------------------------------------- !>
<LI><I>The Torus-Wrap Mapping for Dense Matrix Calculations on Massively
Parallel Computers</I>, B. Hendrickson and D. Womble, SIAM Journal on
Scientific and Statistical Computing, Vol. 15, pp. 1201-1226, 1994.

<LI><I>Scalability Issues in the Design of a Library for Dense Linear
Algebra</I>, J. Dongarra, R. van de Geijn and D. Walker, Journal of
Parallel and Distributed Computing, Vol. 22, No. 3, pp. 523-537, 1994.

<! - 1995 ----------------------------------------------------------- !>
<LI><I>Matrix Factorization using Distributed Panels on the Fujitsu
AP1000</I>, P. Strazdins, Proceedings of the IEEE First International
Conference on Algorithms And Architectures for Parallel Processing
ICA3PP-95, Brisbane, 1995.

<! - 1996 ----------------------------------------------------------- !>
<LI><I>The Design and Implementation of the ScaLAPACK LU, QR, and
Cholesky Factorization Routines</I>, J. Choi, J. Dongarra, S. Ostrouchov,
A. Petitet, D. Walker and R. C. Whaley, Scientific Programming, Vol. 5,
pp. 173-184, 1996.

</UL>
<! ------------------------------------------------------------------ !>
<HR NOSHADE>

<H3><A NAME="recursiv_LUfact">Recursive LU Factorization</A></H3>

<UL>

<! - 1997 ----------------------------------------------------------- !>
<LI><I>Locality of Reference in LU Decomposition with Partial
Pivoting</I>, S. Toledo, SIAM Journal on Matrix Analysis and
Applications, Vol. 18, No. 4, 1997.

<LI><I>Recursion Leads to Automatic Variable Blocking for Dense
Linear-Algebra Algorithms</I>, F. Gustavson, IBM Journal of Research
and Development, Vol. 41, No. 6, pp. 737-755, 1997.

</UL>
<! ------------------------------------------------------------------ !>
<HR NOSHADE>

<H3><A NAME="parallel_matmul">Parallel Matrix Multiply</A></H3>

<UL>

<! - 1987 ----------------------------------------------------------- !>
<LI><I>Matrix Algorithms on a Hypercube I: Matrix Multiplication</I>,
G. Fox, S. Otto and A. Hey, Parallel Computing, Vol. 3, pp. 17-31, 1987.

<! - 1990 ----------------------------------------------------------- !>
<LI><I>Basic Matrix Subprograms for Distributed-Memory Systems</I>,
A. Elster, Proceedings of the Fifth Distributed-Memory Computing
Conference, Eds. D. Walker and Q. Stout, IEEE Press, pp. 311-316, 1990.

<! - 1991 ----------------------------------------------------------- !>
<LI><I>The Parallelization of Level 2 and 3 BLAS Operations on
Distributed-Memory Machines</I>, M. Aboelaze, N. Chrisochoides
and E. Houstis, CSD-TR-91-007, Purdue University, West Lafayette,
IN, 1991.

<! - 1992 ----------------------------------------------------------- !>
<LI><I>The Multicomputer Toolbox Approach to Concurrent BLAS and LACS</I>,
R. Falgout, A. Skjellum, S. Smith and C. Still, Proceedings of the
Scalable High Performance Computing Conference SHPCC-92, IEEE Computer
Society Press, 1992.

<! - 1994 ----------------------------------------------------------- !>
<LI><I>A High Performance Matrix Multiplication Algorithm on a
Distributed-Memory Parallel Computer, Using Overlapped Communication</I>,
R. Agarwal, F. Gustavson and M. Zubair, IBM Journal of Research and
Development, Vol. 38, No. 6, pp. 673-681, 1994.

<LI><I>PUMMA: Parallel Universal Matrix Multiplication Algorithms on
Distributed-Memory Concurrent Computers</I>, J. Choi, J. Dongarra and
D. Walker, Concurrency: Practice and Experience, Vol. 6, No. 7,
pp. 543-570, 1994.

<LI><I>Matrix Multiplication on the Intel Touchstone DELTA</I>,
S. Huss-Lederman, E. Jacobson, A. Tsao and G. Zhang, Concurrency:
Practice and Experience, Vol. 6, No. 7, pp. 571-594, 1994.

<! - 1995 ----------------------------------------------------------- !>
<LI><I>A Three-Dimensional Approach to Parallel Matrix Multiplication</I>,
R. Agarwal, S. Balle, F. Gustavson, M. Joshi and P. Palkar, IBM Journal
of Research and Development, Vol. 39, No. 5, pp. 575-582, 1995.

<! - 1996 ----------------------------------------------------------- !>
<LI><I>A High Performance Parallel Strassen Implementation</I>,
B. Grayson and R. van de Geijn, Parallel Processing Letters, Vol. 6,
No. 1, pp. 3-12, 1996.

<! - 1997 ----------------------------------------------------------- !>
<LI><I>Parallel Implementation of BLAS: General Techniques for Level
3 BLAS</I>, A. Chtchelkanova, J. Gunnels, G. Morrow, J. Overfelt and
R. van de Geijn, Concurrency: Practice and Experience, Vol. 9, No. 9,
pp. 837-857, 1997.

<LI><I>A Poly-Algorithm for Parallel Dense Matrix Multiplication on
Two-Dimensional Process Grid Topologies</I>, J. Li, R. Falgout and
A. Skjellum, Concurrency: Practice and Experience, Vol. 9, No. 5,
pp. 345-389, 1997.

<LI><I>SUMMA: Scalable Universal Matrix Multiplication Algorithm</I>,
R. van de Geijn and J. Watts, Concurrency: Practice and Experience,
Vol. 9, No. 4, pp. 255-274, 1997.

</UL>
<! ------------------------------------------------------------------ !>
<HR NOSHADE>

<H3><A NAME="parallel_trsolv">Parallel Triangular Solve</A></H3>

<UL>

<! - 1988 ----------------------------------------------------------- !>
<LI><I>Parallel Solution of Triangular Systems on Distributed-Memory
Multiprocessors</I>, M. Heath and C. Romine, SIAM Journal on Scientific
and Statistical Computing, Vol. 9, pp. 558-588, 1988.

<LI><I>A Parallel Triangular Solver for a Distributed-Memory
Multiprocessor</I>, G. Li and T. Coleman, SIAM Journal on Scientific
and Statistical Computing, Vol. 9, No. 3, pp. 485-502, 1988.

<! - 1989 ----------------------------------------------------------- !>
<LI><I>A New Method for Solving Triangular Systems on Distributed-Memory
Message-Passing Multiprocessors</I>, G. Li and T. Coleman, SIAM Journal
on Scientific and Statistical Computing, Vol. 10, No. 2, pp. 382-396,
1989.

<! - 1991 ----------------------------------------------------------- !>
<LI><I>Parallel Triangular System Solving on a Mesh Network of
Transputers</I>, R. Bisseling and J. van der Vorst, SIAM Journal
on Scientific and Statistical Computing, Vol. 12, pp. 787-799, 1991.

</UL>
<! ------------------------------------------------------------------ !>

<HR NOSHADE>
<CENTER>
<A HREF = "index.html">            [Home]</A>
<A HREF = "copyright.html">        [Copyright and Licensing Terms]</A>
<A HREF = "algorithm.html">        [Algorithm]</A>
<A HREF = "scalability.html">      [Scalability]</A>
<A HREF = "results.html">          [Performance Results]</A>
<A HREF = "documentation.html">    [Documentation]</A>
<A HREF = "software.html">         [Software]</A>
<A HREF = "faqs.html">             [FAQs]</A>
<A HREF = "tuning.html">           [Tuning]</A>
<A HREF = "errata.html">           [Errata-Bugs]</A>
<A HREF = "references.html">       [References]</A>
<A HREF = "links.html">            [Related Links]</A><BR>
</CENTER>
<HR NOSHADE>
</BODY>
</HTML>