Compiling Parallel Loops for High Performance Computers: Partitioning, Data Assignment and Remapping by David E. Hudak


Summary

4.2 Code Segments
4.3 Determining Communication Parameters
4.4 Multicast Communication Overhead
4.5 Partitioning
4.6 Experimental Results
4.7 Conclusion
5 Collective Partitioning and Remapping for Multiple Loop Nests
5.1 Introduction
5.2 Program Enclosure Trees
5.3 The CPR Algorithm
5.4 Experimental Results
5.5 Conclusion
Bibliography
Index

List of Figures
1.1 The Butterfly Architecture
1.2 Example of an iterative data-parallel loop
1.3 Contiguous tiling and assignment of an iteration space
2.1 Communication along a line segment
2.2 Access pattern for the access offset (3,2)
2.3 Decomposing an access vector along an orthogonal basis set of vectors
2.4 An analysis of communication patterns
2.5 Decomposing a vector along two separate basis sets of vectors
2.6 Cache lines aligning with borders
2.7 Cache lines not aligned with borders
2.8 nh is the difference of nd and nb
2.9 nh is the sum of nd and nb
2.10 The ADAPT system
2.11 Code segment used in experiments
2.12 Execution rates for various partitions
2.13 Execution time of partitions on Multimax
2.14 Performance increase as processing power increases
2.15 Percentage miss ratios for various aspect ratios and line sizes

Table of Contents

List of Figures. List of Tables. Preface. 1. Introduction. 2. Contiguous Loop Partitions for Neighborhood Communication. 3. Contiguous Data Assignments for Neighborhood Communication. 4. Cyclic Loop Partitions for Linearly Varying Loops. 5. Collective Partitioning and Remapping for Multiple Loop Nests. Bibliography. Index.

Additional information

SKU: NPB9780792392835
ISBN-13: 9780792392835
ISBN-10: 0792392833
Title: Compiling Parallel Loops for High Performance Computers: Partitioning, Data Assignment and Remapping by David E. Hudak
Condition: New
Binding: Hardback
Publisher: Springer
Publication date: 1992-10-31
Pages: 159