Gupta, Rajiv (1991) Compiler optimization of data storage. Dissertation (Ph.D.), California Institute of Technology. http://resolver.caltech.edu/CaltechETD:etd-06272007-081805
The system efficiency and throughput of most architectures are critically dependent on the ability of the memory subsystem to satisfy data operand accesses. This ability is in turn dependent on the distribution or layout of the data relative to the access of the data by the executing code. Page faults, cache misses, truncated vectors, global communication, for example, are expensive but common symptoms of data and access misalignment. Compiler optimization, traditionally synonymous with code optimization, has addressed the issue of efficient data access by manipulating the code to better access the data under a fixed, default distribution. This approach is restrictive, and often suboptimal. Data optimization, or data-layout optimization, is presented as an integral part of compiler optimization. For scalar data, a good compile-time approximation of the "reference string," or sequence of data accesses, is advanced for the purpose of distributing the data. However, the optimal distribution of the scalar data for such, or any, reference string is proved NP-complete. A methodology and a polynomial algorithm for an approximate solution are developed. Experiments with representative, but scaled, scientific programs and execution environments display a reduction in cache misses up to two orders in magnitude. For array data, compile-time predictions of the patterns in which the data is accessed by programs in scalar and array languages are examined. For arbitrary computations in an array language, the determination of the optimal layout of the data is proved to be NP-complete. Polynomial techniques for the approximate solutions to the optimal layout of arrays in both languages, scalar and array, are outlined. The general applicability of the techniques, in terms of environments other than hierarchical memories, and in terms of interdependence with code manipulations, is discussed. New code optimizations inspired by the data distribution techniques are motivated. The prudence of compiler- over user-optimized data distribution is argued.
|Item Type:||Thesis (Dissertation (Ph.D.))|
|Degree Grantor:||California Institute of Technology|
|Division:||Engineering and Applied Science|
|Major Option:||Computer Science|
|Thesis Availability:||Public (worldwide access)|
|Defense Date:||13 July 1990|
|Default Usage Policy:||No commercial reproduction, distribution, display or performance rights in this work are provided.|
|Deposited By:||Imported from ETD-db|
|Deposited On:||18 Jul 2007|
|Last Modified:||26 Dec 2012 02:54|
- Final Version
See Usage Policy.
Repository Staff Only: item control page