Copyrights to these papers may be held by the publishers. The download files are preprints. It is understood that all persons copying this information will adhere to the terms and constraints invoked by each author's copyright. These works may not be reposted without the explicit permission of the copyright holder.
Tom Henretty, Kevin Stock, Louis-NoŽl Pouchet, Franz Franchetti, J. Ramanujam and P. Sadayappan (Proc. International Conference on Compiler Construction (CC), 2011)
Data Layout Transformation for Stencil Computations on Short SIMD Architectures
Preprint (430 KB)
Published paper (link to publisher)
Stencil computations are at the core of applications in many domains such as computational electromagnetics, image processing, and partial differential equation solvers used in a variety of scientific and engineering applications. Short-vector SIMD instruction sets such as SSE and VMX provide a promising and widely available avenue for enhancing performance on modern processors. However a fundamental memory stream alignment issue limits achieved performance with stencil computations on modern short SIMD architectures. In this paper, we propose a novel data layout transformation that avoids the stream alignment conflict, along with a static analysis technique for determining where this transformation is applicable. Significant performance increases are demonstrated for a variety of stencil codes on several modern processors with SIMD capabilities.Keywords: SIMD vectorization, Architecture, Stencil computations