Nikolaos Kyrtatas, Daniele G. Spampinato and Markus Püschel (Proc. Design, Automation and Test in Europe (DATE), pp. 1054-1059, 2015)
A Basic Linear Algebra Compiler for Embedded Processors
Preprint (772 KB)
Published paper (link to publisher)

Many applications in signal processing, control, and graphics on embedded devices require efficient linear algebra computations. On general-purpose computers, program generators have proven useful for producing such code, or important building blocks of it, automatically. An example is LGen, a compiler for basic linear algebra computations of fixed size. In this work, we extend LGen to the embedded domain, using Intel Atom, ARM Cortex-A8, ARM Cortex-A9, and Raspberry Pi (ARM1176) as example targets. To support these processors efficiently, we introduce support for the NEON vector ISA and a methodology for domain-specific load/store optimizations. Our experimental evaluation shows that the new version of LGen produces code that performs better than well-established, commercial and non-commercial libraries (Intel MKL and IPP), software generators (Eigen and ATLAS), and compilers (icc, gcc, and clang).
