FPGA Acceleration of Structured-Mesh-Based Explicit and Implicit Numerical Solvers using SYCL


We explore the design and development of structured-mesh-based solvers on Intel FPGA hardware using the SYCL programming model. Two classes of applications are targeted: (1) stencil applications based on explicit numerical methods and (2) multi-dimensional tridiagonal solvers based on implicit methods. Both classes of solvers appear as core modules in a variety of real-world applications ranging from computational fluid dynamics to financial computing. We formulate a general, unified workflow for synthesizing these applications on Intel FPGAs, together with predictive analytic models for exploring the design space to obtain optimized performance. Using these techniques, we synthesize designs for two non-trivial applications and benchmark their performance on an Intel PAC D5005 FPGA card. Results are compared to the performance of optimized parallel implementations of the same applications on an Nvidia V100 GPU. Observed runtimes indicate that the FPGA provides performance comparable to or better than the V100 GPU. More importantly, the FPGA solutions consume 59%–76% less energy for their largest configurations. Our performance model predicts the runtime of designs with less than 5% error for all cases tested, demonstrating significant utility for design space exploration. With these tools and techniques, we discuss the determinants of whether a given structured-mesh code is amenable to FPGA implementation, providing insights into the feasibility and profitability of an FPGA implementation, how to code designs using SYCL, and the resulting performance.
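For readers unfamiliar with the two solver classes named in the abstract, the following is a minimal plain-Python sketch of each, reduced to one dimension. It is an illustration only and not the paper's SYCL implementation: the function names `stencil_step` and `thomas_solve`, the 3-point diffusion stencil, and the coefficient `alpha` are all assumptions chosen for the example. The implicit case uses the standard Thomas algorithm for a tridiagonal system.

```python
def stencil_step(u, alpha=0.25):
    """One explicit update of a 3-point stencil on a 1D mesh.

    Each interior point is recomputed from its own value and its two
    neighbours at the previous time step; boundary values are held fixed.
    (Hypothetical example stencil, not taken from the paper.)
    """
    new = list(u)
    for i in range(1, len(u) - 1):
        new[i] = u[i] + alpha * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new


def thomas_solve(a, b, c, d):
    """Solve a tridiagonal system with the Thomas algorithm.

    a: sub-diagonal (a[0] unused), b: main diagonal,
    c: super-diagonal (c[-1] unused), d: right-hand side.
    Assumes the system needs no pivoting (e.g. diagonally dominant).
    """
    n = len(d)
    cp = [0.0] * n
    dp = [0.0] * n
    # Forward elimination
    cp[0] = c[0] / b[0]
    dp[0] = d[0] / b[0]
    for i in range(1, n):
        denom = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / denom
        dp[i] = (d[i] - a[i] * dp[i - 1]) / denom
    # Back substitution
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x
```

The explicit update has no dependency between points within a time step, whereas the tridiagonal solve carries a dependency along the line in both sweeps. In a multi-dimensional setting, a solve of this kind is typically applied independently to each line of the mesh along one dimension at a time.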

In International Workshop on OpenCL (IWOCL)
Kamalavasan Kamalakkannan
Warwick PhD Alumnus

My research interests include reconfigurable and high performance computing.

Suhaib A. Fahmy
Associate Professor of Computer Science

Suhaib is Principal Investigator of the Accelerated Connected Computing Lab (ACCL) at KAUST. His research explores hardware acceleration of complex algorithms and the integration of these accelerators within wider computing infrastructure.