Q. Oschatz, N. Zhang, M. Franusich and Franz Franchetti (Proc. High Performance Extreme Computing (HPEC), 2025)
Towards Automated Reasoning Chains for Verification of LLM-Generated Scientific Code
Preprint (312 KB)
Bibtex

With the rise of Large Language Model (LLM) generated code, including in domains like scientific computing, ensuring not only syntactical, but also mathematical correctness, has become a critical task. Traditional formal methods approaches often struggle with the ambiguity of floating-point code, and full symbolic execution is extremely costly and limited. We propose a chain-of-reasoning approach that iteratively lifts basic semantics from code into the SPIRAL system and then establishes numerical equivalency to the desired mathematical operation. Here, we leverage the ample mathematical knowledge already formalized in SPIRAL to enable the system to recognize not just different implementations of the same algorithm but fully separate approaches to solving the given problem. The chain establishes tight error bounds on the output of given code with respect to the true continuous solution it approximates, quantifying all sources of error. We demonstrate this approach by establishing the correctness of a pseudospectral solver for a simple 1-dimensional Poisson problem.

Keywords:
SPIRAL, Scientific computing, Numerical analysis, Partial differential equations, Large language models