Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
1-2025
Abstract
Recently, the rise of code-centric large language models (LLMs) has reshaped the software engineering world with low-barrier tools like Copilot that can generate code easily. However, there is no correctness guarantee for the code generated by LLMs, which suffer from the hallucination problem, and their output is fraught with risks. Moreover, the end-to-end process from specification to code through LLMs is a non-transparent and uncontrolled black box. This opacity makes it difficult for users to understand and trust the generated code. Addressing these challenges is both necessary and critical. In contrast, program refinement transforms high-level specification statements into executable code while preserving correctness. Traditional tools for program refinement are primarily designed for formal methods experts and lack automation and extensibility. We apply program refinement to guide LLMs and validate the LLM-generated code while transforming refinement into a more accessible and flexible framework. To initiate this vision, we propose Refine4LLM, an approach that aims to: (1) Formally refine the specifications, (2) Automatically prompt and guide the LLM using refinement calculus, (3) Interact with the LLM to generate the code and proof obligations, (4) Verify that the generated code satisfies the constraints, thus guaranteeing its correctness, (5) Learn and build more refinement laws to extend the refinement calculus. We evaluated Refine4LLM against state-of-the-art baselines on program refinement and LLM benchmarks. The experiment results show that Refine4LLM can efficiently generate more robust code and reduce the time for refinement and verification.
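For illustration only, a standard law from classical refinement calculus (Morgan's assignment-introduction law, not taken from this paper) shows the kind of specification-to-code step the approach builds on: a specification statement may be refined to an assignment when the precondition establishes the postcondition under the substituted expression. In LaTeX notation (the variable $w$, expression $E$, and predicates $pre$, $post$ are generic placeholders):

\[
  w : [\, pre ,\; post \,] \;\sqsubseteq\; (w := E)
  \quad \text{provided} \quad pre \;\Rightarrow\; post[w := E]
\]

The paper's own refinement laws, LLM prompting strategy, and learned law extensions are described in the full text.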
Keywords
Program Refinement, Large Language Model, Program Synthesis
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
Proceedings of the 52nd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2025), Denver, Colorado, January 19-25
Volume
9
Issue
POPL
First Page
2057
Last Page
2089
Identifier
10.1145/3704905
Publisher
ACM
City or Country
New York
Citation
CAI, Yufan; HOU, Zhe; SANAN, David; LUAN, Xiaokun; LIN, Yun; SUN, Jun; and DONG, Jin Song.
Automated program refinement: Guide and verify code large language model with refinement calculus. (2025). Proceedings of the 52nd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2025), Denver, Colorado, January 19-25. 9, (POPL), 2057-2089.
Available at: https://ink.library.smu.edu.sg/sis_research/10665
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3704905