Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
1-2025
Abstract
Recently, the rise of code-centric large language models (LLMs) has reshaped the software engineering world with low-barrier tools like Copilot that can generate code easily. However, there is no correctness guarantee for the code generated by LLMs, which suffer from the hallucination problem, and their output is fraught with risks. Moreover, the end-to-end process from specification to code through LLMs is a non-transparent and uncontrolled black box. This opacity makes it difficult for users to understand and trust the generated code. Addressing these challenges is both necessary and critical. In contrast, program refinement transforms high-level specification statements into executable code while preserving correctness. Traditional tools for program refinement are primarily designed for formal methods experts and lack automation and extensibility. We apply program refinement to guide LLMs and validate the LLM-generated code while transforming refinement into a more accessible and flexible framework. To initiate this vision, we propose Refine4LLM, an approach that aims to: (1) Formally refine the specifications, (2) Automatically prompt and guide the LLM using refinement calculus, (3) Interact with the LLM to generate the code and proof obligations, (4) Verify that the generated code satisfies the constraints, thus guaranteeing its correctness, (5) Learn and build more refinement laws to extend the refinement calculus. We evaluated Refine4LLM against state-of-the-art baselines on program refinement and LLM benchmarks. The experiment results show that Refine4LLM can efficiently generate more robust code and reduce the time for refinement and verification.
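For illustration only, a standard law from classical refinement calculus (Morgan's assignment-introduction law, not taken from this paper) shows the kind of specification-to-code step the approach builds on: a specification statement may be refined to an assignment when the precondition establishes the postcondition under the substituted expression. In LaTeX notation (the variable $w$, expression $E$, and predicates $pre$, $post$ are generic placeholders):

\[
  w : [\, pre ,\; post \,] \;\sqsubseteq\; (w := E)
  \quad \text{provided} \quad pre \;\Rightarrow\; post[w := E]
\]

The paper's own refinement laws, LLM prompting strategy, and learned law extensions are described in the full text.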
Keywords
Program Refinement, Large Language Model, Program Synthesis
Discipline
Software Engineering
Research Areas
Software and Cyber-Physical Systems
Areas of Excellence
Digital transformation
Publication
Proceedings of the 52nd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2025), Denver, Colorado, January 19-25
Volume
9
Issue
POPL
First Page
2057
Last Page
2089
Identifier
10.1145/3704905
Publisher
ACM
City or Country
New York
Citation
CAI, Yufan; HOU, Zhe; SANAN, David; LUAN, Xiaokun; LIN, Yun; SUN, Jun; and DONG, Jin Song.
Automated program refinement: Guide and verify code large language model with refinement calculus. (2025). Proceedings of the 52nd ACM SIGPLAN Symposium on Principles of Programming Languages (POPL 2025), Denver, Colorado, January 19-25. 9, (POPL), 2057-2089.
Available at: https://ink.library.smu.edu.sg/sis_research/10665
Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.
Additional URL
https://doi.org/10.1145/3704905