Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

7-2025

Abstract

Migrating existing C programs to Rust is increasingly desirable, as Rust offers stronger memory safety while retaining C’s high performance. Existing automated translation tools, such as C2Rust, may rely too heavily on syntactic, template-based translation and generate unsafe Rust code that is hard for human developers to read, maintain, or even compile. A more semantics-aware translation that produces safer, idiomatic, and runnable Rust code is needed. This paper introduces RustMap, a novel dependency-guided, large language model (LLM)-based C-to-Rust translation approach built on three key ideas: (1) use the LLM’s ability to produce idiomatic Rust code from small pieces of C code; (2) mitigate the LLM’s difficulty with large codebases by breaking project-scale C programs into smaller translation units according to their usage dependencies; and (3) improve the correctness of the translated Rust program by using available test cases to check input/output equivalence between the C and Rust code, and by iteratively feeding compilation and testing errors back to the LLM to refine the translated Rust code. We empirically evaluated RustMap on 126 real-world sample programs, comprising 125 programs from Rosetta Code and a complex bzip2 program containing more than 7000 lines of code, using GPT-4o as the LLM. RustMap shows promising results in guiding GPT-4o to translate most of the C code into more idiomatic, readable, and functional Rust code with significantly less unsafe code than other translation tools, and it yields non-trivial translation patterns that may be reusable in future research.
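As an illustration of idea (2), the following is a minimal sketch of how translation units might be ordered so that each unit is translated only after the units it uses. It is not RustMap's actual implementation: the function `translation_order`, the layered ordering strategy, and the unit names (`main`, `compress`, `hash`, `read_input`) are hypothetical and chosen only for illustration.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Orders translation units so that every unit appears after the units it
/// uses. Returns `None` if the usage dependencies contain a cycle.
/// (Hypothetical helper; an assumption, not the paper's algorithm.)
fn translation_order<'a>(deps: &BTreeMap<&'a str, Vec<&'a str>>) -> Option<Vec<String>> {
    // Collect every unit, including ones that appear only as a dependency.
    let mut units: BTreeSet<&'a str> = BTreeSet::new();
    for (unit, used) in deps {
        units.insert(*unit);
        units.extend(used.iter().copied());
    }

    let mut done: BTreeSet<&'a str> = BTreeSet::new();
    let mut order: Vec<String> = Vec::new();

    // Repeatedly emit the units whose dependencies have all been emitted.
    while done.len() < units.len() {
        let ready: Vec<&'a str> = units
            .iter()
            .copied()
            .filter(|u| !done.contains(u))
            .filter(|u| {
                deps.get(u)
                    .map_or(true, |used| used.iter().all(|d| done.contains(d)))
            })
            .collect();

        if ready.is_empty() {
            return None; // Remaining units depend on each other cyclically.
        }
        for u in ready {
            done.insert(u);
            order.push(u.to_string());
        }
    }
    Some(order)
}

fn main() {
    // Hypothetical usage dependencies: each key uses the units in its value.
    let mut deps: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    deps.insert("main", vec!["compress", "read_input"]);
    deps.insert("compress", vec!["hash"]);

    match translation_order(&deps) {
        Some(order) => println!("translate in this order: {:?}", order),
        None => println!("cyclic dependencies; units must be grouped first"),
    }
}
```

In this sketch, units with no unmet dependencies are emitted first, so a unit's callees would already have Rust counterparts by the time it is handed to the LLM. A cyclic dependency group is reported rather than resolved; how such groups should be handled is left open here.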

Discipline

Artificial Intelligence and Robotics

Research Areas

Intelligent Systems and Optimization

Areas of Excellence

Digital transformation

Publication

Engineering of Complex Computer Systems: 29th International Conference, ICECCS 2025, Hangzhou, China, July 2-4

First Page

283

Last Page

302

Identifier

10.1007/978-3-032-00828-2_16

Publisher

ACM

City or Country

New York
