OptSE: Towards optimal symbolic execution

Publication Type

Journal Article

Publication Date

7-2025

Abstract

Symbolic execution is a powerful technique that can accurately synthesize program inputs for program testing. However, the scalability of symbolic execution is often limited by the capability of the constraint solver and time for testing. With limited time budget, it is desirable to optimally select paths for symbolic execution and furthermore variables for symbolization in order to achieve the maximum code coverage. In this work, we make two technical contributions towards solving this problem. First, different from most existing solving strategies based on heuristic path selection, we formally define the ‘optimal’ strategy based on the reward of executing a given program path considering both possible code coverage and the cost of constraint solving. We further prove that the problem of identifying the optimal strategy for symbolic execution can be reduced to a classic knapsack problem, whose decision problem form is NP-complete. Second, in view of the complexity in identifying the optimal strategy, we design a practical greedy algorithm, named OptSE, for approximating the optimal strategy. We implemented OptSE in KLEE and extensively evaluate it on a diverse set of programs. The results show that OptSE is effective, i.e., achieving 12% more code coverage and detects 17% more security violations than the state-of-the-art symbolic execution tool and outperforming a collection of strategies that only consider either path selection, solving strategies or simply superimpose them.

Keywords

Testing, Costs, Heuristic Algorithms, Engines, Training, Security, Codes, Greedy Algorithms, Complexity Theory, Classification Algorithms, Symbolic Execution, SMT Solving, Deep Learning, Heuristic, Optimal Selection, Time Budget, Resolution Strategies, Path Selection, Constraint Satisfaction Problem, Knapsack Problem, Deep Learning, Use Of Resources, Estimated Values, Random Selection, Intermediate State, Transition Probabilities, Random Values, End Of The Program, Caching, Input Generation, Execution Efficiency, Program Coverage, End Of Path, Real World Programs, Partial Path, Control Flow Graph, Performance Of Solver, Execution Path, Concrete Values, Exploration Path

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

IEEE Transactions on Software Engineering

Volume

51

Issue

7

First Page

1934

Last Page

1949

ISSN

0098-5589

Identifier

10.1109/TSE.2025.3564666

Publisher

Institute of Electrical and Electronics Engineers

Additional URL

https://doi.org/10.1109/TSE.2025.3564666

This document is currently not available here.

Share

COinS