Integrated Software Fingerprinting via Neural-Network-Based Control Flow Obfuscation
Dynamic software fingerprinting has been an important tool in fighting software theft and piracy by embedding unique fingerprints into software copies. However, existing work directly adopts methods from dynamic software watermarking, in which the secret marks reside in largely independent code modules attached to the software. This results in an intrinsic weakness against targeted collusive attacks, since the differences among software copies correspond directly to the fingerprint-related components. In this paper, we propose a novel mode of dynamic fingerprinting called integrated fingerprinting, whose goal is to ensure that all fingerprinted software copies possess identical behaviors at the semantic level. We then provide the first implementation of integrated fingerprinting, called Neuroprint, on top of a control flow obfuscator that replaces a program’s conditional structures with neural networks trained to simulate their branching behaviors. Leveraging the rich entropy in the outputs of these neural networks, Neuroprint embeds software fingerprints such that a one-time construction of the networks serves the purposes of both obfuscation and fingerprinting. Evaluations show that, due to the incomprehensibility of neural networks, it is infeasible to de-obfuscate software transformed by Neuroprint or to attack the fingerprint, even using the latest program analysis techniques. Revealing information about the hidden fingerprints via collusive attacks on Neuroprint is difficult as well. Finally, Neuroprint incurs negligible runtime overhead.
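
To make the core idea concrete, the following toy sketch shows how a single conditional (`x > threshold`) could be replaced by a small network whose first output decides the branch while its remaining outputs carry per-copy fingerprint bits. It is an illustration under strong assumptions and not Neuroprint's actual construction: the weights are set by hand rather than trained, the fingerprint is encoded naively in output signs with no attempt at concealment, and all names (`build_network`, `forward`, `branch_taken`, `extract_fingerprint`) are hypothetical.

```python
from typing import List, Tuple

# Hypothetical network layout: (hidden weights, hidden biases,
# output weight rows, output biases) for one scalar input.
Network = Tuple[List[float], List[float], List[List[float]], List[float]]


def relu(v: float) -> float:
    return v if v > 0.0 else 0.0


def build_network(threshold: float, fingerprint_bits: List[int]) -> Network:
    """Hand-construct a 1-input, 2-hidden-unit, (1 + n)-output ReLU network."""
    # Hidden layer: h0 = relu(x - t), h1 = relu(t - x).
    w1 = [1.0, -1.0]
    b1 = [-threshold, threshold]
    # Output 0 reproduces the branch condition: h0 - h1 = x - t.
    w2 = [[1.0, -1.0]]
    b2 = [0.0]
    # Each further output equals sign * (h0 + h1 + 1), so its sign is
    # fixed by the fingerprint bit for every possible input x.
    for bit in fingerprint_bits:
        sign = 1.0 if bit else -1.0
        w2.append([sign, sign])
        b2.append(sign)
    return w1, b1, w2, b2


def forward(net: Network, x: float) -> List[float]:
    w1, b1, w2, b2 = net
    hidden = [relu(w * x + b) for w, b in zip(w1, b1)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w2, b2)]


def branch_taken(net: Network, x: float) -> bool:
    """Stand-in for the original `if x > threshold:` test."""
    return forward(net, x)[0] > 0.0


def extract_fingerprint(net: Network, x: float) -> List[int]:
    """Read the fingerprint bits off the signs of the extra outputs."""
    return [1 if v > 0.0 else 0 for v in forward(net, x)[1:]]


# Two copies of the same program with different fingerprints.
copy_a = build_network(5.0, [1, 0, 1, 1])
copy_b = build_network(5.0, [0, 1, 1, 0])
```

Both copies take the same branch for every input, so their observable behavior is identical, while the same network evaluation that drives the branch also yields each copy's distinct bits. A trained network with less transparent weights, as the abstract describes, would make neither the condition nor the fingerprint this easy to read off.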