Publication Type

Conference Proceeding Article

Version

submittedVersion

Publication Date

6-2011

Abstract

It is often very expensive and practically infeasible to generate test cases that can exercise all possible states of a program. This is especially true for a medium or large industrial system. In practice, industrial clients of the system often have a set of input data collected either before the system is built or after the deployment of a previous version of the system. Such data are highly valuable as they represent the operations that matter in a client's daily business and may be used to extensively test the system. However, such data often carry sensitive information and cannot be released to third-party development houses. For example, a healthcare provider may have a set of patient records that are strictly confidential and cannot be used by any third party. Simply masking sensitive values alone may not be sufficient, as the correlation among fields in the data can reveal the masked information. Also, masked data may exhibit different behavior in the system and become less useful than the original data for testing and debugging. For the purpose of releasing private data for testing and debugging, this paper proposes the kb-anonymity model, which combines the k-anonymity model commonly used in the data mining and database areas with the concept of program behavior preservation. Like k-anonymity, kb-anonymity replaces some information in the original data to ensure privacy preservation so that the replaced data can be released to third-party developers. Unlike k-anonymity, kb-anonymity ensures that the replaced data exhibit the same kind of program behavior exhibited by the original data so that the replaced data may still be useful for the purposes of testing and debugging. We also provide a concrete version of the model under three particular configurations and have successfully applied our prototype implementation to three open source programs, demonstrating the utility and scalability of our prototype.
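To make the idea in the abstract concrete, the following is a minimal Python sketch, not the paper's implementation. It assumes that "program behavior" means the sequence of branch decisions a record triggers, that a group of records is releasable only when at least k originals share that behavior, and that each such group is represented by one replacement record. The names `program_path`, `release`, and `toy_replacement` are hypothetical, and `toy_replacement` stands in for whatever mechanism (for example, a constraint solver driven by symbolic execution) would produce a replacement value in practice.

```python
from collections import defaultdict


def program_path(record):
    """Hypothetical program under test: return the branch decisions taken."""
    age, income = record
    return (age >= 65, income > 50_000)


def group_by_path(records):
    """Group original records by the path they drive the program down."""
    groups = defaultdict(list)
    for record in records:
        groups[program_path(record)].append(record)
    return dict(groups)


def release(records, k, make_replacement):
    """Emit one replacement per path that at least k originals follow.

    Paths followed by fewer than k originals are dropped (an assumption of
    this sketch), so every released record stands for k or more originals.
    """
    released = []
    for path, group in group_by_path(records).items():
        if len(group) < k:
            continue
        candidate = make_replacement(path)
        # Behavior preservation: the replacement must follow the same path.
        if program_path(candidate) != path:
            raise ValueError("replacement does not preserve program behavior")
        released.append(candidate)
    return released


def toy_replacement(path):
    """Hypothetical stand-in for a solver: pick fixed values per branch."""
    senior, high_income = path
    return (70 if senior else 30, 80_000 if high_income else 20_000)


originals = [(66, 60_000), (80, 90_000), (25, 10_000)]
anonymized = release(originals, k=2, make_replacement=toy_replacement)
```

With k=2, only the path shared by the first two records is released, and its replacement record takes the same branches as the originals it represents.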

Keywords

k-anonymity, symbolic execution, third-party testing and debugging, behavior preservation

Discipline

Software Engineering

Research Areas

Software and Cyber-Physical Systems

Publication

PLDI '11: Proceedings of the 2011 ACM Conference on Programming Language Design and Implementation, San Jose, CA, June 4-8, 2011

First Page

447

Last Page

457

ISBN

9781450306638

Identifier

10.1145/1993316.1993551

Publisher

ACM

City or Country

New York

Copyright Owner and License

Authors

Additional URL

http://doi.org/10.1145/1993316.1993551
