Efficient white-box fairness testing through gradient search

Lingfeng ZHANG
Yueling ZHANG, Singapore Management University
Min ZHANG

Abstract

Deep learning (DL) systems are increasingly deployed for autonomous decision-making in a wide range of applications. Apart from the robustness and safety, fairness is also an important property that a well-designed DL system should have. To evaluate and improve individual fairness of a model, systematic test case generation for identifying individual discriminatory instances in the input space is essential. In this paper, we propose a framework EIDIG for efficiently discovering individual fairness violation. Our technique combines a global generation phase for rapidly generating a set of diverse discriminatory seeds with a local generation phase for generating as many individual discriminatory instances as possible around these seeds under the guidance of the gradient of the model output. In each phase, prior information at successive iterations is fully exploited to accelerate convergence of iterative optimization or reduce frequency of gradient calculation. Our experimental results show that, on average, our approach EIDIG generates 19.11% more individual discriminatory instances with a speedup of 121.49% when compared with the state-of-the-art method and mitigates individual discrimination by 80.03% with a limited accuracy loss after retraining.