Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
3-2023
Abstract
With the advancement of sentiment analysis (SA) models and their incorporation into our daily lives, fairness testing of these models is crucial, since unfair decisions can discriminate against a large population. Nevertheless, fairness testing faces challenges such as the unknown test oracle, the difficulty of generating suitable test inputs, and the lack of a reliable way to fix uncovered issues. To fill these gaps, BiasRV, a tool based on metamorphic testing (MT), was introduced and succeeded in uncovering fairness issues in a transformer-based model. However, the extent of unfairness in other SA models has not been thoroughly investigated. Our work conducts a more comprehensive empirical study to reveal the extent of fairness violations, specifically gender fairness violations, exhibited by other popular word embedding-based SA models. We define a fairness violation as an SA model predicting different sentiments for variants of a text that differ only in gender classes. Our inspection using BiasRV uncovers at least 30 fairness violations (at BiasRV's default threshold) in all three SA models. Realizing the importance of addressing such significant violations, we introduce adversarial patches (AP) as a patch-generation approach in an automated program repair (APR) system to fix them. We adopt adversarial fine-tuning in AP by retraining SA models on adversarial examples, which are bias-uncovering test cases dynamically generated at runtime by a tool named BiasFinder. Evaluation of the SA models shows that our proposed AP reduces fairness violations by at least 25%.
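The sketch below illustrates the metamorphic gender-fairness check and the adversarial fine-tuning idea described in the abstract. It is a minimal, hypothetical Python illustration: the gender lexicon, the swap rule, and the model interface (predict/fit) are assumptions made for exposition, not the authors' actual BiasRV or BiasFinder implementations.

# Minimal sketch, assuming a sentiment model object exposing
# predict(text) -> label and fit(texts, labels). All names below are
# illustrative, not the BiasRV/BiasFinder code.

# Toy lexicon of gendered terms and their counterparts; a real mutation
# tool would need part-of-speech disambiguation (e.g. for "her").
GENDER_SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "her": "him",
    "man": "woman", "woman": "man",
    "men": "women", "women": "men",
}

def swap_gender(text):
    """Create a variant of `text` that differs only in gender classes."""
    return " ".join(GENDER_SWAPS.get(tok, tok) for tok in text.lower().split())

def is_fairness_violation(model, text):
    """Flag a violation: the model predicts different sentiments for a
    text and its gender-swapped variant."""
    return model.predict(text) != model.predict(swap_gender(text))

def adversarial_fine_tune(model, texts, labels):
    """Adversarial patch (AP) sketch: retrain the model on bias-uncovering
    variants, each paired with its original text's ground-truth label."""
    adv_texts = [swap_gender(t) for t in texts]
    model.fit(texts + adv_texts, labels + labels)
    return model

In the paper's setting, the bias-uncovering test cases come from BiasFinder dynamically at runtime rather than from a fixed lexicon, and the patched models are then re-evaluated for remaining violations.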
Keywords
Fairness testing, Automated program repair, Sentiment analysis
Discipline
Software Engineering
Research Areas
Data Science and Engineering
Publication
Proceedings of the 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2023, Macao, March 21-24
First Page
651
ISBN
9781665452786
Identifier
10.1109/SANER56733.2023.00066
Publisher
Institute of Electrical and Electronics Engineers
City or Country
New Jersey, USA
Citation
KHOO, Lin Sze; BAY, Jia Qi; YAP, Ming Lee Kimberly; LIM, Mei Kuan; CHONG, Chun Yong; YANG, Zhou; and LO, David.
Exploring and repairing gender fairness violations in word embedding-based sentiment analysis model through adversarial patches. (2023). Proceedings of the 2023 IEEE International Conference on Software Analysis, Evolution and Reengineering, SANER 2023, Macao, March 21-24. 651.
Available at: https://ink.library.smu.edu.sg/sis_research/8514
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.