Publication Type
Journal Article
Version
acceptedVersion
Publication Date
12-2022
Abstract
Artificial intelligence systems, such as Sentiment Analysis (SA) systems, typically learn from large amounts of data that may reflect human bias. Consequently, such systems may exhibit unintended demographic bias against specific characteristics (e.g., gender, occupation, or country of origin). Such bias manifests in an SA system when it predicts different sentiments for similar texts that differ only in the characteristics of the individuals described. To automatically uncover bias in SA systems, this paper presents BiasFinder, an approach that discovers biased predictions in SA systems via metamorphic testing. A key feature of BiasFinder is the automatic curation of suitable templates from any given text inputs, using various Natural Language Processing (NLP) techniques to identify words that describe demographic characteristics. Next, BiasFinder generates new texts from these templates by mutating words associated with a class of a characteristic (e.g., gender-specific words such as female names, “she”, “her”). These texts are then used to tease out bias in an SA system. BiasFinder identifies a bias-uncovering test case (BTC) when an SA system predicts different sentiments for texts that differ only in words associated with a different class (e.g., male vs. female) of a target characteristic (e.g., gender). We evaluate BiasFinder on 10 SA systems and two large-scale datasets, and the results show that BiasFinder can create more BTCs than two popular baselines. We also conduct an annotation study and find that human annotators consistently judge the test cases generated by BiasFinder to be more fluent than those of the two baselines.
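The metamorphic relation described in the abstract can be illustrated with a minimal Python sketch. It is not BiasFinder's implementation: the template, the `GENDER_FILLERS` word lists, the helper names `generate_mutants` and `find_btcs`, and both toy classifiers are hypothetical, and the template is written by hand here, whereas BiasFinder curates templates automatically from input texts. The sketch only shows the core check: fill one template with words from two classes of a characteristic and flag a BTC when the predicted sentiments differ.

```python
from itertools import product
from typing import Callable, Dict, List, Tuple

# Hypothetical gender-specific fillers (assumption for illustration only).
GENDER_FILLERS: Dict[str, List[Dict[str, str]]] = {
    "male": [{"name": "John", "pronoun": "he", "possessive": "his"}],
    "female": [{"name": "Mary", "pronoun": "she", "possessive": "her"}],
}

# Hand-written template; placeholders mark the gender-specific words.
TEMPLATE = "{name} said {pronoun} loved the film, and {possessive} review was great."


def generate_mutants(template: str) -> Dict[str, List[str]]:
    """Fill the template once per filler set, grouped by class."""
    return {
        cls: [template.format(**filler) for filler in fillers]
        for cls, fillers in GENDER_FILLERS.items()
    }


def find_btcs(
    template: str, predict: Callable[[str], int]
) -> List[Tuple[str, str]]:
    """Return (male_text, female_text) pairs whose predictions differ."""
    mutants = generate_mutants(template)
    return [
        (male_text, female_text)
        for male_text, female_text in product(mutants["male"], mutants["female"])
        if predict(male_text) != predict(female_text)
    ]


def _words(text: str) -> List[str]:
    """Lowercase the text and split it into words, ignoring commas and periods."""
    return text.lower().replace(",", " ").replace(".", " ").split()


def toy_fair_predict(text: str) -> int:
    """Toy stand-in for an SA system: 1 = positive, 0 = negative."""
    return 1 if "great" in _words(text) else 0


def toy_biased_predict(text: str) -> int:
    """Deliberately biased toy classifier: always negative when 'she' appears."""
    words = _words(text)
    if "she" in words:
        return 0
    return 1 if "great" in words else 0


if __name__ == "__main__":
    for name, model in [("fair", toy_fair_predict), ("biased", toy_biased_predict)]:
        print(name, find_btcs(TEMPLATE, model))
```

In practice, `predict` would wrap a real SA system, and each class would carry many fillers so that every template yields many candidate pairs.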
Keywords
sentiment analysis, test case generation, metamorphic testing, bias, fairness bug
Discipline
Artificial Intelligence and Robotics | Software Engineering
Research Areas
Intelligent Systems and Optimization
Publication
IEEE Transactions on Software Engineering
Volume
48
Issue
12
First Page
5087
Last Page
5101
ISSN
0098-5589
Identifier
10.1109/TSE.2021.3136169
Publisher
Institute of Electrical and Electronics Engineers
Citation
ASYROFI, Muhammad Hilmi; YANG, Zhou; IMAM NUR BANI YUSUF; KANG, Hong Jin; THUNG, Ferdian; and LO, David.
BiasFinder: Metamorphic test generation to uncover bias for sentiment analysis systems. (2022). IEEE Transactions on Software Engineering. 48, (12), 5087-5101.
Available at: https://ink.library.smu.edu.sg/sis_research/7611
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/TSE.2021.3136169