Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
5-2023
Abstract
Implicit gender bias in software development, such as the association of technical roles with men, is a well-documented issue. To address this bias, it is important to understand it in more detail. This study uses data mining techniques to investigate the extent to which 56 tasks related to software development, such as assigning GitHub issues and testing, are affected by implicit gender bias embedded in large language models. We systematically translated each task from English into a genderless language and back, and investigated the pronouns associated with each task. Based on translating each task 100 times in different permutations, we identify a significant disparity in the gendered pronoun associations with different tasks. Specifically, requirements elicitation was associated with the pronoun “he” in only 6% of cases, while testing was associated with “he” in 100% of cases. Additionally, tasks related to helping others had a 91% association with “he”, while the same association for tasks related to asking coworkers was only 52%. These findings reveal a clear pattern of gender bias related to software development tasks and have important implications for addressing this issue both in the training of large language models and in broader society.
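The counting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names, the `round_trip` callable (standing in for whatever model translates a sentence into a genderless language and back), and the shape of the prompt list are all assumptions made for illustration. The sketch only shows how the share of each pronoun might be tallied once back-translated sentences are available.

```python
import re
from collections import Counter
from typing import Callable, Dict, Iterable, Optional

# Matches the first standalone English subject pronoun of interest.
PRONOUN_RE = re.compile(r"\b(he|she|they)\b", re.IGNORECASE)


def first_pronoun(sentence: str) -> Optional[str]:
    """Return the first of he/she/they found in the sentence, lowercased."""
    match = PRONOUN_RE.search(sentence)
    return match.group(1).lower() if match else None


def pronoun_shares(
    prompts: Iterable[str],
    round_trip: Callable[[str], str],
) -> Dict[str, float]:
    """Tally pronoun shares over back-translated prompts.

    `round_trip` is a hypothetical callable that translates a prompt
    into a genderless language and back into English.
    """
    counts: Counter = Counter()
    for prompt in prompts:
        back_translated = round_trip(prompt)
        # Sentences without a recognised pronoun are counted as "none".
        counts[first_pronoun(back_translated) or "none"] += 1
    total = sum(counts.values())
    return {p: n / total for p, n in counts.items()} if total else {}
```

With a real translation backend supplied as `round_trip`, calling `pronoun_shares` once per task over that task's prompt variants would yield per-task percentages comparable in form to those quoted in the abstract.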
Keywords
gender bias, large language models, software engineering
Discipline
Programming Languages and Compilers | Software Engineering
Research Areas
Software and Cyber-Physical Systems
Publication
Proceedings of the 20th IEEE/ACM International Conference on Mining Software Repositories, Melbourne, Australia 2023 May 15-16
First Page
624
Last Page
629
ISBN
9798350311846
Identifier
10.1109/MSR59073.2023.00088
Publisher
IEEE
City or Country
Melbourne, Australia
Citation
TREUDE, Christoph and HATA, Hideaki.
She elicits requirements and he tests: Software engineering gender bias in large language models. (2023). Proceedings of the 20th IEEE/ACM International Conference on Mining Software Repositories, Melbourne, Australia 2023 May 15-16. 624-629.
Available at: https://ink.library.smu.edu.sg/sis_research/8865
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
Additional URL
https://doi.org/10.1109/MSR59073.2023.00088