Publication Type
PhD Dissertation
Version
publishedVersion
Publication Date
7-2024
Abstract
Despite anti-discrimination regulations mandating the provision of audio descriptions (ADs), the majority of online video content remains inaccessible to blind and low-vision (BLV) individuals. This is because ADs are either absent or fail to adequately address the diverse and unique needs of the audience. Traditionally, content creators have relied on professionals to author ADs. However, this gold standard may be out of reach for some content creators because it remains costly and has a long turnaround time. Moreover, when ADs are available, they tend to be static and unalterable, failing to cater to the unique preferences of BLV individuals and leading to a less personalized viewing experience. My dissertation addresses these challenges by designing, developing, and evaluating cost-effective and efficient methods for producing high-quality ADs. Furthermore, I investigate BLV individuals’ desire for ADs customization, and its impact, by developing an ADs customization interface. Through the design, development, and evaluation of these tools, my dissertation contributes to accessibility, human-computer interaction (HCI), and computer science, making the video content experience more inclusive and personalized for BLV individuals.
My dissertation consists of three research threads. First, I developed ViScene, a web-based tool that enables collaborative ADs authoring, pairing sighted novices with either sighted or blind reviewers. A mixed-method study showed that novices, with reviews, can create descriptive, objective, and clear ADs at a significantly lower cost than professional services. The second research thread advanced the fully manual authoring approach by incorporating automatic feedback mechanisms into the ADs authoring process, leveraging video scene recognition and natural language processing to enhance the quality of novice-authored scene descriptions (SD) without compromising on descriptiveness or clarity. This approach significantly reduced production costs and demonstrated the potential of automated systems in supporting ADs authoring. Lastly, recognizing the diverse needs of BLV individuals, the third research thread explored the customization of ADs, introducing CustomAD, a prototype that allowed users to adjust various ADs properties such as length, information emphasis, speed, voice, tone, gender, and syntax. Through a mixed-method user study, my dissertation uncovered the desire for customization and demonstrated that customization notably enhances video understanding, immersion, and information navigation. These results highlight the value of personalizing ADs.
This dissertation makes contributions across HCI, computer science, and accessibility by: 1) creating a quality assessment codebook for ADs review and evaluation; 2) developing ViScene for collaborative SD authoring; 3) providing empirical insights into mixed-ability collaboration; 4) offering design recommendations for future SD co-authoring interfaces; 5) introducing a human-machine collaboration interface for ADs authoring with real-time automated feedback; 6) evaluating the semi-automated SD authoring method; 7) outlining design implications for future ADs tools; 8) investigating BLV individuals’ customization needs for ADs; 9) designing CustomAD for ADs personalization; and 10) demonstrating the significant benefits of ADs customization, including enhanced video comprehension and immersion. Collectively, the contributions of this dissertation offer a new research direction for increasing the availability of ADs and supporting BLV individuals in having a more personalized experience with video content. This dissertation also provides a step forward in making videos more inclusive, offering practical solutions and design recommendations for future research and development of technologies in video accessibility.
Keywords
Accessibility, Blind and Low-vision Individuals, Individuals with Disabilities and Assistive Technologies, Video Accessibility, Audio Description, Customization, AI-supported Writing
Degree Awarded
PhD in Computer Science
Discipline
Graphics and Human Computer Interfaces
Supervisor(s)
HARA, Kotaro
First Page
1
Last Page
280
Publisher
Singapore Management University
City or Country
Singapore
Citation
NATALIE, Rosiana.
Creating and delivering audio descriptions for videos. (2024). 1-280.
Available at: https://ink.library.smu.edu.sg/etd_coll/623
Copyright Owner and License
Author