Publication Type
Conference Proceeding Article
Version
acceptedVersion
Publication Date
11-2023
Abstract
Social media plays a significant role in boosting the fashion industry, with a massive number of fashion-related posts generated every day. To obtain the rich fashion information contained in these posts, we study the task of social media fashion knowledge extraction. Fashion knowledge, which typically consists of occasion, person attributes, and fashion item information, can be effectively represented as a set of tuples. Most previous studies on fashion knowledge extraction are based on fashion product images and do not consider the rich text information in social media posts. Existing work on fashion knowledge extraction from social media is classification-based and requires a set of fashion knowledge categories to be manually determined in advance. In this work, we propose to cast the task as a captioning problem to capture the interplay of the multimodal post information. Specifically, we transform the fashion knowledge tuples into a natural language caption with a sentence transformation method. Our framework then generates the sentence-based fashion knowledge directly from the social media post. Inspired by the great success of pre-trained models, we build our model on a multimodal pre-trained generative model and design several auxiliary tasks to enhance knowledge extraction. Since no existing dataset can be directly used for our task, we introduce a dataset of social media posts with manual fashion knowledge annotations. Extensive experiments demonstrate the effectiveness of our model.
Keywords
Fashion industry, Fashion knowledge extraction, Knowledge extraction, Multi-modal, Multi-modal data, Multimodal data mining, Product images, Rich texts, Social media, Social media analysis
Discipline
Databases and Information Systems
Research Areas
Data Science and Engineering
Areas of Excellence
Digital transformation
Publication
SIGIR-AP '23: Proceedings of the 11th International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, Beijing, November 23-28
First Page
52
Last Page
62
ISBN
9798400704086
Identifier
10.1145/3624918.3625315
Publisher
ACM
City or Country
New York
Citation
YUAN, Yifei; ZHANG, Wenxuan; DENG, Yang; and LAM, Wai.
Multimodal fashion knowledge extraction as captioning. (2023). SIGIR-AP '23: Proceedings of the 11th International ACM SIGIR Conference on Research and Development in Information Retrieval in the Asia Pacific Region, Beijing, November 23-28. 52-62.
Available at: https://ink.library.smu.edu.sg/sis_research/9170
Copyright Owner and License
Authors
Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.
Additional URL
https://doi.org/10.1145/3624918.3625315