Publication Type

Conference Proceeding Article

Version

acceptedVersion

Publication Date

9-2020

Abstract

Authorship attribution (AA), the task of identifying the author of a given text, is an important and widely studied research topic with many applications. Recent work has shown that deep learning methods can achieve significant accuracy improvements on the AA task. Nevertheless, most of these methods represent user posts using a single type of feature (e.g., word bi-grams) and adopt a text classification approach to address the task. Furthermore, these methods offer very limited explainability of the AA results. In this paper, we address these limitations by proposing DeepStyle, a novel embedding-based framework that learns the representations of users’ salient writing styles. We conduct extensive experiments on two real-world datasets from Twitter and Weibo. Our experimental results show that DeepStyle outperforms the state-of-the-art baselines on the AA task.

Keywords

Authorship attribution, Style embedding, Triplet loss

Discipline

Databases and Information Systems

Research Areas

Data Science and Engineering

Publication

Web and Big Data: 4th International Joint Conference, APWeb-WAIM 2020, Tianjin, China, September 18-20: Proceedings

Volume

12318

First Page

221

Last Page

229

ISBN

9783030602895

Identifier

10.1007/978-3-030-60290-1_17

Publisher

Springer

City or Country

Cham

Embargo Period

7-4-2021

Copyright Owner and License

Authors

505687_1_En_17_MOESM1_ESM.pdf (273 kB)
Supplementary Figure

Additional URL

https://doi.org/10.1007/978-3-030-60290-1_17
