Research Collection School Of Computing and Information Systems

Mining coherent anomaly collections on web data

Hanbo DAI, Singapore Management University
Feida ZHU, Singapore Management University
Ee-peng LIM, Singapore Management University
Hwee Hwa PANG, Singapore Management University

Publication Type

Conference Proceeding Article

Version

publishedVersion

Publication Date

11-2012

Abstract

The recent boom of weblogs and social media has attached increasing importance to the identification of suspicious users with unusual behavior, such as spammers or fraudulent reviewers. A typical spamming strategy is to employ multiple dummy accounts to collectively promote a target, be it a URL or a product. Consequently, these suspicious accounts exhibit certain coherent anomalous behavior identifiable as a collection. In this paper, we propose the concept of Coherent Anomaly Collection (CAC) to capture such collections, and put forward an efficient algorithm to simultaneously find the top-K disjoint CACs together with their anomalous behavior patterns. Compared with existing approaches, our new algorithm can find disjoint anomaly collections with coherent extreme behavior without having to specify either their number or sizes. Results on real Twitter data show that our approach discovers meaningful and informative hashtag spammer groups of various sizes which are hard to detect by clustering-based methods.

Keywords

Anomaly/outlier detection, Anomaly collection/cluster

Discipline

Computer Sciences | Databases and Information Systems

Publication

CIKM'12: Proceedings of the 21st ACM International Conference on Information and Knowledge Management: October 29 - November 2, 2012, Maui, Hawaii

First Page

1557

Last Page

1561

ISBN

9781450311564

Identifier

10.1145/2396761.2398472

Publisher

ACM

City or Country

New York

Citation

DAI, Hanbo; ZHU, Feida; LIM, Ee-peng; and PANG, Hwee Hwa. Mining coherent anomaly collections on web data. (2012). CIKM'12: Proceedings of the 21st ACM International Conference on Information and Knowledge Management: October 29 - November 2, 2012, Maui, Hawaii. 1557-1561.
Available at: https://ink.library.smu.edu.sg/sis_research/2869

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial-No Derivative Works 4.0 International License.

Additional URL

http://doi.org/10.1145/2396761.2398472

Included in

Databases and Information Systems Commons
