Publication Type
Conference Proceeding Article
Version
publishedVersion
Publication Date
7-2016
Abstract
This paper introduces a novel unsupervised outlier detection method, namely Coupled Biased Random Walks (CBRW), for identifying outliers in categorical data with diversified frequency distributions and many noisy features. Existing pattern-based outlier detection methods are ineffective in handling such complex scenarios, as they misfit such data. CBRW estimates outlier scores of feature values by modelling feature value level couplings, which carry intrinsic data characteristics, via biased random walks to handle this complex data. The outlier scores of feature values can either measure the outlierness of an object or facilitate the existing methods as a feature weighting and selection indicator. Substantial experiments show that CBRW can not only detect outliers in complex data significantly better than the state-of-the-art methods, but also greatly improve the performance of existing methods on data sets with many noisy features.
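The abstract describes the approach only at a high level, so the following is a minimal illustrative sketch of the general idea and not the authors' implementation. It treats each feature value as a graph node, biases the walk toward values that are rare relative to the mode of their feature, and links values by how often they co-occur in the same object. The function names (`value_scores`, `object_scores`), the bias formula, the edge weighting, the damping factor, and the unweighted sum used to score objects are all assumptions made for illustration.

```python
from collections import Counter


def value_scores(rows, damping=0.9, iters=100, eps=1e-9):
    """Score every (feature index, value) pair with a biased random walk.

    Illustrative assumptions: the bias and edge weights below are simple
    stand-ins for the couplings mentioned in the abstract, not the paper's
    exact definitions.
    """
    n, d = len(rows), len(rows[0])
    counts = Counter((j, row[j]) for row in rows for j in range(d))
    nodes = list(counts)

    # Frequency of the most common value (the mode) of each feature.
    mode = {}
    for (j, _), c in counts.items():
        mode[j] = max(mode.get(j, 0), c)

    # Node bias (assumption): larger for values far below their feature's
    # mode and for features whose mode is itself infrequent.
    bias = {}
    for (j, v), c in counts.items():
        dev = (mode[j] - c) / mode[j]
        base = 1.0 - mode[j] / n
        bias[(j, v)] = 0.5 * (dev + base) + eps  # eps keeps weights positive

    # Co-occurrence counts between values of different features.
    co = Counter()
    for row in rows:
        vals = [(j, row[j]) for j in range(d)]
        for u in vals:
            for v in vals:
                if u[0] != v[0]:
                    co[(u, v)] += 1

    # Biased transition weights (assumption): bias[v] * P(u | v),
    # normalised so each node's outgoing weights sum to 1.
    trans = {u: {} for u in nodes}
    for (u, v), c in co.items():
        trans[u][v] = bias[v] * c / counts[v]
    for u in nodes:
        total = sum(trans[u].values())
        if total > 0:
            trans[u] = {v: w / total for v, w in trans[u].items()}

    # Damped power iteration towards the stationary distribution.
    size = len(nodes)
    pi = {u: 1.0 / size for u in nodes}
    for _ in range(iters):
        nxt = {u: (1.0 - damping) / size for u in nodes}
        for u in nodes:
            if trans[u]:
                for v, w in trans[u].items():
                    nxt[v] += damping * pi[u] * w
            else:
                # Node without neighbours: spread its mass uniformly.
                for v in nodes:
                    nxt[v] += damping * pi[u] / size
        pi = nxt
    return pi


def object_scores(rows, scores):
    """Score each object as the plain sum of its value scores (assumption)."""
    return [sum(scores[(j, v)] for j, v in enumerate(row)) for row in rows]


if __name__ == "__main__":
    data = [("a", "x")] * 8 + [("a", "y"), ("b", "y")]
    s = value_scores(data)
    for row, score in zip(data, object_scores(data, s)):
        print(row, round(score, 4))
```

In this sketch the value scores form a probability distribution over all feature values, so they can be used either to rank objects, as in `object_scores`, or as per-value weights when filtering noisy features for another detector.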
Discipline
Databases and Information Systems | Data Storage Systems
Research Areas
Data Science and Engineering
Publication
Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, New York, 2016 July 9-15
First Page
1902
Last Page
1908
Identifier
10.5555/3060832.3060887
Publisher
ACM
City or Country
New York
Citation
PANG, Guansong; CAO, Longbing; and CHEN, Ling.
Outlier detection in complex categorical data by modeling the feature value couplings. (2016). Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, New York, 2016 July 9-15. 1902-1908.
Available at: https://ink.library.smu.edu.sg/sis_research/7146
Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.