On Robust Image Spam Filtering via Comprehensive Visual Modeling
Abstract
The Internet has fundamentally changed the way people generate and exchange media information. Over the last decade, unsolicited image messages (image spam) have become one of the most serious problems for Internet service providers (ISPs), business firms, and general end users. In this paper, we report a novel system called RoBoTs (Robust BoosTrap based spam detector) to support accurate and robust image spam filtering. The system is built on multiple visual properties extracted at different levels of granularity, aiming to capture more discriminative content for effective spam image identification. In addition, a resampling-based learning framework is developed to integrate random forest and linear discriminant analysis (LDA) and generate a comprehensive signature of spam images. This enables a more accurate and robust spam classification process with a very limited number of initial training examples. Using three publicly available test collections, the proposed system is empirically compared with state-of-the-art techniques. The results show that it performs significantly better from several perspectives. (C) 2015 Elsevier Ltd. All rights reserved.