Biblio

Filters: Keyword is spammers
2019-02-25
Xu, H., Hu, L., Liu, P., Xiao, Y., Wang, W., Dayal, J., Wang, Q., Tang, Y. 2018. Oases: An Online Scalable Spam Detection System for Social Networks. 2018 IEEE 11th International Conference on Cloud Computing (CLOUD). :98–105.
Web-based social networks enable new community-based opportunities for participants to engage, share their thoughts, and interact with each other. These related activities, such as searching and advertising, are threatened by spammers, content polluters, and malware disseminators. We propose a scalable spam detection system, termed Oases, for uncovering social spam in social networks using an online and scalable approach. The novelty of our design lies in two key components: (1) a decentralized DHT-based tree overlay deployment for harvesting and uncovering deceptive spam from social communities; and (2) a progressive aggregation tree for aggregating the properties of these spam posts for creating new spam classifiers to actively filter out new spam. We design and implement the prototype of Oases and discuss the design considerations of the proposed approach. Our large-scale experiments using real-world Twitter data demonstrate scalability, attractive load-balancing, and graceful efficiency in online spam detection for social networks.
2018-03-19
Shahid, U., Farooqi, S., Ahmad, R., Shafiq, Z., Srinivasan, P., Zaffar, F. 2017. Accurate Detection of Automatically Spun Content via Stylometric Analysis. 2017 IEEE International Conference on Data Mining (ICDM). :425–434.
Spammers use automated content spinning techniques to evade plagiarism detection by search engines. Text spinners help spammers evade plagiarism detectors by automatically restructuring sentences and replacing words or phrases with their synonyms. Prior work on spun content detection relies on knowledge of the dictionary used by the text spinning software. In this work, we propose an approach to detect spun content and its seed without needing the text spinner's dictionary. Our key idea is that text spinners introduce stylometric artifacts that can be leveraged for detecting spun documents. We implement and evaluate our proposed approach on a corpus of spun documents that are generated using a popular text spinning tool. The results show that our approach can not only accurately detect whether a document is spun but also identify its source (or seed) document, all without needing the dictionary used by the text spinner.
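
As a rough illustration of the synonym-replacement spinning described in this abstract, the toy sketch below spins a sentence and then compares a simple dictionary-free feature between the seed and the spun text. Everything in it is an assumption made for illustration: the `SYNONYMS` table, the `FUNCTION_WORDS` set, and the "function-word skeleton" feature are hypothetical and are not the features or the method used in the paper.

```python
import random

# Hypothetical synonym dictionary; real spinners use far larger ones.
SYNONYMS = {
    "quick": ["fast", "rapid"],
    "help": ["assist", "aid"],
    "detect": ["identify", "spot"],
    "documents": ["texts", "files"],
}

# Small illustrative set of function words (an assumption, not the paper's feature set).
FUNCTION_WORDS = {"the", "a", "an", "of", "to", "in", "and", "that", "is", "can", "by"}


def spin(text, rng):
    """Replace each word found in SYNONYMS with a randomly chosen synonym."""
    return " ".join(
        rng.choice(SYNONYMS[w]) if w in SYNONYMS else w for w in text.split()
    )


def function_word_skeleton(text):
    """Return the ordered function words of a text, ignoring content words."""
    return [w for w in text.lower().split() if w in FUNCTION_WORDS]


seed = "the quick tools can help to detect documents that a spinner rewrote"
spun = spin(seed, random.Random(0))
other = "spam is a problem in social networks and search engines"

# The spun text differs from the seed on the surface, but word-for-word
# substitution leaves the function-word skeleton untouched.
print(spun)
print(function_word_skeleton(seed) == function_word_skeleton(spun))
print(function_word_skeleton(seed) == function_word_skeleton(other))
```

The comparison never consults `SYNONYMS`, which is the point of the sketch: a feature that survives spinning can link a spun text to its seed without access to the spinner's dictionary. Real spinners also restructure sentences, so a practical detector would need more robust features than this one.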