Biblio

Filters: Keyword is spam detection
2023-02-17
Svadasu, Grandhi, Adimoolam, M..  2022.  Spam Detection in Social Media using Artificial Neural Network Algorithm and comparing Accuracy with Support Vector Machine Algorithm. 2022 International Conference on Business Analytics for Technology and Security (ICBATS). :1–5.
Aim: To perform spam detection in social media using the Support Vector Machine (SVM) algorithm and compare its accuracy with the Artificial Neural Network (ANN) algorithm. The sample size of the dataset is 5489. The dataset contains both spam and ham messages; 80% of the messages are used for training and 20% for testing. Materials and Methods: Classification was performed by the ANN algorithm (N=10) for spam detection in social media, and the accuracy was compared with the SVM algorithm (N=10), with G power 80% and alpha value 0.05. Results: The accuracy obtained was 98.2% for the ANN algorithm and 96.2% for the SVM algorithm, with a significance value of 0.749. Conclusion: The accuracy of detecting spam using the ANN algorithm appears to be slightly better than that of the SVM algorithm.
Sasikala, V., Mounika, K., Sravya Tulasi, Y., Gayathri, D., Anjani, M..  2022.  Performance evaluation of Spam and Non-Spam E-mail detection using Machine Learning algorithms. 2022 International Conference on Electronics and Renewable Systems (ICEARS). :1359–1365.
All of us are familiar with the importance of social media in facilitating communication. E-mail is one of the safest social media platforms for online communications and information transfer over the internet. As of now, many people rely on email or communications provided by strangers. Because anyone may send an email or a message, spammers have a great opportunity to compose spam messages about our many hobbies, passions, interests, and concerns. Our internet speeds are severely slowed down by spam, which also collects personal information such as phone numbers from our contact lists. There is a lot of work involved in identifying these fraudsters and in identifying spam content. Email spam refers to the practice of sending large numbers of messages via email. The recipient bears the bulk of the cost of spam, so it is practically free advertising. Spam email is a form of commercial advertising for hackers that is financially viable due to the low cost of sending email. Anti-spam filters have become increasingly important as the volume of unwanted bulk e-mail (also called spamming) grows. The proposed model can determine whether a message is spam or not. Machine learning algorithms are discussed in detail and tested on our data sets, with the goal of identifying the one that is most accurate and precise in identifying email spam. Society of machine learning techniques for detecting unsolicited mass email and spam.
Das, Lipsa, Ahuja, Laxmi, Pandey, Adesh.  2022.  Analysis of Twitter Spam Detection Using Machine Learning Approach. 2022 3rd International Conference on Intelligent Engineering and Management (ICIEM). :764–769.
Nowadays there are many online social networks (OSN) which are very popular among Internet users, who use these platforms for finding new connections and sharing their activities and thoughts. Twitter is one such social media platform and is very popular among these users. Surveys say it has more than 310 million monthly users who are very active and post around 500+ million tweets a day, and this attracts spammers and cyber-criminals to misuse the platform for their malicious benefit. Product advertisement, phishing of genuine users, pornography propagation, stealing trending news, and sharing malicious links to lure victims for making money are common examples of spammers' activities. In Aug-2014, Twitter made public that 8.5% of its monthly active users, that is approx. 23+ million users, have automatically contacted its servers for regular updates. Thus, for a spam-free environment on Twitter, it is necessary to detect these spammers and filter them from the legitimate users. In this research paper, the effectiveness and features of Twitter spam detection are discussed, and various methods are summarized with their benefits and limitations. [1]
Xu, Mingming, Zhang, Lu, Zhu, Haiting.  2022.  Finding Collusive Spam in Community Question Answering Platforms: A Pattern and Burstiness Based Method. 2021 Ninth International Conference on Advanced Cloud and Big Data (CBD). :89–94.
Community question answering (CQA) websites have become very popular platforms, attracting numerous participants to share and acquire knowledge and information on the Internet. However, with the rapid growth of crowdsourcing systems, many malicious users organize collusive attacks against CQA platforms to promote a target (product or service) by posting suggestive questions and deceptive answers. These manipulated deceptive contents, aggregating into multiple collusive question and answer (Q&A) spam groups, can fully control the sentiment of a target and distort the decisions of users, which pollutes the CQA environment and makes it less credible. In this paper, we propose a Pattern and Burstiness based Collusive Q&A Spam Detection method (PBCSD) to identify the deceptive questions and answers. Specifically, we intensively study the campaign process of crowdsourcing tasks and summarize the clues in the Q&As’ vocabulary usage level when collusive attacks are launched. Based on the clues, we extract the Q&A groups using frequent pattern mining and further purify them by the burstiness of the posting time of Q&As. By designing several discriminative features at the Q&A group level, multiple machine learning based classifiers can be used to judge the groups as deceptive or ordinary, and the Q&As in deceptive groups are finally identified as collusive Q&A spam. We evaluate the proposed PBCSD method on a real-world dataset collected from Baidu Zhidao, a famous CQA platform in China, and the experimental results demonstrate that PBCSD is effective for collusive Q&A spam detection and outperforms a number of state-of-the-art methods.
Georgieva-Trifonova, Tsvetanka.  2022.  Research on Filtering Feature Selection Methods for E-Mail Spam Detection by Applying K-NN Classifier. 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA). :1–4.
In the present paper, the application of filtering methods to select features when detecting email spam using the K-NN classifier is examined. The experiments include computation of the accuracy and F-measure of the e-mail texts classification with different methods for feature selection, different number of selected features and two ways to find the distance between dataset examples when executing K-NN classifier - Euclidean distance and cosine similarity. The obtained results are summarized and analyzed.
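The two neighbour criteria named in this abstract can be compared with a minimal sketch. It is not the paper's implementation: the function names, the toy feature vectors and the choice of k are assumptions made for illustration, and no feature selection step is shown.

```python
import math
from collections import Counter

def euclidean_distance(a, b):
    """Straight-line distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def knn_predict(train, query, k=3, metric="euclidean"):
    """Majority label among the k training examples closest to query.

    train is a list of (vector, label) pairs. For cosine, "closest" means
    highest similarity, so examples are sorted by negated similarity.
    """
    if metric == "euclidean":
        key = lambda item: euclidean_distance(item[0], query)
    else:
        key = lambda item: -cosine_similarity(item[0], query)
    neighbours = sorted(train, key=key)[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]

# Hypothetical term-count vectors for three selected features.
train = [
    ([3, 0, 1], "spam"),
    ([2, 0, 0], "spam"),
    ([0, 2, 1], "ham"),
    ([0, 3, 0], "ham"),
    ([1, 1, 1], "ham"),
]
print(knn_predict(train, [4, 0, 0], k=3, metric="euclidean"))
print(knn_predict(train, [4, 0, 0], k=3, metric="cosine"))
```

Euclidean distance is sensitive to message length (long messages have large counts), while cosine similarity compares only the direction of the vectors, which is one reason the two can rank neighbours differently.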
Ubale, Ganesh, Gaikwad, Siddharth.  2022.  SMS Spam Detection Using TFIDF and Voting Classifier. 2022 International Mobile and Embedded Technology Conference (MECON). :363–366.
In today’s digital world, mobile SMS (short message service) communication has become a part of almost every human life. Meanwhile, each mobile user suffers from the harassment of spam SMS. These spam SMS constitute a veritable nuisance to mobile subscribers. As hackers or spammers try to intrude into mobile computing devices, SMS support for mobile devices becomes more vulnerable, since an attacker tries to intrude into the system by sending unsolicited messages. An attacker can gain remote access to mobile devices. We propose a novel approach that analyzes message content and finds features using the TF-IDF technique to efficiently detect spam messages and ham messages using different machine learning classifiers. The classifiers used in the proposed work are evaluated with metrics such as accuracy, precision and recall. In our proposed approach, the accuracy rate is increased by using a voting classifier.
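The two ingredients named in this abstract, TF-IDF weighting and voting, can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' pipeline: it uses the plain tf × log(N/df) variant of TF-IDF with whitespace tokenization, and hard (majority) voting over labels that some base classifiers are assumed to have already produced.

```python
import math
from collections import Counter

def tfidf_vectors(documents):
    """Return one {term: weight} dict per document.

    Weight = (term count / document length) * log(N / document frequency).
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    # Number of documents in which each term appears at least once.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors

def majority_vote(predictions):
    """Hard voting: the label predicted by most classifiers wins."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical messages; terms shared by every message get weight 0.
messages = ["win free prize now", "free entry win cash", "are we meeting now"]
vectors = tfidf_vectors(messages)
print(sorted(vectors[0].items(), key=lambda kv: -kv[1]))

# Hypothetical outputs of three base classifiers for one message.
print(majority_vote(["spam", "ham", "spam"]))
```

Terms that occur in few messages ("prize") receive a higher weight than terms spread across many ("free", "now"), which is what makes the representation useful for separating spam vocabulary from everyday vocabulary.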
Belkhouche, Yassine.  2022.  A language processing-free unified spam detection framework using byte histograms and deep learning. 2022 Fourth International Conference on Transdisciplinary AI (TransAI). :83–86.
In this paper, we established a unified deep learning-based spam filtering method. The proposed method uses the message byte-histograms as a unified representation for all message types (text, images, or any other format). A deep convolutional neural network (CNN) is used to extract high-level features from this representation. A fully connected neural network is used to perform the classification using the extracted CNN features. We validate our method using several open-source text-based and image-based spam datasets. We obtained an accuracy higher than 94% on all datasets.
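The byte-histogram representation described here can be illustrated with a short sketch. The normalization to relative frequencies and the function name are assumptions; the abstract does not say how the histogram is scaled, and the CNN that consumes it is not shown.

```python
def byte_histogram(payload):
    """Return a 256-bin histogram of byte values, normalized to sum to 1.

    An empty payload yields an all-zero histogram.
    """
    counts = [0] * 256
    for value in payload:  # iterating over bytes yields ints in 0..255
        counts[value] += 1
    total = len(payload)
    if total == 0:
        return [0.0] * 256
    return [count / total for count in counts]

# The same fixed-length vector is produced for any message type.
text_message = "Win a FREE prize now!".encode("utf-8")
image_like = bytes([0x89, 0x50, 0x4E, 0x47, 0x00, 0x00, 0xFF])  # hypothetical raw bytes
print(len(byte_histogram(text_message)), len(byte_histogram(image_like)))
```

Because the vector always has 256 entries regardless of message length or format, no tokenizer, language model or image decoder is needed before classification.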
Yerima, Suleiman Y., Bashar, Abul.  2022.  Semi-supervised novelty detection with one class SVM for SMS spam detection. 2022 29th International Conference on Systems, Signals and Image Processing (IWSSIP). CFP2255E-ART:1–4.
The volume of SMS messages sent on a daily basis globally has continued to grow significantly over the past years. Hence, mobile phones are becoming increasingly vulnerable to SMS spam messages, thereby exposing users to the risk of fraud and theft of personal data. Filtering of messages to detect and eliminate SMS spam is now a critical functionality for which different types of machine learning approaches are still being explored. In this paper, we propose a system for detecting SMS spam using a semi-supervised novelty detection approach based on one class SVM classifier. The system is built as an anomaly detector that learns only from normal SMS messages thus enabling detection models to be implemented in the absence of labelled SMS spam training examples. We evaluated our proposed system using a benchmark dataset consisting of 747 SMS spam and 4827 non-spam messages. The results show that our proposed method out-performed the traditional supervised machine learning approaches based on binary, frequency or TF-IDF bag-of-words. The overall accuracy was 98% with 100% SMS spam detection rate and only around 3% false positive rate.
ISSN: 2157-8702
[Anonymous].  Submitted.  Spam image detection based on convolutional block attention module.
Digital communication platforms, such as Gmail and Yahoo, have become essential in our professional and personal lives. In addition to their low cost, e-mails are fast. Despite the advantages of these tools, spammers send unsolicited e-mail, known as spam, daily. Recently, image spam, a new type of spam e-mail, has been developed by spammers in order to avoid detection by text-based spam filtering systems. Image spam contains more complex information than text spam. For this reason, the detection of image spam is still a challenging task for researchers. Most of the developed image spam filtering systems are based on hand-crafted features and machine learning techniques, which are time-consuming and less efficient. In addition, these systems do not focus on the important features, which can have an impact on the detection process. In this paper, we apply the convolutional block attention module (CBAM) model in order to address the problem of image spam. The experiments are conducted on the available dataset, called image spam hunter (ISH). The results obtained using the CBAM model are then compared to other existing state-of-the-art methods. The results demonstrate that the convolutional block attention module (CBAM) is efficient for image spam detection.
2020-05-18
Thejaswini, S, Indupriya, C.  2019.  Big Data Security Issues and Natural Language Processing. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :1307–1312.
Whenever we talk about big data, the concern is always about the security of the data. In recent days the most heard about technology is the Natural Language Processing. This new and trending technology helps in solving the ever ending security problems which are not completely solved using big data. Starting with the big data security issues, this paper deals with addressing the topics related to cyber security and information security using the Natural Language Processing technology. Including the well-known cyber-attacks such as phishing identification and spam detection, this paper also addresses issues on information assurance and security such as detection of Advanced Persistent Threat (APT) in DNS and vulnerability analysis. The goal of this paper is to provide the overview of how natural language processing can be used to address cyber security issues.
2020-02-10
Yao, Chuhao, Wang, Jiahong, Kodama, Eiichiro.  2019.  A Spam Review Detection Method by Verifying Consistency among Multiple Review Sites. 2019 IEEE 21st International Conference on High Performance Computing and Communications; IEEE 17th International Conference on Smart City; IEEE 5th International Conference on Data Science and Systems (HPCC/SmartCity/DSS). :2825–2830.

In recent years, websites that incorporate user reviews, such as Amazon, IMDB and YELP, have become exceedingly popular. As an important factor affecting users' purchasing behavior, review information has become increasingly important, and accordingly, the reliability of review information has become an important issue. This paper proposes a method to more accurately detect the appearance period of spam reviews and to identify the spam reviews by verifying the consistency of review information among multiple review sites. Evaluation experiments were conducted to show the accuracy of the detection results and to compare the newly proposed method with our previously proposed method.

Suryawanshi, Shubhangi, Goswami, Anurag, Patil, Pramod.  2019.  Email Spam Detection : An Empirical Comparative Study of Different ML and Ensemble Classifiers. 2019 IEEE 9th International Conference on Advanced Computing (IACC). :69–74.

With recent developments in hardware and software technology, email has become a preferred means of communication. However, unsolicited emails disrupt this communication, so there is a need for detection and classification of spam email. In the present research, email spam detection and classification models are built. We have used different machine learning classifiers such as Naive Bayes, SVM, KNN, Bagging and Boosting (AdaBoost), and ensemble classifiers with a voting mechanism. Evaluation and testing of the classifiers is performed on email spam datasets from the UCI Machine Learning Repository and the Kaggle website. Different measures such as accuracy score, F-measure, recall, precision, support and ROC are used. The preliminary results show that the ensemble classifier with a voting mechanism performs best. It gives the minimum false positive rate and high accuracy.

Shyry, S. Prayla, Charan K, Venkat Sai, Kumar, V. Sudheer.  2019.  Spam Mail Detection and Prevention at Server Side. 2019 Innovations in Power and Advanced Computing Technologies (i-PACT). 1:1–6.

Spam has been a genuine and irritating problem for a long time. Although many solutions have been put forward, much remains to be done to filter spam messages more efficiently. A major issue today in spam filtering, as in text classification in natural language processing, is the huge size of the vector space caused by the large number of feature terms, which usually leads to extensive computation and slow classification. Extracting semantic meanings from the content of texts and using these as feature terms to build the vector space, rather than using words as feature terms in the conventional way, could reduce the dimension of the vectors effectively and improve classification at the same time. Although there are many techniques to block spam messages, most program developers only aim to block spam messages from being delivered to their clients. In this paper, we present an effective approach to prevent spam messages from being transmitted. In this work, a collaborative filtering approach with semantics-based text classification technology is proposed, and the related feature terms are selected from the semantic meanings of the text content.

Shahariar, G. M., Biswas, Swapnil, Omar, Faiza, Shah, Faisal Muhammad, Binte Hassan, Samiha.  2019.  Spam Review Detection Using Deep Learning. 2019 IEEE 10th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). :0027–0033.

A robust and reliable system for detecting spam reviews is a crying need in today's world in order to purchase products from online sites without being cheated. Many online sites offer options for posting reviews, thus creating scope for fake paid reviews or untruthful reviews. These concocted reviews can mislead the general public and leave them perplexed as to whether to believe a review or not. Prominent machine learning techniques have been introduced to solve the problem of spam review detection. The majority of current research has concentrated on supervised learning methods, which require labeled data, an inadequacy when it comes to online reviews. Our focus in this article is to detect any deceptive text reviews. In order to achieve that, we have worked with both labeled and unlabeled data and proposed deep learning methods for spam review detection, which include Multi-Layer Perceptron (MLP), Convolutional Neural Network (CNN) and a variant of Recurrent Neural Network (RNN), namely Long Short-Term Memory (LSTM). We have also applied some traditional machine learning classifiers such as Naive Bayes (NB), K Nearest Neighbor (KNN) and Support Vector Machine (SVM) to detect spam reviews, and finally we have shown the performance comparison for both traditional and deep learning classifiers.

Li, Meng, Wu, Bin, Wang, Yaning.  2019.  Comment Spam Detection via Effective Features Combination. ICC 2019 - 2019 IEEE International Conference on Communications (ICC). :1–6.

Comment spam is one of the great challenges faced by forum administrators. Detecting and blocking comment spam can relieve the load on servers, improve user experience and purify the network conditions. This paper focuses on the detection of comment spam. The behaviors of spammer and the content of spam were analyzed. According to analysis results, two types of effective features are extracted which can make a better description of spammer characteristics. Additionally, a gradient boosting tree algorithm was used to construct the comment spam detector based on the extracted features. Our proposed method is examined on a blog spam dataset which was published by previous research, and the result illustrates that our method performs better than the previous method on detection accuracy. Moreover, the CPU time is recorded to demonstrate that the time spent on both training and testing maintains a small value.

Lekha, J., Maheshwaran, J, Tharani, K, Ram, Prathap K, Surya, Murthy K, Manikandan, A.  2019.  Efficient Detection of Spam Messages Using OBF and CBF Blocking Techniques. 2019 3rd International Conference on Trends in Electronics and Informatics (ICOEI). :1175–1179.

Emails are a fundamental unit of web applications. There is exponential growth in the sending and receiving of emails online. However, spam mail has become a serious issue in the email communication environment. A number of content-based filtering systems are available, namely the content-based filter (CBF), image-based filtering, and many other systems for filtering spam messages. The existing technological solution consists of a combination of the Porter stemmer algorithm (PSA) and k-means clustering, which is adaptive in nature. These procedures are expensive in terms of computation and system resources, as they require examination of the entire spam message and computation over the entire content on the server. These filters are also not robust over time, because the nature of spam mail and spamming changes frequently. We propose an origin-based spam mail filtering system, which works on the header information of the mail message regardless of the body content of the mail. It streamlines system and server performance while increasing precision, recall and accuracy over the existing methods. The aim is to design an effective, efficient and autonomous spam detection system to improve network performance against unknown privileged user attacks.

Krause, Tim, Uetz, Rafael, Kretschmann, Tim.  2019.  Recognizing Email Spam from Meta Data Only. 2019 IEEE Conference on Communications and Network Security (CNS). :178–186.

We propose a new spam detection approach based solely on meta data features gained from email headers. The approach achieves above 99 % classification accuracy on the CSDMC2010 dataset, which matches or surpasses state-of-the-art spam classifiers. We utilize a static set of engineered features, supplemented with automatically extracted features. The approach is just as effective for spam detection in end-to-end encryption, as our feature set remains unchanged for encrypted emails. In contrast to most established spam detectors, we disregard the email body completely and can therefore deliver very high classification speeds, as computationally expensive text preprocessing is not necessary.
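To make the idea of header-only features concrete, the following minimal sketch extracts a few values from the headers of a raw message using Python's standard `email` package and never inspects the body. The four features shown are hypothetical examples chosen for illustration; the abstract does not list the paper's actual feature set.

```python
from email import message_from_string

def header_features(raw_email):
    """Extract a few illustrative features from email headers only.

    The feature names below are hypothetical, not the paper's feature set.
    """
    msg = message_from_string(raw_email)
    received = msg.get_all("Received") or []  # get_all returns None if absent
    subject = msg.get("Subject", "")
    return {
        "n_received": len(received),            # number of relay hops recorded
        "has_message_id": msg.get("Message-ID") is not None,
        "subject_length": len(subject),
        "n_headers": len(msg.keys()),
    }

raw = (
    "Received: from a.example\n"
    "Received: from b.example\n"
    "Subject: Hello\n"
    "Message-ID: <1@example>\n"
    "\n"
    "Body text is never inspected.\n"
)
print(header_features(raw))
```

Since only headers are read, the same features remain available when the body is end-to-end encrypted, which is the property the abstract highlights.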

Ishtiaq, Asra, Islam, Muhammad Arshad, Azhar Iqbal, Muhammad, Aleem, Muhammad, Ahmed, Usman.  2019.  Graph Centrality Based Spam SMS Detection. 2019 16th International Bhurban Conference on Applied Sciences and Technology (IBCAST). :629–633.

The use of short messages, such as SMS, tweets and status updates, has increased tremendously. Due to the popularity and ease of use of SMS, many companies use it for advertisement purposes. Hackers also use SMS to defraud users and steal personal information. In this paper, the use of graph centrality metrics is proposed for spam SMS detection. The graph centrality measures degree, closeness, and eccentricity are used for classification of SMS. Graphs for each class are created using labeled SMS, and then unlabeled SMS are classified using the centrality scores of the tokens present in the unclassified SMS. Our results show that the highest precision and recall are achieved by using degree centrality. Degree centrality achieved the highest precision, i.e. 0.81, and recall, i.e. 0.76, for spam messages.
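The classification scheme outlined in this abstract can be sketched for the degree centrality case. Several details are assumptions, since the abstract does not give them: tokens are split on whitespace, an edge links tokens that are adjacent in a message, and a message is assigned to the class whose graph gives its tokens the highest summed centrality.

```python
from collections import defaultdict

def build_token_graph(messages):
    """Undirected graph: tokens are nodes, adjacent tokens share an edge."""
    graph = defaultdict(set)
    for message in messages:
        tokens = message.lower().split()
        for token in tokens:
            graph[token]  # touching the key registers isolated tokens as nodes
        for left, right in zip(tokens, tokens[1:]):
            if left != right:
                graph[left].add(right)
                graph[right].add(left)
    return graph

def degree_centrality(graph):
    """Degree of each node divided by the maximum possible degree (n - 1)."""
    n = len(graph)
    if n <= 1:
        return {node: 0.0 for node in graph}
    return {node: len(neighbours) / (n - 1) for node, neighbours in graph.items()}

def classify(message, class_scores):
    """Pick the class whose graph gives the message's tokens the highest total centrality."""
    tokens = message.lower().split()
    totals = {
        label: sum(scores.get(token, 0.0) for token in tokens)
        for label, scores in class_scores.items()
    }
    return max(totals, key=totals.get)

# Hypothetical labeled messages, one graph per class.
spam = ["win free cash now", "free cash prize"]
ham = ["see you at lunch", "lunch at noon"]
scores = {
    "spam": degree_centrality(build_token_graph(spam)),
    "ham": degree_centrality(build_token_graph(ham)),
}
print(classify("free cash today", scores))
print(classify("lunch at noon", scores))
```

Closeness or eccentricity could be substituted for `degree_centrality` without changing `classify`, since it only needs a token-to-score mapping per class.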

Eshmawi, Ala', Nair, Suku.  2019.  The Roving Proxy Framewrok for SMS Spam and Phishing Detection. 2019 2nd International Conference on Computer Applications Information Security (ICCAIS). :1–6.

This paper presents the details of the roving proxy framework for SMS spam and SMS phishing (SMishing) detection. The framework aims to protect organizations and enterprises from the danger of SMishing attacks. Feasibility and functionality studies of the framework are presented along with an update process study to define the minimum requirements for the system to adapt with the latest spam and SMishing trends.

Elakkiya, E, Selvakumar, S.  2019.  Initial Weights Optimization Using Enhanced Step Size Firefly Algorithm for Feed Forward Neural Network Applied to Spam Detection. TENCON 2019 - 2019 IEEE Region 10 Conference (TENCON). :942–946.

Spam messages are unsolicited and unnecessary messages which may contain harmful code or links that activate malicious viruses and spyware. The increasing popularity of social networks attracts spammers to perform malicious activities in social networks. An efficient spam detection method is therefore necessary for social networks. In this paper, a spam detection model based on a feed forward neural network with back propagation is proposed. The quality of the learning process is improved by tuning the initial weights of the feed forward neural network using the proposed enhanced step size firefly algorithm, which reduces the time for finding optimal weights during the learning process. The model is applied to a Twitter dataset, and the experimental results show that the proposed model performs well in terms of accuracy and detection rate and has a lower false positive rate.

Dan, Kenya, Kitagawa, Naoya, Sakuraba, Shuji, Yamai, Nariyoshi.  2019.  Spam Domain Detection Method Using Active DNS Data and E-Mail Reception Log. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC). 1:896–899.

E-mail is widespread and an essential communication technology in modern times. Since e-mail has problems with spam mails and spoofed e-mails, countermeasures are required. Although SPF, DKIM and DMARC have been proposed as sender domain authentication, these mechanisms cannot detect non-spoofing spam mails. To overcome this issue, this paper proposes a method to detect spam domains by supervised learning with features extracted from e-mail reception log and active DNS data, such as the result of Sender Authentication, the Sender IP address, the number of each DNS record, and so on. As a result of the experiment, our method can detect spam domains with 88.09% accuracy and 97.11% precision. We confirmed that our method can detect spam domains with detection accuracy 19.40% higher than the previous study by utilizing not only active DNS data but also e-mail reception log in combination.

2019-02-25
Liu, Ninghao, Yang, Hongxia, Hu, Xia.  2018.  Adversarial Detection with Model Interpretation. Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining. :1803–1811.
Machine learning (ML) systems have been increasingly applied in web security applications such as spammer detection, malware detection and fraud detection. These applications have an intrinsic adversarial nature where intelligent attackers can adaptively change their behaviors to avoid being detected by the deployed detectors. Existing efforts against adversaries are usually limited by the type of applied ML models or the specific applications such as image classification. Additionally, the working mechanisms of ML models usually cannot be well understood by users, which in turn impedes them from understanding the vulnerabilities of models or improving their robustness. To bridge the gap, in this paper, we propose to investigate whether model interpretation could potentially help adversarial detection. Specifically, we develop a novel adversary-resistant detection framework by utilizing the interpretation of ML models. The interpretation process explains the mechanism of how the target ML model makes a prediction for a given instance, thus providing more insights for crafting adversarial samples. The robustness of detectors is then improved through adversarial training with the adversarial samples. A data-driven method is also developed to empirically estimate costs of adversaries in feature manipulation. Our approach is model-agnostic and can be applied to various types of classification models. Our experimental results on two real-world datasets demonstrate the effectiveness of interpretation-based attacks and how estimated feature manipulation cost would affect the behavior of adversaries.
Ho, Kenny, Liesaputra, Veronica, Yongchareon, Sira, Mohaghegh, Mahsa.  2018.  Evaluating Social Spammer Detection Systems. Proceedings of the Australasian Computer Science Week Multiconference. :18:1–18:7.
The rising popularity of social network services, such as Twitter, has attracted many spammers and created a large number of fake accounts, overwhelming legitimate users with advertising, malware and unwanted and disruptive information. This not only inconveniences the users' social activities but causes financial loss and privacy issues. Identifying social spammers is challenging because spammers continually change their strategies to fool existing anti-spamming systems. Thus, many researchers have tried to propose new classification systems using various types of features extracted from the content and user's information. However, no comprehensive comparative study has been done to compare the effectiveness and the efficiency of the existing systems. At this stage, it is hard to know what the best anti spamming system is and why. This paper proposes a unified evaluation workbench that allows researchers to access various user and content-based features, implement new features, and evaluate and compare the performance of their systems against existing systems. Through our analysis, we can identify the most effective and efficient social spammer detection features and help develop a faster and more accurate classifier model that has higher true positives and lower false positives.
Gupta, M., Bakliwal, A., Agarwal, S., Mehndiratta, P..  2018.  A Comparative Study of Spam SMS Detection Using Machine Learning Classifiers. 2018 Eleventh International Conference on Contemporary Computing (IC3). :1–7.
With technological advancements and increment in content based advertisement, the use of Short Message Service (SMS) on phones has increased to such a significant level that devices are sometimes flooded with a number of spam SMS. These spam messages can lead to loss of private data as well. There are many content-based machine learning techniques which have proven to be effective in filtering spam emails. Modern day researchers have used some stylistic features of text messages to classify them to be ham or spam. SMS spam detection can be greatly influenced by the presence of known words, phrases, abbreviations and idioms. This paper aims to compare different classifying techniques on different datasets collected from previous research works, and evaluate them on the basis of their accuracies, precision, recall and CAP Curve. The comparison has been performed between traditional machine learning techniques and deep learning methods.
Karamollaoglu, H., Dogru, İ A., Dorterler, M..  2018.  Detection of Spam E-mails with Machine Learning Methods. 2018 Innovations in Intelligent Systems and Applications Conference (ASYU). :1–5.

E-mail is one of today's indispensable means of communication. The widespread use of email has brought about some problems. The most important of these problems is spam (unwanted) e-mails, often composed of advertisements or offensive content, sent without the recipient's request. This study aims to analyze the content of e-mails written in Turkish with the help of two machine learning methods, the Naive Bayes Classifier and the Vector Space Model, to determine whether these e-mails are spam and to classify them. Both methods are subjected to different evaluation criteria and their performances are compared.