Biblio
Reliability analysis of concurrent data based on botnet modeling is conducted in this paper. At present, detection methods for botnets focus mainly on two approaches. The first requires the monitoring of high-privilege systems, which brings certain security risks to the terminal. The second identifies botnets by identifying spam, which is not targeted. By introducing multi-dimensional permutation entropy, calculated from the data communicated between zombies, the complexity of the network traffic time series is described, and the clustering variance method can effectively address the difficulty of detection. This paper is organized around the analysis of the complex structure of the data. The experimental results show acceptable performance.
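To make the notion of permutation entropy in the abstract above concrete, the following is a minimal sketch of the standard single-series (Bandt-Pompe) permutation entropy. It is not the multi-dimensional variant used in the paper; the function name `permutation_entropy`, the parameters `order` and `delay`, and the sample packet counts are assumptions chosen for illustration.

```python
import math
from collections import Counter


def permutation_entropy(series, order=3, delay=1, normalize=True):
    """Permutation entropy of a 1-D sequence, in bits (or in [0, 1] if normalized)."""
    n_windows = len(series) - (order - 1) * delay
    if order < 2 or n_windows < 1:
        raise ValueError("series too short for the requested order/delay")

    patterns = Counter()
    for start in range(n_windows):
        window = [series[start + k * delay] for k in range(order)]
        # Ordinal pattern: the index order that sorts the window.
        # sorted() is stable, so ties are broken by position.
        pattern = tuple(sorted(range(order), key=lambda k: window[k]))
        patterns[pattern] += 1

    # Shannon entropy of the ordinal-pattern distribution.
    entropy = -sum(
        (count / n_windows) * math.log2(count / n_windows)
        for count in patterns.values()
    )
    if normalize:
        # Maximum entropy is log2(order!), reached when all patterns are equally likely.
        entropy /= math.log2(math.factorial(order))
    return entropy


# Hypothetical per-second packet counts for one host (illustrative data only).
packet_counts = [12, 15, 11, 30, 28, 9, 14, 40, 13, 12, 35, 10]
score = permutation_entropy(packet_counts, order=3)
```

A strictly increasing series yields a single ordinal pattern and therefore zero entropy, while irregular traffic spreads over many patterns and scores closer to 1 when normalized.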
The Internet is the most widely used technology in the current era of information technology and is embedded in daily life. Owing to its extensive everyday use, it has many applications, such as social media (Facebook, WhatsApp, Messenger, etc.) and other online applications such as online businesses, e-counseling, website advertising, e-banking, e-hunting websites, e-doctor appointments and e-doctor opinions. These applications make things easy and quickly accessible; however, the technology is vulnerable to various security threats. A vital and severe threat associated with this technology or a particular application is the “phishing attack”, which attackers use to compromise network security. Phishing attacks include fake e-mails, fake websites and fake applications that are used to steal users' credentials or compromise their security. This paper presents a detailed overview of various phishing attacks, specifically their background, and of solutions proposed in the literature to address them using techniques such as anti-phishing, honeypots and firewalls. Moreover, it covers the installation of intrusion detection systems (IDS) and intrusion detection and prevention systems (IPS) in networks to allow only authentic traffic in an operational network. In this work, we conducted an end-user awareness campaign to educate and train employees in order to minimize the probability of these attacks occurring. The results observed for this survey were excellent in terms of effectiveness in addressing the aforementioned issues.
Machine-learning solutions are successfully adopted in multiple contexts but the application of these techniques to the cyber security domain is complex and still immature. Among the many open issues that affect security systems based on machine learning, we concentrate on adversarial attacks that aim to affect the detection and prediction capabilities of machine-learning models. We consider realistic types of poisoning and evasion attacks targeting security solutions devoted to malware, spam and network intrusion detection. We explore the possible damages that an attacker can cause to a cyber detector and present some existing and original defensive techniques in the context of intrusion detection systems. This paper contains several performance evaluations that are based on extensive experiments using large traffic datasets. The results highlight that modern adversarial attacks are highly effective against machine-learning classifiers for cyber detection, and that existing solutions require improvements in several directions. The paper paves the way for more robust machine-learning-based techniques that can be integrated into cyber security platforms.
Spam has been a genuine and irritating problem for a long time. Although many solutions have been proposed, much remains to be done to filter spam messages more efficiently. A major problem today in spam filtering, as in text classification in natural language processing, is the huge size of the vector space caused by the large number of feature terms, which usually leads to extensive computation and slow classification. Extracting semantic meanings from the content of texts and using these as feature terms to build the vector space, instead of using words as feature terms in the conventional way, could reduce the dimension of the vectors effectively and improve the classification at the same time. Although there are many methods for blocking spam messages, most program developers only aim to block spam messages from being delivered to their clients. In this paper, we present an efficient approach to prevent spam messages from being transferred. A collaborative filtering approach with semantics-based text classification technology is proposed, and the related feature terms are selected from the semantic meanings of the text content.
A robust and reliable system for detecting spam reviews is urgently needed in today's world so that products can be purchased from online sites without being cheated. Many online sites offer options for posting reviews, which creates scope for fake paid reviews or untruthful reviews. These concocted reviews can mislead the general public and leave them unsure whether to believe a review or not. Prominent machine learning techniques have been introduced to solve the problem of spam review detection. The majority of current research has concentrated on supervised learning methods, which require labeled data - a limitation when it comes to online reviews. Our focus in this article is to detect deceptive text reviews. To achieve that, we have worked with both labeled and unlabeled data and proposed deep learning methods for spam review detection, which include the Multi-Layer Perceptron (MLP), the Convolutional Neural Network (CNN) and a variant of the Recurrent Neural Network (RNN), namely Long Short-Term Memory (LSTM). We have also applied some traditional machine learning classifiers, such as Naive Bayes (NB), K Nearest Neighbor (KNN) and Support Vector Machine (SVM), to detect spam reviews, and finally we present a performance comparison of traditional and deep learning classifiers.
Spam emails have been a chronic issue in computer security. They are very costly economically and extremely dangerous for computers and networks. Despite the emergence of social networks and other Internet-based information exchange venues, dependence on email communication has increased over the years, and this dependence has resulted in an urgent need to improve spam filters. Although many spam filters have been created to help prevent spam emails from entering a user's inbox, there is a lack of research focusing on text modifications. Currently, Naive Bayes is one of the most popular methods of spam classification because of its simplicity and efficiency. Naive Bayes is also very accurate; however, it is unable to correctly classify emails when they contain leetspeak or diacritics. Thus, in this work, we implemented a novel algorithm for enhancing the accuracy of the Naive Bayes spam filter so that it can detect text modifications and correctly classify the email as spam or ham. Our Python algorithm combines semantic-based, keyword-based, and machine learning algorithms to increase the accuracy of Naive Bayes compared to SpamAssassin by over two hundred percent. Additionally, we have discovered a relationship between the length of the email and the spam score, indicating that Bayesian poisoning, a controversial topic, is actually a real phenomenon and is utilized by spammers.
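The kind of text modification described in the abstract above (leetspeak and diacritics) can be illustrated with a small normalization step applied before tokens reach a Naive Bayes classifier. This is a sketch under stated assumptions, not the authors' algorithm: the substitution table `LEET_MAP` and the function names are hypothetical, and a real filter would need a larger, tuned table.

```python
import unicodedata

# Hypothetical leetspeak substitutions (an assumption for illustration).
# Note: blindly mapping digits will also alter genuine numbers in the text.
LEET_MAP = str.maketrans({
    "0": "o", "1": "i", "3": "e", "4": "a",
    "5": "s", "7": "t", "@": "a", "$": "s",
})


def strip_diacritics(text):
    """Decompose characters and drop combining marks (e.g. 'á' becomes 'a')."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def normalize_token(token):
    """Remove diacritics, lowercase, then undo common leetspeak substitutions."""
    return strip_diacritics(token).lower().translate(LEET_MAP)


def normalize_text(text):
    """Normalize every whitespace-separated token in a message."""
    return " ".join(normalize_token(tok) for tok in text.split())


cleaned = normalize_text("Fr33 c4$h Víàgrâ")
```

After normalization, obfuscated tokens map back to the plain words a word-frequency classifier has seen during training, which is the gap the abstract identifies in plain Naive Bayes.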
The increasing volume of malicious content in social networks requires automated methods to detect and eliminate such content. This paper describes a supervised machine learning classification model that has been built to detect the distribution of malicious content in online social networks (OSNs). Multisource features have been used to detect social network posts that contain malicious Uniform Resource Locators (URLs). These URLs could direct users to websites that contain malicious content, drive-by download attacks, phishing, spam, and scams. For the data collection stage, the Twitter streaming application programming interface (API) was used, and VirusTotal was used for labelling the dataset. A random forest classification model was used with a combination of features derived from a range of sources. The random forest model without any tuning or feature selection produced a recall value of 0.89. After further investigation and applying parameter tuning and feature selection methods, however, we were able to improve the classifier performance to 0.92 in recall.
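The recall figures quoted in the abstract above measure the fraction of truly malicious posts that the classifier flags, i.e. TP / (TP + FN). A minimal sketch of that computation follows; the function name and the toy labels are assumptions for illustration and are unrelated to the paper's dataset.

```python
def recall(y_true, y_pred, positive=1):
    """Recall = true positives / (true positives + false negatives)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp + fn == 0:
        raise ValueError("no positive examples in y_true")
    return tp / (tp + fn)


# Toy labels: 1 = malicious URL, 0 = benign (illustrative only).
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
toy_recall = recall(y_true, y_pred)
```

False positives do not affect recall, so a recall of 0.92 on its own says nothing about how many benign posts were wrongly flagged.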
Spammers use automated content spinning techniques to evade plagiarism detection by search engines. Text spinners help spammers in evading plagiarism detectors by automatically restructuring sentences and replacing words or phrases with their synonyms. Prior work on spun content detection relies on the knowledge about the dictionary used by the text spinning software. In this work, we propose an approach to detect spun content and its seed without needing the text spinner's dictionary. Our key idea is that text spinners introduce stylometric artifacts that can be leveraged for detecting spun documents. We implement and evaluate our proposed approach on a corpus of spun documents that are generated using a popular text spinning software. The results show that our approach can not only accurately detect whether a document is spun but also identify its source (or seed) document - all without needing the dictionary used by the text spinner.
Social media plays an integral part in individuals' everyday lives as well as for companies. It brings numerous benefits to people's lives, such as keeping in touch with close ones, especially relatives who are overseas, making new friends, buying products, sharing information and much more. Unfortunately, several threats also accompany the countless advantages of social media. The rapid growth of online social networking sites provides more scope for criminals and cyber-criminals to carry out their illegal activities. Hackers have found different ways of exploiting these platforms for malicious gain. This research covers some of the common threats on social media, such as spam, malware, Trojan horses, cross-site scripting, industrial espionage, cyber-bullying, cyber-stalking and social engineering attacks. The main purpose of the study is to elaborate on phishing, malware and click-jacking attacks. The motivation for the research is that no dedicated research is available on forensic investigation for Facebook. There is no dedicated forensic investigation methodology, and no forensic tools are available, that can be followed for Facebook. Several tools are available to extract digital data, but they have not been properly tested for Facebook. A forensic investigation tool is used to extract evidence to determine what happened, when, where, and who is responsible. This information is required to ensure that there is sufficient evidence to take legal action against criminals.
Fifty-four percent of the global email traffic in October 2016 was spam and phishing messages. Those emails were commonly sent from compromised email accounts. Previous research has primarily focused on detecting incoming junk mail but not locally generated spam messages. State-of-the-art spam detection methods generally require the content of the email to be able to classify it as either spam or a regular message. This content is not available within encrypted messages or is prohibited due to data privacy. The object of the research presented is to detect an anomaly with the Origin-Destination Delivery Notification method, which is based on the geographical origin and destination as well as the Delivery Status Notification of the remote SMTP server without the knowledge of the email content. The proposed method detects an abused account after a few transferred emails; it is very flexible and can be adjusted for every environment and requirement.
Twitter is one of the most popular microblogging social systems, providing a set of distinctive posting services that operate in real time. The flexibility of these services has attracted unethical individuals, so-called "spammers", who aim to spread malicious, phishing, and misleading information. Unfortunately, the existence of spam results in non-negligible problems related to search and users' privacy. In the battle against spam, various detection methods have been designed, which automate the detection process using the "features" concept combined with machine learning methods. However, the existing features are not effective enough to adapt to spammers' tactics because the features are easy to manipulate. Also, graph features are not suitable for Twitter-based applications, despite the high performance obtainable when applying such features. In this paper, beyond simple statistical features such as the number of hashtags and the number of URLs, we examine the time property by advancing the design of some features used in the literature and proposing new time-based features. The new design of features is divided between robust advanced statistical features that explicitly incorporate the time attribute and behavioral features that identify any posting behavior pattern. The experimental results show that the new form of features is able to correctly classify the majority of spammers, with an accuracy higher than 93% when using the Random Forest learning algorithm, applied to a collected and annotated dataset. The results obtained outperform the accuracy of the state-of-the-art features by about 6%, proving the significance of leveraging time in detecting spam accounts.
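As an illustration of what a time-based feature might look like for the abstract above, the sketch below summarizes the gaps between consecutive posts of one account. These are not the features proposed in the paper; the function name `posting_time_features`, the chosen statistics, and the sample timestamps are assumptions made for this example.

```python
import statistics


def posting_time_features(timestamps):
    """Summarize inter-posting gaps for one account.

    timestamps: posting times in seconds, in any order (at least three).
    """
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three timestamps")

    mean_gap = statistics.mean(gaps)
    std_gap = statistics.pstdev(gaps)  # population standard deviation of the gaps
    return {
        "mean_gap": mean_gap,
        "std_gap": std_gap,
        # Coefficient of variation: values near 0 indicate very regular posting.
        "gap_cv": std_gap / mean_gap if mean_gap > 0 else 0.0,
    }


# Hypothetical accounts: one posts exactly every minute, one posts irregularly.
regular = posting_time_features([0, 60, 120, 180])
irregular = posting_time_features([0, 10, 100])
```

An account posting at perfectly even intervals gets a coefficient of variation of zero, a pattern that is more typical of automation than of human behavior; such values could then be fed to a classifier alongside other features.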
Checking remote data possession is of crucial importance in public cloud storage. It enables users to check whether their outsourced data have been kept intact without downloading the original data. The existing remote data possession checking (RDPC) protocols have been designed in the PKI (public key infrastructure) setting. The cloud server has to validate the users' certificates before storing the data uploaded by the users in order to prevent spam. This incurs considerable costs since numerous users may frequently upload data to the cloud server. This study addresses this problem with a new model of identity-based RDPC (ID-RDPC) protocols. The authors present the first ID-RDPC protocol proven to be secure assuming the hardness of the standard computational Diffie-Hellman problem. In addition to the structural advantage of eliminating certificate management and verification, the authors' ID-RDPC protocol also outperforms the existing RDPC protocols in the PKI setting in terms of computation and communication.
Tor is a popular low-latency anonymous communication system. However, it is currently abused in various ways. Tor exit routers are frequently troubled by administrative and legal complaints. To gain an insight into such abuse, we design and implement a novel system, TorWard, for the discovery and systematic study of malicious traffic over Tor. The system can avoid legal and administrative complaints and allows the investigation to be performed in a sensitive environment such as a university campus. An IDS (Intrusion Detection System) is used to discover and classify malicious traffic. We performed comprehensive analysis and extensive real-world experiments to validate the feasibility and effectiveness of TorWard. Our data shows that around 10% of Tor traffic can trigger IDS alerts. Malicious traffic includes P2P traffic, malware traffic (e.g., botnet traffic), DoS (Denial-of-Service) attack traffic, spam, and others. Around 200 known malware have been identified. To the best of our knowledge, we are the first to perform malicious traffic categorization over Tor.