Bibliography
Phishing attacks are among the most common forms of attack on the internet. In this method, attackers attempt to collect a user's data without consent through emails, URLs, or any other link leading to a deceptive page where the user is persuaded to take specific actions that complete the attack. Such attacks can yield vital information that often allows the attacker to impersonate the victim and perform actions only the victim should be able to perform, such as carrying out transactions, messaging others, or simply accessing the victim's data. Many studies have discussed possible approaches to prevent such attacks. This research work applies three machine learning algorithms to predict a website's phishing status. In the experiments, the models are trained on URL-based features, and the proposed software aims to prevent zero-day attacks by distinguishing legitimate websites from phishing websites through analysis of the URL alone. The random forest classifier performed best, with a precision of 97%, a recall of 99%, and an F1 score of 97%. The proposed model is fast and efficient because it works on the URL alone and does not use other resources for analysis, as was the case in past studies.
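Since the abstract does not enumerate the URL features or model settings used, the following is only a minimal sketch of the approach it describes: lexical features computed from the URL string alone, fed to a random forest classifier. The feature set, the toy URLs, and the hyperparameters are illustrative assumptions, not the paper's actual configuration.

```python
from urllib.parse import urlparse
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

def url_features(url):
    """Lexical features computed from the URL string alone (no page download)."""
    parsed = urlparse(url)
    host = parsed.netloc
    return [
        len(url),                        # overall length; long URLs are a common signal
        host.count("."),                 # subdomain depth
        url.count("-"),                  # hyphens, often used to mimic brand names
        url.count("@"),                  # '@' can hide the real host
        int(parsed.scheme == "https"),   # scheme flag
        sum(c.isdigit() for c in host),  # digits in hostname (e.g. IP-like hosts)
    ]

# Toy labelled sample (1 = phishing, 0 = legitimate); a real study would use
# thousands of URLs from feeds such as PhishTank plus a benign crawl.
urls = [
    "http://paypal-secure-login.verify-account.example.com/update",
    "http://192.168.10.5/bank/signin",
    "https://www.wikipedia.org/",
    "https://github.com/scikit-learn/scikit-learn",
] * 25
labels = [1, 1, 0, 0] * 25

X = [url_features(u) for u in urls]
X_train, X_test, y_train, y_test = train_test_split(X, labels, test_size=0.3,
                                                    random_state=42)
clf = RandomForestClassifier(n_estimators=100, random_state=42)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```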
Black-box web application scanners are used to detect vulnerabilities in a web application without any knowledge of its source code. Recent research has shown their poor performance in detecting stored Cross-Site Scripting (XSS) and stored SQL Injection (SQLI). This paper analyzes the detection efficiency of four black-box scanners on two testbeds, WackoPicko and the custom testbed Scanit (obtained from [5]). The analysis shows that the scanners need improvement for better detection of multi-step stored XSS and stored SQLI. The study examines the interaction between the selected scanners and the web application to measure how effectively they insert appropriate attack vectors into the relevant fields. The results indicate little difference in performance between the open-source and commercial black-box scanners studied, although the choice between them may depend on the policies and trust requirements of the companies using them. The paper provides recommendations to improve the detection rate of stored SQLI and stored XSS vulnerabilities and concludes that the 2020 state of the art in automated black-box web application scanning needs to improve to detect stored XSS and stored SQLI more effectively.
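As a rough illustration of the multi-step stored XSS check the paper finds scanners struggling with, the sketch below submits a uniquely tagged payload through one page and then looks for it, unencoded, on a different page. The testbed URLs and form field names are hypothetical; the scanners evaluated in the paper implement far more elaborate crawling and injection logic.

```python
import uuid
import requests

# Hypothetical target pages; in WackoPicko-style testbeds the payload is
# stored via one page (e.g. a comment form) and rendered on another.
SUBMIT_URL = "http://testbed.local/guestbook/post"
RENDER_URL = "http://testbed.local/guestbook/view"

def probe_stored_xss(session):
    marker = uuid.uuid4().hex                 # unique tag to track this probe
    payload = f"<script>alert('{marker}')</script>"
    session.post(SUBMIT_URL, data={"name": "scanner", "comment": payload})
    page = session.get(RENDER_URL).text       # second step: a *different* page
    # If the payload comes back verbatim, it was stored and rendered unencoded.
    return payload in page

if __name__ == "__main__":
    print("stored XSS likely:", probe_stored_xss(requests.Session()))
```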
Phishing is a form of cybercrime in which an attacker imitates a real person or institution, presenting themselves as an official person or entity through e-mail or other communication mediums. In this type of cyber attack, the attacker sends malicious links or attachments through phishing e-mails that can perform various functions, including capturing the victim's login credentials or account information. These e-mails harm victims through monetary loss and identity theft. In this study, a software tool called "Anti Phishing Simulator" was developed, which addresses the phishing detection problem and how to detect phishing e-mails. The software detects phishing and spam mails by examining message contents, and spam words added to its database are classified with a Bayesian algorithm.
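The abstract does not give the exact Bayesian formulation used by the Anti Phishing Simulator; a minimal word-level naive Bayes sketch over an assumed spam-word database might look like this, with all counts invented for illustration.

```python
import math
import re
from collections import Counter

# Illustrative word counts standing in for the tool's spam-word database.
spam_counts = Counter({"verify": 30, "account": 40, "urgent": 25, "click": 35})
ham_counts  = Counter({"meeting": 40, "report": 30, "thanks": 35, "account": 10})
n_spam, n_ham = 100, 100   # assumed number of training mails per class

def log_posterior(words, counts, n_class, vocab_size):
    """log P(class) + sum of log P(word|class), with Laplace smoothing."""
    logp = math.log(n_class / (n_spam + n_ham))
    total = sum(counts.values())
    for w in words:
        logp += math.log((counts[w] + 1) / (total + vocab_size))
    return logp

def classify(mail_text):
    words = re.findall(r"[a-z']+", mail_text.lower())
    vocab = len(set(spam_counts) | set(ham_counts))
    s = log_posterior(words, spam_counts, n_spam, vocab)
    h = log_posterior(words, ham_counts, n_ham, vocab)
    return "phishing/spam" if s > h else "legitimate"

print(classify("Urgent: click to verify your account"))
```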
Data analytics is increasingly used in cyber-security problems and has proved useful where data volume and heterogeneity make manual assessment by security experts cumbersome. In practical data-driven cyber-security analytics, obtaining annotated data (i.e. ground-truth labels) is a well-known limiting factor for many supervised security analytics tasks. Significant portions of large datasets typically remain unlabelled, as annotation is largely manual and requires substantial expert intervention. In this paper, we propose an effective active learning approach that addresses this limitation in the practical cyber-security problem of phishing categorization, using a human-machine collaborative approach to design a semi-supervised solution. An initial classifier is learnt on a small amount of annotated data and is then iteratively updated by shortlisting, from the large pool of unlabelled data, only the relevant samples most likely to improve classifier performance quickly. Prioritized active learning shows significant promise for faster convergence of classification performance in a batch learning framework, and thus requires less human annotation effort. A feature weight update technique combined with active learning shows promising classification performance for categorizing phishing/malicious URLs without requiring a large amount of annotated training samples during training. In experiments with several collections of the PhishMonger Targeted Brand dataset, the proposed method improves over the baseline by as much as 12%.
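The abstract does not state which prioritization criterion the method uses; uncertainty sampling is one common choice, sketched below with scikit-learn on synthetic features standing in for the PhishMonger URL data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for labelled/unlabelled phishing URL feature vectors.
X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
labelled = list(range(20))                 # small annotated seed set
unlabelled = list(range(20, len(X)))

clf = LogisticRegression(max_iter=1000)
for round_ in range(10):                   # batch active-learning rounds
    clf.fit(X[labelled], y[labelled])
    proba = clf.predict_proba(X[unlabelled])
    # Uncertainty sampling: prioritize samples whose top-class probability
    # is lowest, i.e. those most likely to change the decision boundary.
    uncertainty = 1.0 - proba.max(axis=1)
    batch = np.argsort(uncertainty)[-20:]  # the 20 most uncertain samples
    picked = [unlabelled[i] for i in batch]
    labelled.extend(picked)                # the human oracle labels them
    unlabelled = [i for i in unlabelled if i not in set(picked)]
    print(f"round {round_}: {len(labelled)} labelled, acc={clf.score(X, y):.3f}")
```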
Nowadays, cyber attacks affect many institutions and individuals and cause them serious financial loss. The phishing attack is one of the most common types of cyber attack, aimed at exploiting people's weaknesses to obtain confidential information about them. This type of attack threatens almost all internet users and institutions. Reducing the financial loss it causes requires both user awareness and applications able to detect such attacks. In the Phishing Attack Analysis report covering 45 countries for the last quarter of 2016, Turkey ranked second behind China with an impact rate of approximately 43%. In this study, the characteristics of this type of attack are first explained, and then a machine learning based system is proposed to detect them. In the proposed system, features are extracted using Natural Language Processing (NLP) techniques, and URLs used in phishing attacks are examined before they are opened. Many tests were applied to the system, and the best of the tested algorithms was Random Forest, with a success rate of 89.9%.
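The paper's NLP-derived features are not enumerated in the abstract; one plausible illustration is tokenizing the URL host into words and flagging brand look-alikes and random-looking tokens, as below. The brand watch-list and heuristics are assumptions; in the paper, features of this kind would feed the Random Forest model.

```python
import re
from urllib.parse import urlparse

KNOWN_BRANDS = {"paypal", "google", "apple", "amazon"}   # assumed watch-list

def nlp_url_features(url):
    """Word-level features of the kind an NLP pipeline might derive from a URL."""
    tokens = [t for t in re.split(r"[\W_]+", urlparse(url).netloc.lower()) if t]
    vowels = sum(t.count(v) for t in tokens for v in "aeiou")
    letters = sum(len(t) for t in tokens) or 1
    return {
        "n_tokens": len(tokens),
        # Brand name appearing left of the registered domain is suspicious;
        # tokens[:-2] roughly excludes the registrable domain and the TLD.
        "brand_in_subdomain": int(any(t in KNOWN_BRANDS for t in tokens[:-2])),
        "max_token_len": max((len(t) for t in tokens), default=0),
        "vowel_ratio": vowels / letters,   # random strings tend to be vowel-poor
    }

print(nlp_url_features("http://paypal.account-update.xsgqwrz.com/login"))
```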
Over the past 20 years, the use of the web in daily life has grown steadily, and with it the use of web applications. Most web applications in existence today have vulnerabilities that can be exploited by unauthorized persons. Well-known web application vulnerabilities include Structured Query Language (SQL) Injection, Cross-Site Scripting (XSS), and Cross-Site Request Forgery (CSRF). By exploiting these vulnerabilities, an attacker can obtain information about users and damage the reputation of the organization concerned. The developers of web applications often do not realize that their applications are vulnerable; they only find out when their code is attacked or manipulated. This is understandable, since a web application contains thousands of lines of code, which makes loopholes hard to detect. Nowadays, as hacking tools and tutorials become easier to obtain, more new hackers emerge. Even though SQL injection is easy to protect against, large numbers of systems on the internet remain vulnerable to this type of attack, because a few subtle conditions can go undetected. Therefore, in this paper we propose a detection model for detecting and recognizing one web vulnerability, SQL Injection, based on defined and identified criteria. In addition, the proposed detection model can generate a report on the vulnerability level of the web application. As a consequence, the proposed model should decrease the likelihood of a successful SQL Injection attack against the web application.
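As a hedged sketch of the kind of criteria-based check such a detection model might perform, the code below sends classic error-based SQL Injection probes to each parameter and grades a rough vulnerability level for the report. The payload list, error signatures, and target URL are illustrative; only scan systems you are authorized to test.

```python
import requests

# Classic error-based probes and the DBMS error strings they may trigger;
# both lists are illustrative, not the paper's full detection criteria.
PAYLOADS = ["'", "' OR '1'='1", "1; --"]
ERROR_SIGNS = ["you have an error in your sql syntax",
               "unclosed quotation mark",
               "sqlite3.operationalerror"]

def scan_param(url, param):
    """Count how many probes triggered a database error for one parameter."""
    hits = 0
    for p in PAYLOADS:
        body = requests.get(url, params={param: p}, timeout=10).text.lower()
        hits += any(sig in body for sig in ERROR_SIGNS)
    return hits

def report(url, params):
    """Print a simple per-parameter vulnerability-level report."""
    for param in params:
        hits = scan_param(url, param)
        level = ["none", "low", "medium", "high"][min(hits, 3)]
        print(f"{url} param={param!r}: vulnerability level {level}")

# Hypothetical target; point this at a legal testbed such as DVWA.
report("http://testbed.local/products", ["id", "category"])
```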
As the number of web users has grown exponentially, so has the number of attacks that seek to exploit the web for malicious purposes. One of the most commonly exploited vulnerabilities is cross-site scripting (XSS). Cross-site scripting refers to a client-side code injection attack whereby a malicious user executes malicious scripts (also commonly referred to as a malicious payload) in a legitimate website or web-based application. XSS is among the most rampant web application vulnerabilities and occurs when a web application uses unvalidated or unencoded user input within the output it generates. In such instances, the victim is unaware that their data is being transferred from a website that they trust to another site controlled by the malicious user. In this paper we focus on type 1 or "non-persistent" cross-site scripting, in which malicious code or script is embedded in a web request and then partially or entirely echoed ("reflected") by the web server, without encoding or validation, in the web response. The malicious code or script then executes in the client's web browser, which can lead to several negative outcomes, such as the theft of session data and access to sensitive data within cookies. For this type of cross-site scripting to succeed, a malicious user must coerce a user into clicking a link that triggers the non-persistent cross-site scripting attack, usually through an email that encourages the user to click a provided malicious link, or to visit a website fraught with malicious links. This paper discusses and elaborates how attack surfaces related to type 1, non-persistent cross-site scripting can be reduced using secure development life cycle practices and techniques.
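To make the reflection step concrete, here is a minimal sketch in Flask contrasting a handler vulnerable to type 1 XSS with one that encodes output, one of the secure-development practices the paper discusses; the routes and app are invented for illustration.

```python
# Minimal Flask app contrasting a reflected-XSS-prone handler with an
# output-encoded one; a sketch, not the paper's SDLC tooling.
from flask import Flask, request
from markupsafe import escape

app = Flask(__name__)

@app.route("/search-unsafe")
def search_unsafe():
    q = request.args.get("q", "")
    # BAD: user input echoed verbatim into the response, so a request like
    # /search-unsafe?q=<script>...</script> executes in the victim's browser.
    return f"<p>Results for {q}</p>"

@app.route("/search-safe")
def search_safe():
    q = request.args.get("q", "")
    # GOOD: HTML-encode untrusted input before reflecting it.
    return f"<p>Results for {escape(q)}</p>"

if __name__ == "__main__":
    app.run(debug=False)
```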
Massive online social networks with hundreds of millions of active users are increasingly being used by cyber criminals to spread malicious software (malware) that exploits vulnerabilities on users' machines for personal gain. Twitter is particularly susceptible to such activity as, with its 140-character limit, it is common for people to include URLs in their tweets to link to more detailed information, evidence, news reports and so on. URLs are often shortened, so the endpoint is not obvious before a person clicks the link. Cyber criminals can exploit this to propagate malicious URLs on Twitter, for which the endpoint is a malicious server that performs unwanted actions on the person's machine. This is known as a drive-by download. In this paper we develop a machine classification system to distinguish between malicious and benign URLs within seconds of the URL being clicked (i.e. in 'real time'). We train the classifier using machine activity logs created while interacting with URLs extracted from Twitter data collected during a large global event, the Superbowl, and test it using data from another large sporting event, the Cricket World Cup. The results show that machine activity logs produce precision performances of up to 0.975 on training data from the first event and 0.747 on test data from the second event. Furthermore, we examine the properties of the learned model to explain the relationship between machine activity and malicious software behaviour, and build a learning curve for the classifier to illustrate that very small samples of training data can be used with only a small detriment to performance.
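The exact machine activity features are not listed in the abstract; the toy sketch below assumes a few plausible ones (CPU, memory, network, process and file counts sampled after a click) and shows only the classification step, on synthetic numbers.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Assumed per-URL activity features sampled in the seconds after a click:
# [cpu_pct, mem_mb_delta, net_bytes_out, n_new_processes, n_files_written]
train_X = np.array([
    [85, 120, 500_000, 4, 30],   # drive-by download profiles (synthetic)
    [90, 200, 750_000, 6, 45],
    [15,  10,  20_000, 0,  1],   # benign browsing profiles (synthetic)
    [20,   8,  35_000, 1,  2],
])
train_y = np.array([1, 1, 0, 0])     # 1 = malicious, 0 = benign

clf = RandomForestClassifier(n_estimators=50, random_state=0)
clf.fit(train_X, train_y)

# In 'real time', classify the activity logged within seconds of the click.
observed = np.array([[70, 90, 400_000, 3, 22]])
print("malicious" if clf.predict(observed)[0] else "benign")
```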
Phishing is a major problem of the internet era, in which the security of data on the web is gaining increasing importance. Phishing is one of the most harmful ways of surreptitiously obtaining credential information such as usernames, passwords, or account numbers from users. Users are often unaware of this type of attack and may later become part of phishing attacks themselves. The losses may involve financial funds, personal information, brand reputation, or brand trust, so detecting phishing sites is necessary. In this paper we design a framework for phishing detection based on the URL.
Online Social Networks (OSNs) continuously suffer from the negative impact of Cross-Site Scripting (XSS) vulnerabilities. This paper describes a novel framework for mitigating XSS attacks on OSN-based platforms, based entirely on request authentication and view isolation. It detects XSS attacks by validating string values extracted from vulnerable checkpoints in the web page, using a string examination algorithm together with an XSS attack vector repository. Any similarity (i.e. a string that is not validated) indicates the presence of malicious code injected by the attacker, and the framework then removes the script code to mitigate the attack. To assess the defensive ability of the designed model, we tested it on an OSN-based web application, Humhub. The experimental results revealed that the model discovers XSS attack vectors with low false negative and false positive rates and tolerable performance overhead.
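A minimal sketch of the string examination idea, assuming a tiny stand-in for the paper's XSS attack vector repository: checkpoint values are matched against known vector patterns, and flagged script code is stripped before rendering. The patterns and sanitization rules here are simplified illustrations.

```python
import re

# Small illustrative stand-in for the paper's XSS attack-vector repository.
XSS_VECTORS = [
    r"<\s*script[^>]*>",   # script tag injection
    r"on\w+\s*=",          # inline event handlers (onload=, onerror=, ...)
    r"javascript\s*:",     # javascript: URLs
]

def examine_string(value):
    """Return True if the checkpoint value matches any known attack vector."""
    return any(re.search(p, value, re.IGNORECASE) for p in XSS_VECTORS)

def sanitize(value):
    """Strip script code from a flagged value before it is rendered."""
    value = re.sub(r"<\s*script[^>]*>.*?<\s*/\s*script\s*>", "", value,
                   flags=re.IGNORECASE | re.DOTALL)
    for p in XSS_VECTORS[1:]:
        value = re.sub(p, "", value, flags=re.IGNORECASE)
    return value

post = "<img src=x onerror=alert(document.cookie)> hello"
if examine_string(post):
    post = sanitize(post)
print(post)
```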
Content blocking is an important part of a performant, user-serving, privacy respecting web. Current content blockers work by building trust labels over URLs. While useful, this approach has many well understood shortcomings. Attackers may avoid detection by changing URLs or domains, bundling unwanted code with benign code, or inlining code in pages. The common flaw in existing approaches is that they evaluate code based on its delivery mechanism, not its behavior. In this work we address this problem by building a system for generating signatures of the privacy-and-security relevant behavior of executed JavaScript. Our system uses as the unit of analysis each script's behavior during each turn on the JavaScript event loop. Focusing on event loop turns allows us to build highly identifying signatures for JavaScript code that are robust against code obfuscation, code bundling, URL modification, and other common evasions, as well as handle unique aspects of web applications. This work makes the following contributions to the problem of measuring and improving content blocking on the web: First, we design and implement a novel system to build per-event-loop-turn signatures of JavaScript behavior through deep instrumentation of the Blink and V8 runtimes. Second, we apply these signatures to measure how much privacy-and-security harming code is missed by current content blockers, by using EasyList and EasyPrivacy as ground truth and finding scripts that have the same privacy and security harming patterns. We build 1,995,444 signatures of privacy-and-security relevant behaviors from 11,212 unique scripts blocked by filter lists, and find 3,589 unique scripts hosting known harmful code, but missed by filter lists, affecting 12.48% of websites measured. Third, we provide a taxonomy of ways scripts avoid detection and quantify the occurrence of each. Finally, we present defenses against these evasions, in the form of filter list additions where possible, and through a proposed, signature based system in other cases. As part of this work, we share the implementation of our signature-generation system, the data gathered by applying that system to the Alexa 100K, and 586 AdBlock Plus compatible filter list rules to block instances of currently blocked code being moved to new URLs.
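The real system instruments Blink and V8 in C++; as a loose, language-shifted illustration of the signature idea only, the sketch below groups a toy instrumentation log by event-loop turn and hashes each turn's ordered behavior. The log records and APIs shown are invented.

```python
import hashlib
import json

# Toy instrumentation log: each record is (event_loop_turn, api_name, arg).
# Real data would come from the instrumented Blink/V8 runtimes, not Python.
log = [
    (1, "document.cookie", "read"),
    (1, "XMLHttpRequest.send", "https://tracker.example/collect"),
    (2, "localStorage.getItem", "uid"),
]

def turn_signatures(records):
    """Hash the ordered behavior observed within each event-loop turn."""
    turns = {}
    for turn, api, arg in records:
        turns.setdefault(turn, []).append((api, arg))
    return {t: hashlib.sha256(json.dumps(calls).encode()).hexdigest()[:16]
            for t, calls in turns.items()}

# Such signatures stay stable even if a script is renamed, bundled, or moved
# to a new URL, because they depend only on observed behavior.
print(turn_signatures(log))
```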
As QR codes become ubiquitous, there is a prominent security threat of phishing and malware attacks that can be carried out by sharing rogue URLs through such codes. Several QR code scanner apps have become available in the past few years to combat such threats. Nevertheless, limited work exists in the literature evaluating such apps in the context of security. In this paper, we have investigated the status of existing secure QR code scanner apps for Android from a security point of view. We found that several of the so-called secure QR code scanner apps merely present the URL encoded in a QR code to the user rather than validating it against suitable threat databases. Further, many apps do not support basic security features such as displaying the URL to the user and asking for user confirmation before proceeding to open the URL in a browser. The most alarming issue that emerged during this study is that only two of the studied apps perform validation of the redirected URL associated with a QR code. We also tested the relevant apps with a set of benign, phishing and malware URLs collected from multiple sources. Overall, the results of our experiments imply that the protection offered by the examined secure QR code scanner apps against rogue URLs (especially malware URLs) is limited. Based on the findings of our investigation, we have distilled a set of key lessons and proposed design recommendations to enhance the security aspects of such apps.
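A minimal sketch of the redirect-aware validation the study found missing in most apps: resolve the QR-encoded URL's redirect chain, vet the final landing host, display both URLs, and ask for confirmation. The blocklist is a local stand-in for a real threat database.

```python
import requests
from urllib.parse import urlparse

# Illustrative local blocklist; a production app would query live threat
# feeds (e.g. Google Safe Browsing) instead of a hard-coded set.
BLOCKLIST = {"malware.example.net", "phish.example.org"}

def vet_qr_url(url):
    """Resolve redirects and validate the FINAL landing host, not just the
    host literally encoded in the QR code."""
    resp = requests.head(url, allow_redirects=True, timeout=10)
    final_host = urlparse(resp.url).netloc
    verdict = "blocked" if final_host in BLOCKLIST else "allowed"
    # Display both URLs and require explicit confirmation before opening.
    print(f"QR URL:      {url}")
    print(f"Resolves to: {resp.url}")
    print(f"Verdict:     {verdict}")
    return verdict == "allowed" and input("Open in browser? [y/N] ").lower() == "y"

vet_qr_url("http://tiny.example/abc123")   # hypothetical shortened URL
```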
Spam messages are unsolicited and unnecessary messages that may contain harmful code or links that activate malicious viruses and spyware. The increasing popularity of social networks attracts spammers to perform malicious activities there, so an efficient spam detection method is necessary for social networks. In this paper, a spam detection model based on a feed forward neural network with back propagation is proposed. The quality of the learning process is improved by tuning the initial weights of the network using a proposed enhanced step size firefly algorithm, which reduces the time needed to find optimal weights during learning. The model is applied to a Twitter dataset, and the experimental results show that it performs well in terms of accuracy and detection rate and has a low false positive rate.
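The abstract does not define the enhanced step size variant precisely; the sketch below implements a standard firefly search over candidate initial weights for a one-layer network, modelling the enhancement simply as a step size that decays each generation (an assumption), with synthetic data in place of the Twitter features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data standing in for tweet feature vectors (1 = spam, 0 = ham).
X = rng.normal(size=(200, 8))
w_true = rng.normal(size=8)
y = (X @ w_true > 0).astype(float)

def loss(w):
    """Logistic loss of a single-layer net; lower loss = brighter firefly."""
    p = 1 / (1 + np.exp(-(X @ w)))
    return -np.mean(y * np.log(p + 1e-9) + (1 - y) * np.log(1 - p + 1e-9))

# Firefly search over candidate initial weight vectors.
n, dim, beta0, gamma = 15, 8, 1.0, 1.0
fireflies = rng.normal(size=(n, dim))
alpha = 0.5                                    # random-step scale
for gen in range(30):
    losses = np.array([loss(f) for f in fireflies])
    for i in range(n):
        for j in range(n):
            if losses[j] < losses[i]:          # j is brighter, so it attracts i
                r2 = np.sum((fireflies[i] - fireflies[j]) ** 2)
                beta = beta0 * np.exp(-gamma * r2)
                fireflies[i] += (beta * (fireflies[j] - fireflies[i])
                                 + alpha * rng.normal(size=dim))
                losses[i] = loss(fireflies[i])
    alpha *= 0.95                              # 'enhanced' (decaying) step size

best = fireflies[np.argmin([loss(f) for f in fireflies])]
print("initial-weight loss after firefly tuning:", loss(best))
# 'best' would then seed back-propagation training of the full network.
```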