Biblio

Filters: Keyword is Phishing Detection
2023-02-03
Sarasjati, Wendy, Rustad, Supriadi, Purwanto, Santoso, Heru Agus, Muljono, Syukur, Abdul, Rafrastara, Fauzi Adi, Ignatius Moses Setiadi, De Rosal.  2022.  Comparative Study of Classification Algorithms for Website Phishing Detection on Multiple Datasets. 2022 International Seminar on Application for Technology of Information and Communication (iSemantic). :448–452.
Phishing has become a prominent method of data theft among hackers, and it continues to develop. In recent years, many strategies have been developed to identify phishing websites, particularly using machine learning. However, the algorithms and classification criteria that have been used differ widely from the real issues and need to be compared. This paper provides a detailed comparison and evaluation of the performance of several machine learning algorithms across multiple datasets. Two phishing website datasets were used for the experiments: the Phishing Websites Dataset from UCI (2016) and the Phishing Websites Dataset from Mendeley (2018). Because these datasets include different types of class labels, the comparison algorithms can be applied in a variety of situations. The tests showed that Random Forest was better than other classification methods, with an accuracy of 88.92% for the UCI dataset and 97.50% for the Mendeley dataset.
Alkawaz, Mohammed Hazim, Joanne Steven, Stephanie, Mohammad, Omar Farook, Gapar Md Johar, Md.  2022.  Identification and Analysis of Phishing Website based on Machine Learning Methods. 2022 IEEE 12th Symposium on Computer Applications & Industrial Electronics (ISCAIE). :246–251.
People are increasingly sharing their details online as internet usage grows. As a result, fraudsters have access to a massive amount of information and financial activity. Attackers create web pages that resemble reputable sites and send malicious content to victims to get them to provide sensitive information. Prevailing phishing security measures are inadequate for detecting new phishing attacks. One objective of this research is to analyze and compare phishing and legitimate websites by analyzing the data collected from open-source platforms through a survey. Another objective is to propose a method to detect fake sites using Decision Tree and Random Forest approaches. Microsoft Forms was used to carry out the survey with 30 participants. The majority of the participants have poor awareness of phishing attacks and do not observe the features of the interface before accessing the search browser. With the data collected, this survey supports the purpose of identifying the best phishing website detection method, for which Decision Tree and Random Forest were trained and tested. In terms of the number of important features detected and the accuracy rate, the results demonstrate that Random Forest performs better than Decision Tree in phishing website detection.
Patil, Kanchan, Arra, Sai Rohith.  2022.  Detection of Phishing and User Awareness Training in Information Security: A Systematic Literature Review. 2022 2nd International Conference on Innovative Practices in Technology and Management (ICIPTM). 2:780–786.
Phishing is a method of online fraud in which attackers aim to gain access to computer systems for monetary benefit or personal gain. The attackers pose as legitimate entities to obtain users' sensitive information. Phishing has been a significant concern over the past few years. Firms are recording an increase in phishing attacks primarily aimed at the firm's intellectual property and the employees' sensitive data. As a result, these attacks force firms to spend more on information security, in both technology-centric and human-centric approaches. With the advancements in cyber-security in the last ten years, many techniques have evolved to detect phishing-related activities through websites and emails. This study focuses on the latest techniques used for detecting phishing attacks, including the use of visual selection features, Machine Learning (ML), and Artificial Intelligence (AI). New strategies for identifying phishing attacks are evolving, but limited standardized knowledge on phishing identification and mitigation is accessible from user awareness training. So, this study also focuses on the role of security-awareness movements in minimizing the impact of phishing attacks. There are many approaches to training users regarding these attacks, such as persona-centred training, anti-phishing techniques, and visual discrimination training, along with the use of spam filters, robust firewalls and infrastructure, dynamic technical defense mechanisms, and third-party certified software to prevent phishing attacks from happening. Therefore, the purpose of this paper is to carry out a systematic analysis of literature to assess the state of knowledge in prominent scientific journals on the identification and prevention of phishing. Forty-three journal articles with the perspective of phishing detection and prevention through awareness training were reviewed from 2011 to 2020. This timely systematic review also focuses on the gaps identified in the selected primary studies and future research directions in this area.
2022-10-13
Basit, Abdul, Zafar, Maham, Javed, Abdul Rehman, Jalil, Zunera.  2020.  A Novel Ensemble Machine Learning Method to Detect Phishing Attack. 2020 IEEE 23rd International Multitopic Conference (INMIC). :1–5.
Currently, and particularly with remote working scenarios during COVID-19, phishing attacks have become one of the most significant threats faced by internet users, organizations, and service providers. In a phishing attack, the attacker tries to steal sensitive client data (such as logins, passwords, and credit card details) using spoofed emails and fake websites. Cybercriminals, hacktivists, and nation-state spy agencies now have fertile ground to deploy their latest innovative phishing attacks. Timely detection of phishing attacks has become more crucial than ever. Machine learning algorithms can be used to accurately detect phishing attacks before a user is harmed. This paper presents a novel ensemble model to detect phishing attacks on websites. We select three machine learning classifiers: Artificial Neural Network (ANN), K-Nearest Neighbors (KNN), and Decision Tree (C4.5) to use in an ensemble method with Random Forest Classifier (RFC). This ensemble method effectively detects website phishing attacks with better accuracy than existing studies. Experimental results demonstrate that the ensemble of KNN and RFC detects phishing attacks with 97.33% accuracy.
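As a rough illustration of how two classifiers might be combined in an ensemble, the sketch below averages per-sample phishing probabilities (soft voting). The abstract does not state how the authors combine KNN and RFC, so the soft-voting rule, the function name `soft_vote`, the 0.5 threshold, and the probability values are all assumptions made for illustration only.

```python
from typing import Sequence


def soft_vote(prob_lists: Sequence[Sequence[float]], threshold: float = 0.5) -> list[int]:
    """Average phishing probabilities from several models and threshold the mean.

    prob_lists[m][i] is model m's estimated probability that sample i is phishing.
    Returns 1 (phishing) or 0 (legitimate) for each sample.
    """
    n_models = len(prob_lists)
    n_samples = len(prob_lists[0])
    labels = []
    for i in range(n_samples):
        mean_p = sum(probs[i] for probs in prob_lists) / n_models
        labels.append(1 if mean_p >= threshold else 0)
    return labels


# Hypothetical outputs of a KNN model and a random forest for three websites.
knn_probs = [0.9, 0.4, 0.2]
rfc_probs = [0.8, 0.7, 0.1]
print(soft_vote([knn_probs, rfc_probs]))
```

Averaging probabilities lets a confident model outweigh an uncertain one; a hard majority vote over predicted labels would be an equally plausible reading of "ensemble".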
2022-10-12
Faris, Humam, Yazid, Setiadi.  2021.  Phishing Web Page Detection Methods: URL and HTML Features Detection. 2020 IEEE International Conference on Internet of Things and Intelligence System (IoTaIS). :167–171.
Phishing is a type of Internet fraud in which fake web pages mimic the original web pages to trick users into sending sensitive information to the phisher. The statistics presented by APWG and PhishTank show that the number of phishing websites tended to increase continuously from 2015 to 2020. To overcome this problem, several studies have been carried out, including detecting phishing web pages using various web page features and various methods. Unfortunately, several of these methods are not really effective because their design and evaluation focus too narrowly on detection accuracy in research, and the evaluation does not represent real-world application. A security detection tool should instead be effective, perform well, and be deployable. In this study, the authors evaluated several methods and proposed rule-based applications that can detect phishing more efficiently.
Deval, Shalin Kumar, Tripathi, Meenakshi, Bezawada, Bruhadeshwar, Ray, Indrakshi.  2021.  “X-Phish: Days of Future Past”‡: Adaptive & Privacy Preserving Phishing Detection. 2021 IEEE Conference on Communications and Network Security (CNS). :227–235.
Website phishing continues to persist as one of the most important security threats of the modern Internet era. A major concern has been that machine learning based approaches, which have been the cornerstones of deployed phishing detection solutions, have not been able to adapt to the evolving nature of phishing attacks. To create updated machine learning models, the collection of a sufficient corpus of real-time phishing data has always been a challenging problem, as most phishing websites are short-lived. In this work, for the first time, we address these important concerns and describe an adaptive phishing detection solution that is able to adapt to changes in phishing attacks. Our solution has two major contributions. First, our solution allows multiple organizations to collaborate in a privacy-preserving manner and generate a robust machine learning model for phishing detection. Second, our solution is designed to be flexible in order to adapt to the novel phishing features introduced by attackers. Our solution not only allows for incorporating novel features into the existing machine learning model, but also can help, to a certain extent, the “unlearning” of existing features that have become obsolete in current phishing attacks. We evaluated our approach on a large real-world dataset collected over a period of six months. Our results achieve a high true positive rate of 97%, which is on par with existing state-of-the-art centralized solutions. Importantly, our results demonstrate that a machine learning model can incorporate new features while selectively “unlearning” the older obsolete features.
2022-02-25
Abutaha, Mohammed, Ababneh, Mohammad, Mahmoud, Khaled, Baddar, Sherenaz Al-Haj.  2021.  URL Phishing Detection using Machine Learning Techniques based on URLs Lexical Analysis. 2021 12th International Conference on Information and Communication Systems (ICICS). :147–152.
Phishing URLs mainly target individuals and/or organizations through social engineering attacks by exploiting the humans' weaknesses in information security awareness. These URLs lure online users to access fake websites, and harvest their confidential information, such as debit/credit card numbers and other sensitive information. In this work, we introduce a phishing detection technique based on URL lexical analysis and machine learning classifiers. The experiments were carried out on a dataset that originally contained 1056937 labeled URLs (phishing and legitimate). This dataset was processed to generate 22 different features that were reduced further to a smaller set using different features reduction techniques. Random Forest, Gradient Boosting, Neural Network and Support Vector Machine (SVM) classifiers were all evaluated, and results show the superiority of SVMs, which achieved the highest accuracy in detecting the analyzed URLs with a rate of 99.89%. Our approach can be incorporated within add-on/middleware features in Internet browsers for alerting online users whenever they try to access a phishing website using only its URL.
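To make "URL lexical analysis" concrete, here is a minimal sketch of extracting a few lexical features from a URL string using only the Python standard library. The abstract does not list the 22 features the authors generated, so the eight features below, their names, and the function `lexical_features` are hypothetical examples of the general idea, not the paper's feature set.

```python
from urllib.parse import urlsplit


def lexical_features(url: str) -> dict[str, int]:
    """Compute simple lexical features from the URL text alone (no network access)."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    return {
        "url_length": len(url),
        "host_length": len(host),
        "num_dots": host.count("."),            # many subdomains can be suspicious
        "num_digits": sum(ch.isdigit() for ch in url),
        "num_hyphens": url.count("-"),
        "has_at_symbol": int("@" in url),       # '@' can hide the real host
        "uses_https": int(parts.scheme == "https"),
        "path_depth": len([seg for seg in parts.path.split("/") if seg]),
    }


# Hypothetical URL used only to show the output shape.
print(lexical_features("http://login-secure.example.com/a/b?id=12"))
```

A feature dictionary like this would then be turned into a numeric vector and passed to whichever classifier is being evaluated.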
2021-09-21
Sathya, K, Premalatha, J, Suwathika, S.  2020.  Reinforcing Cyber World Security with Deep Learning Approaches. 2020 International Conference on Communication and Signal Processing (ICCSP). :0766–0769.
In the past decade, Machine Learning (ML) and Deep Learning (DL) have attracted much research interest. Nowadays, the Internet and social life play a leading role in most people's lives, but this brings serious social threats. It is challenging to protect sensitive information, data networks, and computers from unauthorized cyber-attacks. Protecting data requires cyber security. To address these problems, recent Deep Learning and Machine Learning technologies are being integrated into cyber security to provide solutions. This paper gives a synopsis of utilizing deep learning to enhance the security of the cyber world and analyzes various challenges in integrating deep learning into cyber security.
2020-04-10
Robic-Butez, Pierrick, Win, Thu Yein.  2019.  Detection of Phishing websites using Generative Adversarial Network. 2019 IEEE International Conference on Big Data (Big Data). :3216–3221.

Phishing is typically deployed as an attack vector in the initial stages of a hacking endeavour. Due to its low-risk, high-reward nature it has seen widespread adoption, and detecting it has become a challenge in recent times. This paper proposes a novel means of detecting phishing websites using a Generative Adversarial Network. Taking into account the internal structure and external metadata of a website, the proposed approach uses a generator network which generates both legitimate as well as synthetic phishing features to train a discriminator network. The latter then determines whether the features belong to normal or phishing websites, before improving its detection accuracy based on the classification error. The proposed approach is evaluated using two different phishing datasets and is found to achieve a detection accuracy of up to 94%.

Huang, Yongjie, Qin, Jinghui, Wen, Wushao.  2019.  Phishing URL Detection Via Capsule-Based Neural Network. 2019 IEEE 13th International Conference on Anti-counterfeiting, Security, and Identification (ASID). :22–26.

As a cyber attack which leverages social engineering and other sophisticated techniques to steal sensitive information from users, phishing attack has been a critical threat to cyber security for a long time. Although researchers have proposed lots of countermeasures, phishing criminals figure out circumventions eventually since such countermeasures require substantial manual feature engineering and can not detect newly emerging phishing attacks well enough, which makes developing an efficient and effective phishing detection method an urgent need. In this work, we propose a novel phishing website detection approach by detecting the Uniform Resource Locator (URL) of a website, which is proved to be an effective and efficient detection approach. To be specific, our novel capsule-based neural network mainly includes several parallel branches wherein one convolutional layer extracts shallow features from URLs and the subsequent two capsule layers generate accurate feature representations of URLs from the shallow features and discriminate the legitimacy of URLs. The final output of our approach is obtained by averaging the outputs of all branches. Extensive experiments on a validated dataset collected from the Internet demonstrate that our approach can achieve competitive performance against other state-of-the-art detection methods while maintaining a tolerable time overhead.

Yadollahi, Mohammad Mehdi, Shoeleh, Farzaneh, Serkani, Elham, Madani, Afsaneh, Gharaee, Hossein.  2019.  An Adaptive Machine Learning Based Approach for Phishing Detection Using Hybrid Features. 2019 5th International Conference on Web Research (ICWR). :281–286.

Nowadays, phishing is one of the most common web threats, given the significant growth of the World Wide Web in volume over time. Phishing attackers always use new (zero-day) and sophisticated techniques to deceive online customers. Hence, it is necessary that the anti-phishing system be real-time and fast and also leverage an intelligent phishing detection solution. Here, we develop a reliable detection system which can adaptively match the changing environment and phishing websites. Our method is an online and feature-rich machine learning technique to discriminate between phishing and legitimate websites. Since the proposed approach extracts different types of discriminative features from URLs and webpage source code, it is an entirely client-side solution and does not require any service from a third party. The experimental results highlight the robustness and competitiveness of our anti-phishing system in distinguishing between phishing and legitimate websites.

Chapla, Happy, Kotak, Riddhi, Joiser, Mittal.  2019.  A Machine Learning Approach for URL Based Web Phishing Using Fuzzy Logic as Classifier. 2019 International Conference on Communication and Electronics Systems (ICCES). :383–388.

Phishing is a major problem of the internet era. In this era, the security of our data on the web is gaining increasing importance. Phishing is one of the most harmful ways to obtain credential information such as usernames, passwords, or account numbers from users without their knowledge. Users are not aware of this type of attack and may later become part of the phishing attacks themselves. The result may be the loss of funds, personal information, brand reputation, or trust in the brand. So the detection of phishing sites is necessary. In this paper we design a framework for phishing detection using URLs.

2020-01-20
Huang, Yongjie, Yang, Qiping, Qin, Jinghui, Wen, Wushao.  2019.  Phishing URL Detection via CNN and Attention-Based Hierarchical RNN. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :112–119.
Phishing websites have long been a serious threat to cyber security. For decades, many researchers have been devoted to developing novel techniques to detect phishing websites automatically. While state-of-the-art solutions can achieve superior performance, they require substantial manual feature engineering and are not adept at detecting newly emerging phishing attacks. Therefore, developing techniques that can detect phishing websites automatically and handle zero-day phishing attacks swiftly is still an open challenge in this area. In this work, we propose PhishingNet, a deep learning-based approach for timely detection of phishing Uniform Resource Locators (URLs). Specifically, we use a Convolutional Neural Network (CNN) module to extract character-level spatial feature representations of URLs; meanwhile, we employ an attention-based hierarchical Recurrent Neural Network (RNN) module to extract word-level temporal feature representations of URLs. We then fuse these feature representations via a three-layer CNN to build accurate feature representations of URLs, on which we train a phishing URL classifier. Extensive experiments on a verified dataset collected from the Internet demonstrate that the automatically extracted feature representations improve the generalization ability of our approach on newly emerging URLs, which allows our approach to achieve competitive performance against other state-of-the-art approaches.
2019-11-26
Shirazi, Hossein, Bezawada, Bruhadeshwar, Ray, Indrakshi.  2018.  "Kn0W Thy Doma1N Name": Unbiased Phishing Detection Using Domain Name Based Features. Proceedings of the 23Nd ACM on Symposium on Access Control Models and Technologies. :69-75.

Phishing websites remain a persistent security threat. Thus far, machine learning approaches appear to have the best potential as defenses. But there are two main concerns with existing machine learning approaches for phishing detection. The first is the large number of training features used and the lack of validating arguments for these feature choices. The second concern is the type of datasets used in the literature that are inadvertently biased with respect to the features based on the website URL or content. To address these concerns, we put forward the intuition that the domain name of phishing websites is the tell-tale sign of phishing and holds the key to successful phishing detection. Accordingly, we design features that model the relationships, visual as well as statistical, of the domain name to the key elements of a phishing website, which are used to snare the end-users. The main value of our feature design is that, to bypass detection, an attacker will find it very difficult to tamper with the visual content of the phishing website without arousing the suspicion of the end user. Our feature set ensures that there is minimal or no bias with respect to a dataset. Our learning model trains with only seven features and achieves a true positive rate of 98% and a classification accuracy of 97% on a sample dataset. Compared to the state-of-the-art work, our per data instance classification is 4 times faster for legitimate websites and 10 times faster for phishing websites. Importantly, we demonstrate the shortcomings of using features based on URLs as they are likely to be biased towards specific datasets. We show the robustness of our learning algorithm by testing on unknown live phishing URLs and achieve a high detection accuracy of 99.7%.

Zabihimayvan, Mahdieh, Doran, Derek.  2019.  Fuzzy Rough Set Feature Selection to Enhance Phishing Attack Detection. 2019 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE). :1-6.

Phishing, as one of the most well-known cybercrime activities, is a deception of online users to steal their personal or confidential information by impersonating a legitimate website. Several machine learning-based strategies have been proposed to detect phishing websites. These techniques are dependent on the features extracted from the website samples. However, few studies have actually considered efficient feature selection for detecting phishing attacks. In this work, we investigate an agreement on the definitive features which should be used in phishing detection. We apply Fuzzy Rough Set (FRS) theory as a tool to select the most effective features from three benchmarked data sets. The selected features are fed into three often used classifiers for phishing detection. To evaluate the FRS feature selection in developing a generalizable phishing detection, the classifiers are trained by a separate out-of-sample data set of 14,000 website samples. The maximum F-measure gained by FRS feature selection is 95% using Random Forest classification. Also, there are 9 universal features selected by FRS over all the three data sets. The F-measure value using this universal feature set is approximately 93%, which is a comparable result in contrast to the FRS performance. Since the universal feature set contains no features from third-party services, this finding implies that with no inquiry from external sources, we can gain a faster phishing detection which is also robust toward zero-day attacks.

Patil, Srushti, Dhage, Sudhir.  2019.  A Methodical Overview on Phishing Detection along with an Organized Way to Construct an Anti-Phishing Framework. 2019 5th International Conference on Advanced Computing Communication Systems (ICACCS). :588-593.

Phishing is a security attack to acquire personal information like passwords, credit card details or other account details of a user by means of websites or emails. Phishing websites look similar to the legitimate ones which make it difficult for a layman to differentiate between them. As per the reports of Anti Phishing Working Group (APWG) published in December 2018, phishing against banking services and payment processor was high. Almost all the phishy URLs use HTTPS and use redirects to avoid getting detected. This paper presents a focused literature survey of methods available to detect phishing websites. A comparative study of the in-use anti-phishing tools was accomplished and their limitations were acknowledged. We analyzed the URL-based features used in the past to improve their definitions as per the current scenario which is our major contribution. Also, a step wise procedure of designing an anti-phishing model is discussed to construct an efficient framework which adds to our contribution. Observations made out of this study are stated along with recommendations on existing systems.

2019-05-01
Shirsat, S. D..  2018.  Demonstrating Different Phishing Attacks Using Fuzzy Logic. 2018 Second International Conference on Inventive Communication and Computational Technologies (ICICCT). :57-61.

Phishing has increased tremendously over the last few years and has become a serious threat to global security and the economy. Existing literature dealing with the problem of phishing is scarce. Phishing is a deception technique that uses a combination of technology and social engineering to acquire sensitive information such as online banking passwords, credit card or bank account details [2]. Phishing can be done through emails and websites to collect confidential information. Phishers design fraudulent websites which look similar to legitimate websites and lure the user to visit the malicious website. Therefore, users must be aware of malicious websites to protect their sensitive data [1]. But it is very difficult to distinguish between legitimate and fake websites, especially for nontechnical users [4]. Moreover, phishing sites are growing rapidly. The aim of this paper is to demonstrate phishing detection using fuzzy logic and to interpret the results using different defuzzification methods.
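To illustrate what "different defuzzification methods" can mean in practice, the sketch below runs a tiny Mamdani-style fuzzy inference and then compares two common defuzzifiers, centroid and mean of maxima. The abstract gives no details of the paper's variables or rules, so the single input `suspicion` (assumed to lie in [0, 1]), the triangular fuzzy sets, the three rules, and the 0 to 100 risk scale are all hypothetical.

```python
def tri(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership with feet a and c and peak b (a <= b <= c)."""
    if x < a or x > c:
        return 0.0
    if x == b:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def aggregate(suspicion: float) -> list[tuple[int, float]]:
    """Fire three rules (low->legitimate, medium->suspicious, high->phishing)
    and return the aggregated output membership sampled at risk = 0..100."""
    low = tri(suspicion, 0.0, 0.0, 0.5)
    medium = tri(suspicion, 0.0, 0.5, 1.0)
    high = tri(suspicion, 0.5, 1.0, 1.0)
    curve = []
    for y in range(101):
        mu = max(
            min(low, tri(y, 0, 0, 50)),       # clipped "legitimate" set
            min(medium, tri(y, 0, 50, 100)),  # clipped "suspicious" set
            min(high, tri(y, 50, 100, 100)),  # clipped "phishing" set
        )
        curve.append((y, mu))
    return curve


def centroid(curve: list[tuple[int, float]]) -> float:
    """Centre of gravity of the aggregated membership curve."""
    total = sum(mu for _, mu in curve)
    if total == 0:
        raise ValueError("no rule fired; input is outside the assumed [0, 1] range")
    return sum(y * mu for y, mu in curve) / total


def mean_of_maxima(curve: list[tuple[int, float]]) -> float:
    """Average of the risk values at which membership is highest."""
    peak = max(mu for _, mu in curve)
    ys = [y for y, mu in curve if mu == peak]
    return sum(ys) / len(ys)


curve = aggregate(0.8)
print(centroid(curve), mean_of_maxima(curve))
```

The two defuzzifiers generally give different crisp scores for the same fuzzy output, which is why the choice of method affects how a borderline site is interpreted.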

2017-12-20
Abdelhamid, N., Thabtah, F., Abdel-jaber, H..  2017.  Phishing detection: A recent intelligent machine learning comparison based on models content and features. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :72–77.

In the last decade, numerous fake websites have been developed on the World Wide Web to mimic trusted websites, with the aim of stealing financial assets from users and organizations. This form of online attack is called phishing, and it has cost the online community and the various stakeholders hundreds of millions of dollars. Therefore, effective countermeasures that can accurately detect phishing are needed. Machine learning (ML) is a popular tool for data analysis and has recently shown promising results in combating phishing when contrasted with classic anti-phishing approaches, including awareness workshops, visualization and legal solutions. This article investigates the applicability of ML techniques to detecting phishing attacks and describes their pros and cons. In particular, different types of ML techniques have been investigated to reveal the suitable options that can serve as anti-phishing tools. More importantly, we experimentally compare large numbers of ML techniques on real phishing datasets and with respect to different metrics. The purpose of the comparison is to reveal the advantages and disadvantages of ML predictive models and to show their actual performance when it comes to phishing attacks. The experimental results show that Covering approach models are more appropriate as anti-phishing solutions, especially for novice users, because of their simple yet effective knowledge bases in addition to their good phishing detection rate.

Park, A. J., Quadari, R. N., Tsang, H. H..  2017.  Phishing website detection framework through web scraping and data mining. 2017 8th IEEE Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON). :680–684.

Phishers often exploit users' trust in the appearance of a site by using webpages that are visually similar to an authentic site. In the past, various research studies have tried to identify and classify the factors contributing towards the detection of phishing websites. The focus of this research is to establish a strong relationship between those identified heuristics (content-based) and the legitimacy of a website by analyzing training sets of websites (both phishing and legitimate websites) and in the process analyze new patterns and report findings. Many existing phishing detection tools are not very accurate, as they depend mostly on an old database of previously identified phishing websites. However, there are thousands of new phishing websites appearing every year targeting financial institutions, cloud storage/file hosting sites, government websites, and others. This paper presents a framework called Phishing-Detective that detects phishing websites based on existing and newly found heuristics. For this framework, a web crawler was developed to scrape the contents of phishing and legitimate websites. These contents were analyzed to rate the heuristics and their contribution scale factor towards the illegitimacy of a website. The data set collected from Web Scraper was then analyzed using a data mining tool to find patterns and report findings. A case study shows how this framework can be used to detect a phishing website. This research is still in progress but shows a new way of finding and using heuristics and the sum of their contributing weights to effectively and accurately detect phishing websites. Further development of this framework is discussed at the end of the paper.
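The idea of using heuristics and "the sum of their contributing weights" can be sketched as a simple weighted score. The abstract does not name the heuristics or their weights, so the heuristic names, the integer weights, the threshold, and the helper functions below are hypothetical placeholders rather than the Phishing-Detective framework's actual values.

```python
# Hypothetical content-based heuristics with illustrative integer weights.
HEURISTIC_WEIGHTS: dict[str, int] = {
    "has_login_form": 3,
    "form_posts_to_other_domain": 4,
    "ip_address_in_url": 3,
}


def phishing_score(flags: dict[str, bool]) -> int:
    """Sum the weights of every heuristic that fired for a scraped page."""
    return sum(
        weight
        for name, weight in HEURISTIC_WEIGHTS.items()
        if flags.get(name, False)
    )


def is_phishing(flags: dict[str, bool], threshold: int = 5) -> bool:
    """Flag the page when the summed weight reaches the (assumed) threshold."""
    return phishing_score(flags) >= threshold


# Example: flags a scraper might produce for one page.
page = {"has_login_form": True, "form_posts_to_other_domain": True}
print(phishing_score(page), is_phishing(page))
```

In a data-driven setting, the weights would be derived from how strongly each heuristic separates phishing from legitimate pages in the training sets rather than chosen by hand.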

2017-04-24
Tayal, Kshitij, Ravi, Vadlamani.  2016.  Particle Swarm Optimization Trained Class Association Rule Mining: Application to Phishing Detection. Proceedings of the International Conference on Informatics and Analytics. :13:1–13:8.

Association and classification are two important tasks in data mining. Literature abounds with works that unify these two techniques. This paper presents a new algorithm called Particle Swarm Optimization trained Classification Association Rule Mining (PSOCARM) for associative classification that generates class association rules (CARs) from a transactional database by formulating a combinatorial global optimization problem, without having to specify minimal support and confidence, unlike other conventional associative classifiers. We devised a new rule pruning scheme in order to reduce the number of rules and increase the generalization aspect of the classifier. We demonstrated its effectiveness for phishing email and phishing website detection. Our experimental results indicate the superiority of our proposed algorithm with respect to accuracy and the number of rules generated as compared to the state-of-the-art algorithms.