Biblio | CPS-VO

Khadilkar, Kunal, Kulkarni, Siddhivinayak, Bone, Poojarani. 2018. Plagiarism Detection Using Semantic Knowledge Graphs. 2018 Fourth International Conference on Computing Communication Control and Automation (ICCUBEA). :1–6.

Every day, huge amounts of unstructured text are generated. Most of this data is in the form of essays, research papers, patents, scholastic articles, book chapters, etc. Many plagiarism detection tools are being developed to reduce the stealing and plagiarizing of Intellectual Property (IP). Current tools mainly use string matching algorithms to detect copying of text from another source. A drawback of some of these tools is their inability to detect plagiarism when the structure of the sentence is changed. They also fail to detect the replacement of keywords by their synonyms. This paper proposes a new method to detect such plagiarism using semantic knowledge graphs. The method uses Named Entity Recognition as well as semantic similarity between sentences to detect possible cases of plagiarism. The doubtful cases are visualized using semantic knowledge graphs for thorough analysis of authenticity. Rules for active and passive voice have also been considered in the proposed methodology.

Despotovski, Filip, Gusev, Marjan, Zdraveski, Vladimir. 2018. Parallel Implementation of K-Nearest-Neighbors for Face Recognition. 2018 26th Telecommunications Forum (TELFOR). :1–4.

Face recognition is a fast-expanding field of research. Countless classification algorithms have found use in face recognition, with more still being developed in search of better performance and accuracy. For high-dimensional data such as images, the K-Nearest-Neighbours classifier is a tempting choice. However, it is very computationally intensive, as it has to perform calculations on all items in the stored dataset for each classification it makes. Fortunately, the process can be sped up by performing some of the calculations in parallel. We propose a parallel CUDA implementation of the KNN classifier and then compare it to a serial implementation to demonstrate its performance superiority.
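
To make the cost described above concrete, here is a minimal serial sketch of brute-force KNN in Python. It is not the authors' CUDA implementation; the function name `knn_classify` and the toy data are assumptions made for illustration. The distance loop is the step whose iterations are independent of one another, which is what makes it a natural candidate for parallel execution.

```python
import math
from collections import Counter

def knn_classify(train, labels, query, k=3):
    """Brute-force KNN: one distance per stored item, then a majority vote."""
    # Each distance depends only on one stored item and the query, so a
    # parallel version could compute these entries concurrently.
    dists = [math.dist(x, query) for x in train]
    # Indices of the k stored items closest to the query.
    nearest = sorted(range(len(train)), key=dists.__getitem__)[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: two well-separated clusters.
train = [(0.0, 0.0), (0.1, 0.2), (1.0, 1.0), (0.9, 1.1)]
labels = ["a", "a", "b", "b"]
```

Calling `knn_classify(train, labels, (0.05, 0.1))` touches every stored item once, which is the per-classification cost the abstract refers to.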

Shah, Mujahid, Ahmed, Sheeraz, Saeed, Khalid, Junaid, Muhammad, Khan, Hamayun, Ata-ur-rehman. 2019. Penetration Testing Active Reconnaissance Phase – Optimized Port Scanning With Nmap Tool. 2019 2nd International Conference on Computing, Mathematics and Engineering Technologies (iCoMET). :1–6.

Reconnaissance might be the longest phase, sometimes taking weeks or months. The black hat makes use of passive information gathering techniques. Once the attacker has sufficient information, the attacker starts scanning perimeter and internal network devices, seeking out open ports and related services. In this paper we account for the traffic generated and the time needed to complete specific tasks during active scanning with the Nmap tool in the reconnaissance phase, and we propose strategies for dealing with large volumes of hosts while conserving network traffic as well as the time taken by each task.

Desmoulins, Nicolas, Diop, Aïda, Rafflé, Yvan, Traoré, Jacques, Gratesac, Josselin. 2019. Practical Anonymous Attestation-based Pseudonym Schemes for Vehicular Networks. 2019 IEEE Vehicular Networking Conference (VNC). :1–8.

Vehicular communication systems increase traffic efficiency and safety by allowing vehicles to share safety-related information and location-based services. Pseudonym schemes are the standard solutions providing driver/vehicle anonymity, whilst enforcing vehicle accountability in case of liability issues. State-of-the-art PKI-based pseudonym schemes present scalability issues, notably due to the centralized architecture of certificate-based solutions. The first Direct Anonymous Attestation (DAA)-based pseudonym scheme was introduced at VNC 2017, providing a decentralized approach to the pseudonym generation and update phases. The DAA-based construction leverages the properties of trusted computing, allowing vehicles to autonomously generate their own pseudonyms by using a (resource constrained) Trusted Hardware Module or Component (TC). This proposition however requires the TC to delegate part of the (heavy) pseudonym generation computations to the (more powerful) vehicle's On-Board Unit (OBU), introducing security and privacy issues in case the OBU becomes compromised. In this paper, we introduce a novel pseudonym scheme based on a variant of DAA, namely a pre-DAA-based pseudonym scheme. All secure computations in the pre-DAA pseudonym lifecycle are executed by the secure element, thus creating a secure enclave for pseudonym generation, update, and revocation. We instantiate vehicle-to-everything (V2X) with our pre-DAA solution, thus ensuring user anonymity and user-controlled traceability within the vehicular network. In addition, the pre-DAA-based construction transfers accountability from the vehicle to the user, thus complying with the many-to-many driver/vehicle relation. We demonstrate the efficiency of our solution with a prototype implementation on a standard Javacard (acting as a TC), showing that messages can be anonymously signed and verified in less than 50 ms.

Tabiban, Azadeh, Majumdar, Suryadipta, Wang, Lingyu, Debbabi, Mourad. 2018. PERMON: An OpenStack Middleware for Runtime Security Policy Enforcement in Clouds. 2018 IEEE Conference on Communications and Network Security (CNS). :1–7.

To ensure the accountability of a cloud environment, security policies may be provided as a set of properties to be enforced by cloud providers. However, due to the sheer size of clouds, it can be challenging to provide timely responses to all the requests coming from cloud users at runtime. In this paper, we design and implement a middleware, PERMON, as a pluggable interface to OpenStack for intercepting and verifying the legitimacy of user requests at runtime, while leveraging our previous work on proactive security verification to improve the efficiency. We describe detailed implementation of the middleware and demonstrate its usefulness through a use case.

Singh, Kunal, Mathai, K. James. 2019. Performance Comparison of Intrusion Detection System Between Deep Belief Network (DBN) Algorithm and State Preserving Extreme Learning Machine (SPELM) Algorithm. 2019 IEEE International Conference on Electrical, Computer and Communication Technologies (ICECCT). :1–7.

This paper compares the performance of an intrusion detection system under the DBN algorithm and the SPELM algorithm. Researchers have used the new SPELM algorithm to perform experiments in face recognition, pedestrian detection, and network intrusion detection in the area of cyber security. The scholar used the proposed State Preserving Extreme Learning Machine (SPELM) algorithm as a machine learning classifier and compared its performance with the Deep Belief Network (DBN) algorithm using the NSL-KDD dataset. The NSL-KDD dataset has four lakh (400,000) data records, of which 40% were used for training and 60% for testing when calculating the performance of both algorithms. The experiment compared the accuracy, precision, recall, and computational time of the existing DBN algorithm with those of the proposed SPELM algorithm. The findings show better performance for SPELM: an accuracy of 93.20% against 52.8% for DBN, a precision of 69.492 against 66.836 for DBN, and a computational time of 90.8 seconds against 102 seconds for DBN.

Saccente, Nicholas, Dehlinger, Josh, Deng, Lin, Chakraborty, Suranjan, Xiong, Yin. 2019. Project Achilles: A Prototype Tool for Static Method-Level Vulnerability Detection of Java Source Code Using a Recurrent Neural Network. 2019 34th IEEE/ACM International Conference on Automated Software Engineering Workshop (ASEW). :114–121.

Software has become an essential component of modern life, but when software vulnerabilities threaten the security of users, new ways of analyzing for software security must be explored. Using the National Institute of Standards and Technology's Juliet Java Suite, containing thousands of examples of defective Java methods for a variety of vulnerabilities, a prototype tool was developed implementing an array of Long-Short Term Memory Recurrent Neural Networks to detect vulnerabilities within source code. The tool employs various data preparation methods to be independent of coding style and to automate the process of extracting methods, labeling data, and partitioning the dataset. The result is a prototype command-line utility that generates an n-dimensional vulnerability prediction vector. The experimental evaluation using 44,495 test cases indicates that the tool can achieve an accuracy higher than 90% for 24 out of 29 different types of CWE vulnerabilities.

Zhou, Zichao, An, Changqing, Yang, Jiahai. 2018. A Programmable Network Management Architecture for Address Driven Network. 2018 10th International Conference on Communications, Circuits and Systems (ICCCAS). :199–206.

The operation and management of networks face increasing complexity brought about by the evolution of network protocols and the demand for rapid service delivery. In this paper, we propose a programmable network management architecture, which manages the network based on the NETCONF protocol and provides REST APIs to the upper layer so that further programming can be done on top of the APIs to implement flexible management. Functions of devices can be modeled in the YANG language, and the models can be translated into REST APIs. We apply the architecture to the management of ADN (Address Driven Network), an innovative network architecture proposed by Tsinghua University to inhibit IP spoofing, improve network security and provide high service quality. We model the functions of ADN in the YANG language and implement the network management functions on top of the REST APIs. We deploy and evaluate the architecture in a laboratory environment. The test results show that the programmable network management architecture makes it flexible to implement management for new network services.

Shuvro, Rezoan A., Das, Pankaz, Hayat, Majeed M., Talukder, Mitun. 2019. Predicting Cascading Failures in Power Grids using Machine Learning Algorithms. 2019 North American Power Symposium (NAPS). :1–6.

Although there has been notable progress in modeling cascading failures in power grids, few works have used machine learning algorithms. In this paper, cascading failures that lead to massive blackouts in power grids are predicted and classified into no, small, and large cascades using machine learning algorithms. Cascading-failure data is generated using a cascading failure simulator framework developed earlier. The data set includes, as features, the power grid operating parameters such as loading level, level of load shedding, and the capacity of the failed lines, and the topological parameters such as edge betweenness centrality and the average shortest distance, for numerous combinations of two transmission line failures. Several machine learning algorithms are then used to classify cascading failures. Further, linear regression is used to predict the number of failed transmission lines and the amount of load shedding during a cascade based on initial feature values. This data-driven technique can be used to generate cascading-failure data sets for any real-world power grid; power-grid engineers can therefore use this approach for cascade data generation, for predicting vulnerabilities, and for enhancing the robustness of the grid.

Chen, Lin, Atwood, J. William. 2018. Performance Evaluation for Secure Internet Group Management Protocol and Group Security Association Management Protocol. 2018 IEEE Canadian Conference on Electrical Computer Engineering (CCECE). :1–5.

Multicast distribution employs the model of many-to-many so that it is a more efficient way of data delivery compared to traditional one-to-one unicast distribution, which can benefit many applications such as media streaming. However, the lack of security features in its nature makes multicast technology much less popular in an open environment such as the Internet. Internet Service Providers (ISPs) take advantage of IP multicast technology's high efficiency of data delivery to provide Internet Protocol Television (IPTV) to their users. But without the full control on their networks, ISPs cannot collect revenue for the services they provide. Secure Internet Group Management Protocol (SIGMP), an extension of Internet Group Management Protocol (IGMP), and Group Security Association Management Protocol (GSAM), have been proposed to enforce receiver access control at the network level of IP multicast. In this paper, we analyze operational details and issues of both SIGMP and GSAM. An examination of the performance of both protocols is also conducted.

Rahman, Lamiya, Adan, Jannatul, Nahid-Al-Masood, Deeba, Shohana Rahman. 2018. Performance Analysis of Floating Buoy Point Absorber and Oscillating Surge Wave Energy Converters in Onshore and Offshore Locations. 2018 10th International Conference on Electrical and Computer Engineering (ICECE). :233–236.

The aim of this paper is to explore the performance of two well-known wave energy converters (WECs), namely the Floating Buoy Point Absorber (FBPA) and the Oscillating Surge (OS), in onshore and offshore locations. To achieve clean energy targets by reducing greenhouse gas emissions, integration of renewable energy resources is continuously increasing all around the world. In addition to widespread renewable energy sources such as wind and solar photovoltaic (PV), wave energy extracted from the ocean is becoming more tangible day by day. In the literature, a number of WEC devices are reported. However, further investigations are still needed to better understand the behavior of the FBPA WEC and the OS WEC under irregular wave conditions in onshore and offshore locations. Note that, being surrounded by the Bay of Bengal, Bangladesh has huge scope for utilizing wave power. To this end, the FBPA WEC and the OS WEC are simulated using the typical onshore and offshore wave height and wave period of the coastal area of Bangladesh. Afterwards, the performance of the two WECs is compared by analyzing their power output.

Khan, Muhammad Imran, Foley, Simon N., O'Sullivan, Barry. 2019. PriDe: A Quantitative Measure of Privacy-Loss in Interactive Querying Settings. 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1–5.

This paper presents PriDe, a model to measure the deviation of an analyst's (user's) querying behaviour from normal querying behaviour. The deviation is measured in terms of privacy, that is to say, how much privacy loss has been incurred due to this shift in querying behaviour. The shift is represented as a privacy-loss score: the higher the score, the greater the loss in privacy. The querying behaviour of analysts is modelled using n-grams of SQL queries, and behavioural profiles are subsequently constructed. Profiles are then compared in terms of privacy, resulting in a quantified score indicating the privacy loss.
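
The profile-and-compare idea can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: queries are tokenised on whitespace, profiles are bigram counts, and the `deviation` function is a simple stand-in (the share of observed n-grams never seen in the baseline), not the privacy-loss score defined by PriDe. All function names are hypothetical.

```python
from collections import Counter

def ngrams(tokens, n=2):
    """All contiguous n-token windows of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def profile(queries, n=2):
    """Behavioural profile: n-gram counts over whitespace-tokenised queries."""
    counts = Counter()
    for q in queries:
        counts.update(ngrams(q.lower().split(), n))
    return counts

def deviation(baseline, observed):
    """Share of observed n-gram occurrences absent from the baseline (0.0 to 1.0).

    A stand-in comparison for illustration, not PriDe's scoring function.
    """
    total = sum(observed.values())
    if total == 0:
        return 0.0
    unseen = sum(c for gram, c in observed.items() if gram not in baseline)
    return unseen / total

normal = profile(["SELECT name FROM staff"])
shifted = profile(["SELECT salary FROM staff"])
```

Here `deviation(normal, normal)` is zero, while `deviation(normal, shifted)` is positive because the query now touches a column that the normal profile never referenced.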

Kundu, Suprateek, Suthaharan, Shan. 2019. Privacy-Preserving Predictive Model Using Factor Analysis for Neuroscience Applications. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :67–73.

The purpose of this article is to present an algorithm which maximizes prediction accuracy under a linear regression model while preserving data privacy. This approach anonymizes the data such that the privacy of the original features is fully guaranteed, and the deterioration in predictive accuracy using the anonymized data is minimal. The proposed algorithm employs two stages: the first stage uses a probabilistic latent factor approach to anonymize the original features into a collection of lower dimensional latent factors, while the second stage uses an optimization algorithm to tune the anonymized data further, in a way which ensures a minimal loss in prediction accuracy under the predictive approach specified by the user. We demonstrate the advantages of our approach via numerical studies and apply our method to high-dimensional neuroimaging data where the goal is to predict the behavior of adolescents and teenagers based on functional magnetic resonance imaging (fMRI) measurements.

To, Hien, Shahabi, Cyrus, Xiong, Li. 2018. Privacy-Preserving Online Task Assignment in Spatial Crowdsourcing with Untrusted Server. 2018 IEEE 34th International Conference on Data Engineering (ICDE). :833–844.

With spatial crowdsourcing (SC), requesters outsource their spatiotemporal tasks (tasks associated with location and time) to a set of workers, who will perform the tasks by physically traveling to the tasks' locations. However, current solutions require the locations of the workers and/or the tasks to be disclosed to untrusted parties (SC server) for effective assignments of tasks to workers. In this paper we propose a framework for assigning tasks to workers in an online manner without compromising the location privacy of workers and tasks. We perturb the locations of both tasks and workers based on geo-indistinguishability and then devise techniques to quantify the probability of reachability between a task and a worker, given their perturbed locations. We investigate both analytical and empirical models for quantifying the worker-task pair reachability and propose task assignment strategies that strike a balance among various metrics such as the number of completed tasks, worker travel distance and system overhead. Extensive experiments on real-world datasets show that our proposed techniques result in minimal disclosure of task locations and no disclosure of worker locations without significantly sacrificing the total number of assigned tasks.
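
Geo-indistinguishability is commonly achieved with planar Laplace noise. The Python sketch below assumes that mechanism and flat (x, y) coordinates in metres; it does not reproduce the paper's assignment strategies or reachability models, and the function name `perturb` is hypothetical. The noise radius is drawn from a Gamma distribution with shape 2 and scale 1/epsilon, which is the radial marginal of the planar Laplace density.

```python
import math
import random

def perturb(x, y, epsilon, rng=random):
    """Add planar Laplace noise to a point in a flat plane (units: metres)."""
    # The direction of the displacement is uniform on the circle.
    theta = rng.uniform(0.0, 2.0 * math.pi)
    # Radial density of the planar Laplace is eps^2 * r * exp(-eps * r),
    # i.e. Gamma(shape=2, scale=1/eps); its mean displacement is 2/eps.
    r = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + r * math.cos(theta), y + r * math.sin(theta)
```

With `epsilon = 0.01` per metre the expected displacement is 200 m. A smaller epsilon gives stronger privacy but makes the reachability between a perturbed worker and a perturbed task harder to estimate, which is the trade-off the abstract describes.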

Wang, Chong Xiao, Song, Yang, Tay, Wee Peng. 2018. PRESERVING PARAMETER PRIVACY IN SENSOR NETWORKS. 2018 IEEE Global Conference on Signal and Information Processing (GlobalSIP). :1316–1320.

We consider the problem of preserving the privacy of a set of private parameters while allowing inference of a set of public parameters based on observations from sensors in a network. We assume that the public and private parameters are correlated with the sensor observations via a linear model. We define the utility loss and privacy gain functions based on the Cramér-Rao lower bounds for estimating the public and private parameters, respectively. Our goal is to minimize the utility loss while ensuring that the privacy gain is no less than a predefined privacy gain threshold, by allowing each sensor to perturb its own observation before sending it to the fusion center. We propose methods to determine the amount of noise each sensor needs to add to its observation under the cases where prior information is available or unavailable.

Sule, Rupali, Chaudhari, Sangita. 2018. Preserving Location Privacy in Geosocial Applications using Error Based Transformation. 2018 International Conference on Smart City and Emerging Technology (ICSCET). :1–4.

Geo-social applications constantly share a user's current geographic location (latitude and longitude). Many people can use such applications to get information about their surroundings with the help of their friends' locations and recommendations. Without any privacy protection, however, these systems can easily be misused to track users. We propose an Error Based Transformation (ERB) approach for location transformation, which provides significantly improved location privacy without adding uncertainty to query results or relying on strong assumptions about server security. The key insight is to apply secure, user-specific, distance-preserving coordinate transformations to all location data shared with the server. Only the friends of a user can obtain the exact coordinates, by applying the inverse transformation with a secret key shared with them. Servers can evaluate all location queries correctly on transformed data. The ERB privacy mechanism guarantees that servers are unable to see or infer actual location data from the transformed data. The mechanism is successful against a powerful adversary model, and prototype measurements show that it adds very little performance overhead, making it suitable for today's mobile devices.
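
One simple distance-preserving coordinate transformation is a rotation followed by a translation. The Python sketch below uses that generic transform to illustrate the idea; it is not the ERB scheme itself, it works on flat (x, y) coordinates rather than latitude and longitude, and the key layout `(angle, dx, dy)` and the function names are assumptions.

```python
import math

def transform(point, key):
    """Rotate by the secret angle, then shift by the secret offset."""
    angle, dx, dy = key
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y + dx, s * x + c * y + dy)

def inverse(point, key):
    """Undo the shift, then rotate back; only key holders can do this."""
    angle, dx, dy = key
    x, y = point[0] - dx, point[1] - dy
    c, s = math.cos(angle), math.sin(angle)
    return (c * x + s * y, -s * x + c * y)

# Hypothetical secret key shared between a user and their friends.
key = (1.234, 5000.0, -320.0)
alice, bob = (10.0, 20.0), (13.0, 24.0)
```

Because rotation and translation preserve distances, `math.dist(transform(alice, key), transform(bob, key))` equals `math.dist(alice, bob)` up to rounding, so a server could answer distance-based queries on transformed points without seeing the originals.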

Oest, Adam, Safaei, Yeganeh, Doupé, Adam, Ahn, Gail-Joon, Wardman, Brad, Tyers, Kevin. 2019. PhishFarm: A Scalable Framework for Measuring the Effectiveness of Evasion Techniques against Browser Phishing Blacklists. 2019 IEEE Symposium on Security and Privacy (SP). :1344–1361.

Phishing attacks have reached record volumes in recent years. Simultaneously, modern phishing websites are growing in sophistication by employing diverse cloaking techniques to avoid detection by security infrastructure. In this paper, we present PhishFarm: a scalable framework for methodically testing the resilience of anti-phishing entities and browser blacklists to attackers' evasion efforts. We use PhishFarm to deploy 2,380 live phishing sites (on new, unique, and previously-unseen .com domains) each using one of six different HTTP request filters based on real phishing kits. We reported subsets of these sites to 10 distinct anti-phishing entities and measured both the occurrence and timeliness of native blacklisting in major web browsers to gauge the effectiveness of protection ultimately extended to victim users and organizations. Our experiments revealed shortcomings in current infrastructure, which allows some phishing sites to go unnoticed by the security community while remaining accessible to victims. We found that simple cloaking techniques representative of real-world attacks- including those based on geolocation, device type, or JavaScript- were effective in reducing the likelihood of blacklisting by over 55% on average. We also discovered that blacklisting did not function as intended in popular mobile browsers (Chrome, Safari, and Firefox), which left users of these browsers particularly vulnerable to phishing attacks. Following disclosure of our findings, anti-phishing entities are now better able to detect and mitigate several cloaking techniques (including those that target mobile users), and blacklisting has also become more consistent between desktop and mobile platforms- but work remains to be done by anti-phishing entities to ensure users are adequately protected. Our PhishFarm framework is designed for continuous monitoring of the ecosystem and can be extended to test future state-of-the-art evasion techniques used by malicious websites.

Liu, Shidong, Bu, Xiande. 2019. Performance Modeling and Assessment of Unified Video Surveillance System Based on Ubiquitous SG-eIoT. 2019 IEEE International Conference on Energy Internet (ICEI). :238–243.

The video surveillance system is an important application on the ubiquitous SG-eIoT. This paper presents a comparative analysis of the traditional video surveillance scheme and the unified video surveillance solution in the eIoT environment. Network load and service latency under the two schemes are theoretically modeled and simulated. Based on the simulation results, suggestions are given for the access of video terminals in the ubiquitous eIoT.

Kikuchi, Masato, Okubo, Takao. 2019. Power of Communication Behind Extreme Cybersecurity Incidents. 2019 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech). :315–319.

Threats to cyberspace are increasing. This paper tries to identify how extreme cybersecurity incidents occur, based on the scenario of a targeted attack through emails. Knowledge of how extreme cybersecurity incidents occur helps in identifying the key points at which they can be prevented. The model, which applies a systems thinking approach to understanding how communication influences entities and how tiny initiating events scale up into extreme events, provides a condensed picture of cyberspace and the surrounding threats. By taking into consideration the cyberspace layers and characteristics identified by this model, it predicts the most suitable risk mitigations.

Huang, Yongjie, Qin, Jinghui, Wen, Wushao. 2019. Phishing URL Detection Via Capsule-Based Neural Network. 2019 IEEE 13th International Conference on Anti-counterfeiting, Security, and Identification (ASID). :22–26.

As a cyber attack which leverages social engineering and other sophisticated techniques to steal sensitive information from users, phishing has been a critical threat to cyber security for a long time. Although researchers have proposed many countermeasures, phishing criminals eventually find ways around them, since such countermeasures require substantial manual feature engineering and cannot detect newly emerging phishing attacks well enough. This makes developing an efficient and effective phishing detection method an urgent need. In this work, we propose a novel phishing website detection approach based on analyzing the Uniform Resource Locator (URL) of a website, which has proved to be an effective and efficient detection approach. Specifically, our capsule-based neural network includes several parallel branches, in each of which one convolutional layer extracts shallow features from URLs and the subsequent two capsule layers generate accurate feature representations of URLs from the shallow features and discriminate the legitimacy of URLs. The final output of our approach is obtained by averaging the outputs of all branches. Extensive experiments on a validated dataset collected from the Internet demonstrate that our approach can achieve competitive performance against other state-of-the-art detection methods while maintaining a tolerable time overhead.
