Biblio
Nowadays, cyber attacks affect many institutions and individuals, and they result in a serious financial loss for them. Phishing Attack is one of the most common types of cyber attacks which is aimed at exploiting people's weaknesses to obtain confidential information about them. This type of cyber attack threats almost all internet users and institutions. To reduce the financial loss caused by this type of attacks, there is a need for awareness of the users as well as applications with the ability to detect them. In the last quarter of 2016, Turkey appears to be second behind China with an impact rate of approximately 43% in the Phishing Attack Analysis report between 45 countries. In this study, firstly, the characteristics of this type of attack are explained, and then a machine learning based system is proposed to detect them. In the proposed system, some features were extracted by using Natural Language Processing (NLP) techniques. The system was implemented by examining URLs used in Phishing Attacks before opening them with using some extracted features. Many tests have been applied to the created system, and it is seen that the best algorithm among the tested ones is the Random Forest algorithm with a success rate of 89.9%.
NoSQL databases have become popular with enterprises due to their scalable and flexible storage management of big data. Nevertheless, their popularity also brings up security concerns. Most NoSQL databases lacked secure data encryption, relying on developers to implement cryptographic methods at application level or middleware layer as a wrapper around the database. While this approach protects the integrity of data, it increases the difficulty of executing queries. We were motivated to design a system that not only provides NoSQL databases with the necessary data security, but also supports the execution of query over encrypted data. Furthermore, how to exploit the distributed fashion of NoSQL databases to deliver high performance and scalability with massive client accesses is another important challenge. In this research, we introduce Crypt-NoSQL, the first prototype to support execution of query over encrypted data on NoSQL databases with high performance. Three different models of Crypt-NoSQL were proposed and performance was evaluated with Yahoo! Cloud Service Benchmark (YCSB) considering an enormous number of clients. Our experimental results show that Crypt-NoSQL can process queries over encrypted data with high performance and scalability. A guidance of establishing service level agreement (SLA) for Crypt-NoSQL as a cloud service is also proposed.
Reliable and scalable storage systems are key to cloud-based applications. In cloud storage, users store their data on remote servers rather than their local computers. Secure storage is used to ensure the safety of data in clouds. As more and more users rely on third-party cloud vendors to store their data, concerns have arisen among users and cloud providers. Encryption-based approaches are commonly used in secure storage systems. Data are encrypted and stored on persistent storage like disks and flash memories. When data are needed by the users, they are decrypted and accessed by the users. This way of managing data hurts the scalability and throughput of cloud systems. In the meantime, cloud systems have to perform fault-tolerance strategies on data, which also brings performance deduction. The combination of these issues cause a high price for data security in cloud systems. Aware of such issues. we propose methods to reduce the overhead of secure storage while guaranteeing the safeness of data.
During the development and expansion of Internet of Things (IoT), main challenges needing to be addressed are the heterogeneity, interoperability, scalability, flexibility and security of IoT applications. In this paper, we view IoT as a large-scale distributed cyber-physical-social complex network. From that perspective, the above challenges are analyzed. Then, we propose a distributed multi-agent architecture to unify numbers of different IoT applications by designing the software-defined sensors, auctuators and controllers. Furthermore, we analyze the proposed architecture and clarify why and how it can tackle the heterogeneity of IoT applications, enable them to interoperate with each other, make it efficient to introduce new applications, and enhance the flexibility and security of different applications. Finally, the use case of smart home with multiple applications is applied to verify the feasibility of the proposed solution for IoT architecture.
This paper describes a data driven approach to studying the science of cyber security (SoS). It argues that science is driven by data. It then describes issues and approaches towards the following three aspects: (i) Data Driven Science for Attack Detection and Mitigation, (ii) Foundations for Data Trustworthiness and Policy-based Sharing, and (iii) A Risk-based Approach to Security Metrics. We believe that the three aspects addressed in this paper will form the basis for studying the Science of Cyber Security.
The performance, dependability, and security of cloud service systems are vital for the ongoing operation, control, and support. Thus, controlled improvement in service requires a comprehensive analysis and systematic identification of the fundamental underlying constituents of cloud using a rigorous discipline. In this paper, we introduce a framework which helps identifying areas for potential cloud service enhancements. A cloud service cannot be completed if there is a failure in any of its underlying resources. In addition, resources are kept offline for scheduled maintenance. We use redundant resources to mitigate the impact of failures/maintenance for ensuring performance and dependability; which helps enhancing security as well. For example, at least 4 replicas are required to defend the intrusion of a single instance or a single malicious attack/fault as defined by Byzantine Fault Tolerance (BFT). Data centers with high performance, dependability, and security are outsourced to the cloud computing environment with greater flexibility of cost of owing the computing infrastructure. In this paper, we analyze the effectiveness of redundant resource usage in terms of dependability metric and cost of service deployment based on the priority of service requests. The trade-off among dependability, cost, and security under different redundancy schemes are characterized through the comprehensive analytical models.
Cyber infrastructures are highly vulnerable to intrusions and other threats. The main challenges in cloud computing are failure of data centres and recovery of lost data and providing a data security system. This paper has proposed a Virtualization and Data Recovery to create a virtual environment and recover the lost data from data servers and agents for providing data security in a cloud environment. A Cloud Manager is used to manage the virtualization and to handle the fault. Erasure code algorithm is used to recover the data which initially separates the data into n parts and then encrypts and stores in data servers. The semi trusted third party and the malware changes made in data stored in data centres can be identified by Artificial Intelligent methods using agents. Java Agent Development Framework (JADE) is a tool to develop agents and facilitates the communication between agents and allows the computing services in the system. The framework designed and implemented in the programming language JAVA as gateway or firewall to recover the data loss.
In this paper a method of monostatic RCS measuring in real conditions for complex shaped objects is proposed. The basic idea of the method is to provide measuring in near field zone for different parts of the object (fragments) separately. This technique is titled "decomposition method". After such measurements all RCS data are summed and one can obtain the average RCS of investigated object. Such method is much more accessible in comparison with natural measurements in far field zone. In this paper the decomposition method is tested numerically. For this a model of complex shape object (tank T-90) is divided into the fragments for some direction of view. It is shown that the sum of RCS of the fragments is close to the full object RCS for corresponding direction.
Fifty-four percent of the global email traffic in October 2016 was spam and phishing messages. Those emails were commonly sent from compromised email accounts. Previous research has primarily focused on detecting incoming junk mail but not locally generated spam messages. State-of-the-art spam detection methods generally require the content of the email to be able to classify it as either spam or a regular message. This content is not available within encrypted messages or is prohibited due to data privacy. The object of the research presented is to detect an anomaly with the Origin-Destination Delivery Notification method, which is based on the geographical origin and destination as well as the Delivery Status Notification of the remote SMTP server without the knowledge of the email content. The proposed method detects an abused account after a few transferred emails; it is very flexible and can be adjusted for every environment and requirement.
As a plethora of wearable devices are being introduced, significant concerns exist on the privacy and security of personal data stored on these devices. Expanding on recent works of using electrocardiogram (ECG) as a modality for biometric authentication, in this work, we investigate the possibility of using personal ECG signals as the individually unique source for physical unclonable function (PUF), which eventually can be used as the key for encryption and decryption engines. We present new signal processing and machine learning algorithms that learn and extract maximally different ECG features for different individuals and minimally different ECG features for the same individual over time. Experimental results with a large 741-subject in-house ECG database show that the distributions of the intra-subject (same person) Hamming distance of extracted ECG features and the inter-subject Hamming distance have minimal overlap. 256-b random numbers generated from the ECG features of 648 (out of 741) subjects pass the NIST randomness tests.
The best practice to prevent Cross Site Scripting (XSS) attacks is to apply encoders to sanitize untrusted data. To balance security and functionality, encoders should be applied to match the web page context, such as HTML body, JavaScript, and style sheets. A common programming error is the use of a wrong encoder to sanitize untrusted data, leaving the application vulnerable. We present a security unit testing approach to detect XSS vulnerabilities caused by improper encoding of untrusted data. Unit tests for the XSS vulnerability are automatically constructed out of each web page and then evaluated by a unit test execution framework. A grammar-based attack generator is used to automatically generate test inputs. We evaluate our approach on a large open source medical records application, demonstrating that we can detect many 0-day XSS vulnerabilities with very low false positives, and that the grammar-based attack generator has better test coverage than industry best practices.
Reliable detection of intrusion is the basis of safety in cognitive radio networks (CRNs). So far, few scholars applied intrusion detection systems (IDSs) to combat intrusion against CRNs. In order to improve the performance of intrusion detection in CRNs, a distributed intrusion detection scheme has been proposed. In this paper, a method base on Dempster-Shafer's (D-S) evidence theory to detect intrusion in CRNs is put forward, in which the detection data and credibility of different local IDS Agent is combined by D-S in the cooperative detection center, so that different local detection decisions are taken into consideration in the final decision. The effectiveness of the proposed scheme is verified by simulation, and the results reflect a noticeable performance improvement between the proposed scheme and the traditional method.
Many IoT devices are part of fixed critical infrastructure, where the mere act of moving an IoT device may constitute an attack. Moving pressure, chemical and radiation sensors in a factory can have devastating consequences. Relocating roadside speed sensors, or smart meters without knowledge of command and control center can similarly wreck havoc. Consequently, authenticating geolocation of IoT devices is an important problem. Unfortunately, an IoT device itself may be compromised by an adversary. Hence, location information from the IoT device cannot be trusted. Thus, we have to rely on infrastructure to obtain a proximal location. Infrastructure routers may similarly be compromised. Therefore, there must be a way to authenticate trusted routers remotely. Unfortunately, IP packets may be blocked, hijacked or forged by an adversary. Therefore IP packets are not trustworthy either. Thus, we resort to covert channels for authenticating Internet packet routers as an intermediate step towards proximal geolocation of IoT devices. Several techniques have been proposed in the literature to obtain the geolocation of an edge device, but it has been shown that a knowledgeable adversary can circumvent these techniques. In this paper, we survey the state-of-the-art geolocation techniques and corresponding adversarial countermeasures to evade geolocation to justify the use of covert channels on networks. We propose a technique for determining proximal geolocation using covert channel. Challenges and directions for future work are also explored.
Encryption is often not sufficient to secure communication, since it does not hide that communication takes place or who is communicating with whom. Covert channels hide the very existence of communication enabling individuals to communicate secretly. Previous work proposed a covert channel hidden inside multi-player first person shooter online game traffic (FPSCC). FPSCC has a low bit rate, but it is practically impossible to eliminate other than by blocking the overt game trac. This paper shows that with knowledge of the channel’s encoding and using machine learning techniques, FPSCC can be detected with an accuracy of 95% or higher.
Multi-agent simulations are useful for exploring collective patterns of individual behavior in social, biological, economic, network, and physical systems. However, there is no provenance support for multi-agent models (MAMs) in a distributed setting. To this end, we introduce ProvMASS, a novel approach to capture provenance of MAMs in a distributed memory by combining inter-process identification, lightweight coordination of in-memory provenance storage, and adaptive provenance capture. ProvMASS is built on top of the Multi-Agent Spatial Simulation (MASS) library, a framework that combines multi-agent systems with large-scale fine-grained agent-based models, or MAMs. Unlike other environments supporting MAMs, MASS parallelizes simulations with distributed memory, where agents and spatial data are shared application resources. We evaluate our approach with provenance queries to support three use cases and performance measures. Initial results indicate that our approach can support various provenance queries for MAMs at reasonable performance overhead.
Today's control systems such as smart environments have the ability to adapt to their environment in order to achieve a set of objectives (e.g., comfort, security and energy savings). This is done by changing their behaviour upon the occurrence of specific events. Building such a system requires to design and implement autonomic loops that collect events and measurements, make decisions and execute the corresponding actions.The design and the implementation of such loops are made difficult by several factors: the complexity of systems with multiple objectives, the risk of conflicting decisions between multiple loops, the inconsistencies that can result from communication errors and hardware failures and the heterogeneity of the devices.In this paper, we propose a design framework for reliable and self-adaptive systems, where multiple autonomic loops can be composed into complex managers, and we consider its application to smart environments. We build upon the proposed framework a generic autonomic loop which combines an automata-based controller that makes correct and coherent decisions, a transactional execution mechanism that avoids inconsistencies, and an abstraction layer that hides the heterogeneity of the devices.We propose patterns for composition of such loops, in parallel, coordinated, and hierarchically, with benefits from the leveraging of automata-based modular constructs, that provides for guarantees on the correct behaviour of the controlled system. We implement our framework with the transactional middleware LINC, the reactive language Heptagon/BZR and the abstraction framework PUTUTU. A case study in the field of building automation is presented to illustrate the proposed framework.
Recently, the increase of interconnectivity has led to a rising amount of IoT enabled devices in botnets. Such botnets are currently used for large scale DDoS attacks. To keep track with these malicious activities, Honeypots have proven to be a vital tool. We developed and set up a distributed and highly-scalable WAN Honeypot with an attached backend infrastructure for sophisticated processing of the gathered data. For the processed data to be understandable we designed a graphical frontend that displays all relevant information that has been obtained from the data. We group attacks originating in a short period of time in one source as sessions. This enriches the data and enables a more in-depth analysis. We produced common statistics like usernames, passwords, username/password combinations, password lengths, originating country and more. From the information gathered, we were able to identify common dictionaries used for brute-force login attacks and other more sophisticated statistics like login attempts per session and attack efficiency.
Botnets have been a serious threat to the Internet security. With the constant sophistication and the resilience of them, a new trend has emerged, shifting botnets from the traditional desktop to the mobile environment. As in the desktop domain, detecting mobile botnets is essential to minimize the threat that they impose. Along the diverse set of strategies applied to detect these botnets, the ones that show the best and most generalized results involve discovering patterns in their anomalous behavior. In the mobile botnet field, one way to detect these patterns is by analyzing the operation parameters of this kind of applications. In this paper, we present an anomaly-based and host-based approach to detect mobile botnets. The proposed approach uses machine learning algorithms to identify anomalous behaviors in statistical features extracted from system calls. Using a self-generated dataset containing 13 families of mobile botnets and legitimate applications, we were able to test the performance of our approach in a close-to-reality scenario. The proposed approach achieved great results, including low false positive rates and high true detection rates.
This paper design three distribution devices for the strong and smart grid, respectively are novel transformer with function of dc bias restraining, energy-saving contactor and controllable reactor with adjustable intrinsic magnetic state based on nanocomposite magnetic material core. The magnetic performance of this material was analyzed and the relationship between the remanence and coercivity was determined. The magnetization and demagnetization circuit for the nanocomposite core has been designed based on three-phase rectification circuit combined with a capacitor charging circuit. The remanence of the nanocomposite core can neutralize the dc bias flux occurred in transformer main core, can pull in the movable core of the contactor instead of the traditional fixed core and adjust the saturation degree of the reactor core. The electromagnetic design of the three distribution devices was conducted and the simulation, experiment results verify correctness of the design which provides intelligent and energy-saving power equipment for the smart power grids safe operation.
Attacks against websites are increasing rapidly with the expansion of web services. An increasing number of diversified web services make it difficult to prevent such attacks due to many known vulnerabilities in websites. To overcome this problem, it is necessary to collect the most recent attacks using decoy web honeypots and to implement countermeasures against malicious threats. Web honeypots collect not only malicious accesses by attackers but also benign accesses such as those by web search crawlers. Thus, it is essential to develop a means of automatically identifying malicious accesses from mixed collected data including both malicious and benign accesses. Specifically, detecting vulnerability scanning, which is a preliminary process, is important for preventing attacks. In this study, we focused on classification of accesses for web crawling and vulnerability scanning since these accesses are too similar to be identified. We propose a feature vector including features of collective accesses, e.g., intervals of request arrivals and the dispersion of source port numbers, obtained with multiple honeypots deployed in different networks for classification. Through evaluation using data collected from 37 honeypots in a real network, we show that features of collective accesses are advantageous for vulnerability scanning and crawler classification.
Intrusion detection has been an active field of research for more than 35 years. Numerous systems had been built based on the two fundamental detection principles, knowledge-based and behavior-based detection. Anyway, having a look at day-to-day news about data breaches and successful attacks, detection effectiveness is still limited. Even more, heavy-weight intrusion detection systems cannot be installed in every endangered environment. For example, Industrial Control Systems are typically utilized for decades, charging off huge investments of companies. Thus, some of these systems have been in operation for years, but were designed afore without security in mind. Even worse, as systems often have connections to other networks and even the Internet nowadays, an adequate protection is mandatory, but integrating intrusion detection can be extremely difficult - or even impossible to date. We propose a new lightweight current-based IDS which is using a difficult to manipulate measurement base and verifiable ground truth. Focus of our system is providing intrusion detection for ICS and SCADA on a low-priced base, easy to integrate. Dr. WATTson, a prototype implemented based on our concept provides high detection and low false alarm rates.
Many malware families utilize domain generation algorithms (DGAs) to establish command and control (C&C) connections. While there are many methods to pseudorandomly generate domains, we focus in this paper on detecting (and generating) domains on a per-domain basis which provides a simple and flexible means to detect known DGA families. Recent machine learning approaches to DGA detection have been successful on fairly simplistic DGAs, many of which produce names of fixed length. However, models trained on limited datasets are somewhat blind to new DGA variants. In this paper, we leverage the concept of generative adversarial networks to construct a deep learning based DGA that is designed to intentionally bypass a deep learning based detector. In a series of adversarial rounds, the generator learns to generate domain names that are increasingly more difficult to detect. In turn, a detector model updates its parameters to compensate for the adversarially generated domains. We test the hypothesis of whether adversarially generated domains may be used to augment training sets in order to harden other machine learning models against yet-to-be-observed DGAs. We detail solutions to several challenges in training this character-based generative adversarial network. In particular, our deep learning architecture begins as a domain name auto-encoder (encoder + decoder) trained on domains in the Alexa one million. Then the encoder and decoder are reassembled competitively in a generative adversarial network (detector + generator), with novel neural architectures and training strategies to improve convergence.
The traditional text classification methods usually follow this process: first, a sentence can be considered as a bag of words (BOW), then transformed into sentence feature vector which can be classified by some methods, such as maximum entropy (ME), Naive Bayes (NB), support vector machines (SVM), and so on. However, when these methods are applied to text classification, we usually can not obtain an ideal result. The most important reason is that the semantic relations between words is very important for text categorization, however, the traditional method can not capture it. Sentiment classification, as a special case of text classification, is binary classification (positive or negative). Inspired by the sentiment analysis, we use a novel deep learning-based recurrent neural networks (RNNs)model for automatic security audit of short messages from prisons, which can classify short messages(secure and non-insecure). In this paper, the feature of short messages is extracted by word2vec which captures word order information, and each sentence is mapped to a feature vector. In particular, words with similar meaning are mapped to a similar position in the vector space, and then classified by RNNs. RNNs are now widely used and the network structure of RNNs determines that it can easily process the sequence data. We preprocess short messages, extract typical features from existing security and non-security short messages via word2vec, and classify short messages through RNNs which accept a fixed-sized vector as input and produce a fixed-sized vector as output. The experimental results show that the RNNs model achieves an average 92.7% accuracy which is higher than SVM.