Biblio
The rapid development of Internet has resulted in massive information overloading recently. These information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional database, which typically is a challenging task. Some efforts have been made and solved this problem to some extent. However, most of them are implemented in a single machine, which is not suitable to handle large-scale database.In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to be implemented in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increase the efficiency of this index structure by a low cost in parallel implementation. Finally, experiments are conducted in both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.
To combat blind in-window attacks against TCP, changes proposed in RFC 5961 have been implemented by Linux since late 2012. While successfully eliminating the old vulnerabilities, the new TCP implementation was reported in August 2016 to have introduced a subtle yet serious security flaw. Assigned CVE-2016-5696, the flaw exploits the challenge ACK rate limiting feature that could allow an off-path attacker to infer the presence/absence of a TCP connection between two arbitrary hosts, terminate such a connection, and even inject malicious payload. In this work, we perform a comprehensive measurement of the impact of the new vulnerability. This includes (1) tracking the vulnerable Internet servers, (2) monitoring the patch behavior over time, (3) picturing the overall security status of TCP stacks at scale. Towards this goal, we design a scalable measurement methodology to scan the Alexa top 1 million websites for almost 6 months. We also present how notifications impact the patching behavior, and compare the result with the Heartbleed and the Debian PRNG vulnerability. The measurement represents a valuable data point in understanding how Internet servers react to serious security flaws in the operating system kernel.
Accurate short-term traffic flow forecasting is of great significance for real-time traffic control, guidance and management. The k-nearest neighbor (k-NN) model is a classic data-driven method which is relatively effective yet simple to implement for short-term traffic flow forecasting. For conventional prediction mechanism of k-NN model, the k nearest neighbors' outputs weighted by similarities between the current traffic flow vector and historical traffic flow vectors is directly used to generate prediction values, so that the prediction results are always not ideal. It is observed that there are always some outliers in k nearest neighbors' outputs, which may have a bad influences on the prediction value, and the local similarities between current traffic flow and historical traffic flows at the current sampling period should have a greater relevant to the prediction value. In this paper, we focus on improving the prediction mechanism of k-NN model and proposed a k-nearest neighbor locally search regression algorithm (k-LSR). The k-LSR algorithm can use locally search strategy to search for optimal nearest neighbors' outputs and use optimal nearest neighbors' outputs weighted by local similarities to forecast short-term traffic flow so as to improve the prediction mechanism of k-NN model. The proposed algorithm is tested on the actual data and compared with other algorithms in performance. We use the root mean squared error (RMSE) as the evaluation indicator. The comparison results show that the k-LSR algorithm is more successful than the k-NN and k-nearest neighbor locally weighted regression algorithm (k-LWR) in forecasting short-term traffic flow, and which prove the superiority and good practicability of the proposed algorithm.
The Internet of Things (IoT) has become ubiquitous in our daily life as billions of devices are connected through the Internet infrastructure. However, the rapid increase of IoT devices brings many non-traditional challenges for system design and implementation. In this paper, we focus on the hardware security vulnerabilities and ultra-low power design requirement of IoT devices. We briefly survey the existing design methods to address these issues. Then we propose an approximate computing based information hiding approach that provides security with low power. We demonstrate that this security primitive can be applied for security applications such as digital watermarking, fingerprinting, device authentication, and lightweight encryption.
Phishers often exploit users' trust on the appearance of a site by using webpages that are visually similar to an authentic site. In the past, various research studies have tried to identify and classify the factors contributing towards the detection of phishing websites. The focus of this research is to establish a strong relationship between those identified heuristics (content-based) and the legitimacy of a website by analyzing training sets of websites (both phishing and legitimate websites) and in the process analyze new patterns and report findings. Many existing phishing detection tools are often not very accurate as they depend mostly on the old database of previously identified phishing websites. However, there are thousands of new phishing websites appearing every year targeting financial institutions, cloud storage/file hosting sites, government websites, and others. This paper presents a framework called Phishing-Detective that detects phishing websites based on existing and newly found heuristics. For this framework, a web crawler was developed to scrape the contents of phishing and legitimate websites. These contents were analyzed to rate the heuristics and their contribution scale factor towards the illegitimacy of a website. The data set collected from Web Scraper was then analyzed using a data mining tool to find patterns and report findings. A case study shows how this framework can be used to detect a phishing website. This research is still in progress but shows a new way of finding and using heuristics and the sum of their contributing weights to effectively and accurately detect phishing websites. Further development of this framework is discussed at the end of the paper.
Physical-layer fingerprinting investigates how features extracted from radio signals can be used to uniquely identify devices. This paper proposes and analyses a novel methodology to fingerprint LoRa devices, which is inspired by recent advances in supervised machine learning and zero-shot image classification. Contrary to previous works, our methodology does not rely on localized and low-dimensional features, such as those extracted from the signal transient or preamble, but uses the entire signal. We have performed our experiments using 22 LoRa devices with 3 different chipsets. Our results show that identical chipsets can be distinguished with 59% to 99% accuracy per symbol, whereas chipsets from different vendors can be fingerprinted with 99% to 100% accuracy per symbol. The fingerprinting can be performed using only inexpensive commercial off-the-shelf software defined radios, and a low sample rate of 1 Msps. Finally, we release all datasets and code pertaining to these experiments to the public domain.
With the robots being applied for more and more fields, security issues attracted more attention. In this paper, we propose that the key center in cloud send a polynomial information to each robot component, the component would put their information into the polynomial to get a group of new keys using in next slot, then check and update its key groups after success in hash. Because of the degree of the polynomial is higher than the number of components, even if an attacker got all the key values of the components, he also cannot restore the polynomial. The information about the keys will be discarded immediately after used, so an attacker cannot obtain the session key used before by the invasion of a component. This article solves the security problems about robotic system caused by cyber-attacks and physical attacks.
As a consequence of the recent development of situational awareness technologies for smart grids, the gathering and analysis of data from multiple sources offer a significant opportunity for enhanced fault diagnosis. In order to achieve improved accuracy for both fault detection and classification, a novel combined data analytics technique is presented and demonstrated in this paper. The proposed technique is based on a segmented approach to Bayesian modelling that provides probabilistic graphical representations of both electrical power and data communication networks. In this manner, the reliability of both the data communication and electrical power networks are considered in order to improve overall power system transmission line fault diagnosis.
In order to achieve more secure and privacy-preserving, a new method of cancelable palmprint template generation scheme using noise data is proposed. Firstly, the random projection is used to reduce the dimension of the palmprint image and the reduced dimension image is normalized. Secondly, a chaotic matrix is produced and it is also normalized. Then the cancelable palmprint feature is generated by comparing the normalized chaotic matrix with reduced dimension image after normalization. Finally, in order to enhance the privacy protection, and then the noise data with independent and identically distributed is added, as the final palmprint features. In this article, the algorithm of adding noise data is analyzed theoretically. Experimental results on the Hong Kong PolyU Palmprint Database verify that random projection and noise are generated in an uncomplicated way, the computational complexity is low. The theoretical analysis of nosie data is consistent with the experimental results. According to the system requirement, on the basis of guaranteeing accuracy, adding a certain amount of noise will contribute to security and privacy protection.
On battery-free IoT devices such as passive RFID tags, it is extremely difficult, if not impossible, to run cryptographic algorithms. Hence physical-layer identification methods are proposed to validate the authenticity of passive tags. However no existing physical-layer authentication method of RFID tags that can defend against the signal replay attack. This paper presents Hu-Fu, a new direction and the first solution of physical layer authentication that is resilient to the signal replay attack, based on the fact of inductive coupling of two adjacent tags. We present the theoretical model and system workflow. Experiments based on our implementation using commodity devices show that Hu-Fu is effective for physical-layer authentication.
The factors that threaten electric power information network are analyzed. Aiming at the weakness of being unable to provide numerical value of risk, this paper presents the evaluation index system, the evaluation model and method of network security based on multilevel fuzzy comprehensive judgment. The steps and method of security evaluation by the synthesis evaluation model are provided. The results show that this method is effective to evaluate the risk of electric power information network.
Digital fingerprinting refers to as method that can assign each copy of an intellectual property (IP) a distinct fingerprint. It was introduced for the purpose of protecting legal and honest IP users. The unique fingerprint can be used to identify the IP or a chip that contains the IP. However, existing fingerprinting techniques are not practical due to expensive cost of creating fingerprints and the lack of effective methods to verify the fingerprints. In the paper, we study a practical scan chain based fingerprinting method, where the digital fingerprint is generated by selecting the Q-SD or Q'-SD connection during the design of scan chains. This method has two major advantages. First, fingerprints are created as a post-silicon procedure and therefore there will be little fabrication overhead. Second, altering the Q-SD or Q'-SD connection style requires the modification of test vectors for each fingerprinted IP in order to maintain the fault coverage. This enables us to verify the fingerprint by inspecting the test vectors without opening up the chip to check the Q-SD or Q'-SD connection styles. We perform experiment on standard benchmarks to demonstrate that our approach has low design overhead. We also conduct security analysis to show that such fingerprints are robust against various attacks.
With the development of Software Defined Networking, its software programmability and openness brings new idea for network security. Therefore, many Software Defined Security Architectures emerged at the right moment. Software Defined Security decouples security control plane and security data plane. In Software Defined Security Architectures, underlying security devices are abstracted as security resources in resource pool, intellectualized and automated security business management and orchestration can be realized through software programming in security control plane. However, network management has been becoming extremely complicated due to expansible network scale, varying network devices, lack of abstraction and heterogeneity of network especially. Therefore, new-type open security devices are needed in SDS Architecture for unified management so that they can be conveniently abstracted as security resources in resource pool. This paper firstly analyses why open security devices are needed in SDS architecture and proposes a method of opening security devices. Considering this new architecture requires a new security scheduling mechanism, this paper proposes a security resource scheduling algorithm which is used for managing and scheduling security resources in resource pool according to user s security demand. The security resource scheduling algorithm aims to allocate a security protection task to a suitable security resource in resource pool so that improving security protection efficiency. In the algorithm, we use BP neural network to predict the execution time of security tasks to improve the performance of the algorithm. The simulation result shows that the algorithm has ideal performance. Finally, a usage scenario is given to illustrate the role of security resource scheduling in software defined security architecture.
The Internet of Things (IoT) increasingly demonstrates its role in smart services, such as smart home, smart grid, smart transportation, etc. However, due to lack of standards among different vendors, existing networked IoT devices (NoTs) can hardly provide enough security. Moreover, it is impractical to apply advanced cryptographic solutions to many NoTs due to limited computing capability and power supply. Inspired by recent advances in IoT demand, in this paper, we develop an IoT security architecture that can protect NoTs in different IoT scenarios. Specifically, the security architecture consists of an auditing module and two network-level security controllers. The auditing module is designed to have a stand-alone intrusion detection system for threat detection in a NoT network cluster. The two network-level security controllers are designed to provide security services from either network resource management or cryptographic schemes regardless of the NoT security capability. We also demonstrate the proposed IoT security architecture with a network based one-hop confidentiality scheme and a cryptography-based secure link mechanism.
Technology specific expert knowledge is often required to analyse security configurations and determine potential vulnerabilities, but it becomes difficult when it is a new technology such as Fog computing. Furthermore, additional knowledge is also required regarding how the security configuration has been constructed in respect to an organisation's security policies. Traditionally, organisations will often manage their access control permissions relative to their employees needs, posing challenges to administrators. This problem is even exacerbated in Fog computing systems where security configurations are implemented on a large amount of devices at the edges of Internet, and the administrators are required to retain adequate knowledge on how to perform complex administrative tasks. In this paper, a novel approach of translating object-based security configurations in to a graph model is presented. A technique is then developed to autonomously identify vulnerabilities and perform security auditing of large systems without the need for expert knowledge. Throughout the paper, access control configuration data is used as a case study, and empirical analysis is performed on synthetically generated access control permissions.
Smart grid systems are characterized by high complexity due to interactions between a traditional passive network and active power electronic components, coupled using communication links. Additionally, automation and information technology plays an important role in order to operate and optimize such cyber-physical energy systems with a high(er) penetration of fluctuating renewable generation and controllable loads. As a result of these developments the validation on the system level becomes much more important during the whole engineering and deployment process, today. In earlier development stages and for larger system configurations laboratory-based testing is not always an option. Due to recent developments, simulation-based approaches are now an appropriate tool to support the development, implementation, and roll-out of smart grid solutions. This paper discusses the current state of simulation-based approaches and outlines the necessary future research and development directions in the domain of power and energy systems.
In a spectrally congested environment or a spectrally contested environment which often occurs in cyber security applications, multiple signals are often mixed together with significant overlap in spectrum. This makes the signal detection and parameter estimation task very challenging. In our previous work, we have demonstrated the feasibility of using a second order spectrum correlation function (SCF) cyclostationary feature to perform mixed signal detection and parameter estimation. In this paper, we present our recent work on software defined radio (SDR) based implementation and demonstration of such mixed signal detection algorithms. Specifically, we have developed a software defined radio based mixed RF signal generator to generate mixed RF signals in real time. A graphical user interface (GUI) has been developed to allow users to conveniently adjust the number of mixed RF signal components, the amplitude, initial time delay, initial phase offset, carrier frequency, symbol rate, modulation type, and pulse shaping filter of each RF signal component. This SDR based mixed RF signal generator is used to transmit desirable mixed RF signals to test the effectiveness of our developed algorithms. Next, we have developed a software defined radio based mixed RF signal detector to perform the mixed RF signal detection. Similarly, a GUI has been developed to allow users to easily adjust the center frequency and bandwidth of band of interest, perform time domain analysis, frequency domain analysis, and cyclostationary domain analysis.
Web-Based applications are becoming more increasingly technically complex and sophisticated. The very nature of their feature-rich design and their capability to collate, process, and disseminate information over the Internet or from within an intranet makes them a popular target for attack. According to Open Web Application Security Project (OWASP) Top Ten Cheat sheet-2017, SQL Injection Attack is at peak among online attacks. This can be attributed primarily to lack of awareness on software security. Developing effective SQL injection detection approaches has been a challenge in spite of extensive research in this area. In this paper, we propose a signature based SQL injection attack detection framework by integrating fingerprinting method and Pattern Matching to distinguish genuine SQL queries from malicious queries. Our framework monitors SQL queries to the database and compares them against a dataset of signatures from known SQL injection attacks. If the fingerprint method cannot determine the legitimacy of query alone, then the Aho Corasick algorithm is invoked to ascertain whether attack signatures appear in the queries. The initial experimental results of our framework indicate the approach can identify wide variety of SQL injection attacks with negligible impact on performance.
Along with the growing popularisation of Cloud Computing. Cloud storage technology has been paid more and more attention as an emerging network storage technology which is extended and developed by cloud computing concepts. Cloud computing environment depends on user services such as high-speed storage and retrieval provided by cloud computing system. Meanwhile, data security is an important problem to solve urgently for cloud storage technology. In recent years, There are more and more malicious attacks on cloud storage systems, and cloud storage system of data leaking also frequently occurred. Cloud storage security concerns the user's data security. The purpose of this paper is to achieve data security of cloud storage and to formulate corresponding cloud storage security policy. Those were combined with the results of existing academic research by analyzing the security risks of user data in cloud storage and approach a subject of the relevant security technology, which based on the structural characteristics of cloud storage system.
How to construct good linear codes is an important problem in coding theory. This paper considers the construction of linear codes from functions with two variables, presents a class of two-weight and three-weight ternary linear codes and employs the Gauss sums and exponential sums to determine the parameters and weight distribution of these codes. Linear codes with few weights have applications in consumer electronics, communication and date storage systems. Linear codes with two weights have applications in strongly regular graphs and linear codes with three weights can be applied in association schemes.
Building the Internet of Things requires deploying a huge number of objects with full or limited connectivity to the Internet. Given that these objects are exposed to attackers and generally not secured-by-design, it is essential to be able to update them, to patch their vulnerabilities and to prevent hackers from enrolling them into botnets. Ideally, the update infrastructure should implement the CIA triad properties, i.e., confidentiality, integrity and availability. In this work, we investigate how the use of a blockchain infrastructure can meet these requirements, with a focus on availability. In addition, we propose a peer-to-peer mechanism, to spread updates between objects that have limited access to the Internet. Finally, we give an overview of our ongoing prototype implementation.