Biblio
The NEREIDA wave generation power plant installed in Mutriku, Spain is a multiple Oscillating Water Column (OWC) plant. The power takeoff consists of a Wells turbine coupled to a Doubly Fed Induction Generator (DFIG). The stalling behavior present in the Wells turbine limits the generated power. This paper presents the modeling and a Harmony Search Algorithm-based airflow control of the OWC. The Harmony Search Algorithm (HSA) is proposed to help overcome the limitations of a traditionally tuned PID. An investigation between HSA-tuned controller and the traditionally tuned controller has been performed. Results of the controlled and uncontrolled plant prove the effectiveness of the airflow control and the superiority of the HSA-tuned controller.
Reverse engineering is a manually intensive but necessary technique for understanding the inner workings of new malware, finding vulnerabilities in existing systems, and detecting patent infringements in released software. An assembly clone search engine facilitates the work of reverse engineers by identifying those duplicated or known parts. However, it is challenging to design a robust clone search engine, since there exist various compiler optimization options and code obfuscation techniques that make logically similar assembly functions appear to be very different. A practical clone search engine relies on a robust vector representation of assembly code. However, the existing clone search approaches, which rely on a manual feature engineering process to form a feature vector for an assembly function, fail to consider the relationships between features and identify those unique patterns that can statistically distinguish assembly functions. To address this problem, we propose to jointly learn the lexical semantic relationships and the vector representation of assembly functions based on assembly code. We have developed an assembly code representation learning model \textbackslashemphAsm2Vec. It only needs assembly code as input and does not require any prior knowledge such as the correct mapping between assembly functions. It can find and incorporate rich semantic relationships among tokens appearing in assembly code. We conduct extensive experiments and benchmark the learning model with state-of-the-art static and dynamic clone search approaches. We show that the learned representation is more robust and significantly outperforms existing methods against changes introduced by obfuscation and optimizations.
Aiming at the problem that one-dimensional parameter optimization in insider threat detection using deep learning will lead to unsatisfactory overall performance of the model, an insider threat detection method based on adaptive optimization DBN by grid search is designed. This method adaptively optimizes the learning rate and the network structure which form the two-dimensional grid, and adaptively selects a set of optimization parameters for threat detection, which optimizes the overall performance of the deep learning model. The experimental results show that the method has good adaptability. The learning rate of the deep belief net is optimized to 0.6, the network structure is optimized to 6 layers, and the threat detection rate is increased to 98.794%. The training efficiency and the threat detection rate of the deep belief net are improved.
The method of choice the control parameters of a complex system based on estimates of the risks is proposed. The procedure of calculating the estimates of risks intended for a choice of rational managing directors of influences by an allocation of the group of the operating factors for the set criteria factor is considered. The purpose of choice of control parameters of the complex system is the minimization of an estimate of the risk of the functioning of the system by mean of a solution of a problem of search of an extremum of the function of many variables. The example of a choice of the operating factors in the sphere of intangible assets is given.
Cloud Storage Service(CSS) provides unbounded, robust file storage capability and facilitates for pay-per-use and collaborative work to end users. But due to security issues like lack of confidentiality, malicious insiders, it has not gained wide spread acceptance to store sensitive information. Researchers have proposed proxy re-encryption schemes for secure data sharing through cloud. Due to advancement of computing technologies and advent of quantum computing algorithms, security of existing schemes can be compromised within seconds. Hence there is a need for designing security schemes which can be quantum computing resistant. In this paper, a secure file sharing scheme through cloud storage using proxy re-encryption technique has been proposed. The proposed scheme is proven to be chosen ciphertext secure(CCA) under hardness of ring-LWE, Search problem using random oracle model. The proposed scheme outperforms the existing CCA secure schemes in-terms of re-encryption time and decryption time for encrypted files which results in an efficient file sharing scheme through cloud storage.
The recently developed deep belief network (DBN) has been shown to be an effective methodology for solving time series forecasting problems. However, the performance of DBN is seriously depended on the reasonable setting of hyperparameters. At present, random search, grid search and Bayesian optimization are the most common methods of hyperparameters optimization. As an alternative, a state-of-the-art derivative-free optimizer-negative correlation search (NCS) is adopted in this paper to decide the sizes of DBN and learning rates during the training processes. A comparative analysis is performed between the proposed method and other popular techniques in the time series forecasting experiment based on two types of time series datasets. Experiment results statistically affirm the efficiency of the proposed model to obtain better prediction results compared with conventional neural network models.
The problem of optimal attack path analysis is one of the hotspots in network security. Many methods are available to calculate an optimal attack path, such as Q-learning algorithm, heuristic algorithms, etc. But most of them have shortcomings. Some methods can lead to the problem of path loss, and some methods render the result un-comprehensive. This article proposes an improved Monte Carlo Graph Search algorithm (IMCGS) to calculate optimal attack paths in target network. IMCGS can avoid the problem of path loss and get comprehensive results quickly. IMCGS is divided into two steps: selection and backpropagation, which is used to calculate optimal attack paths. A weight vector containing priority, host connection number, CVSS value is proposed for every host in an attack path. This vector is used to calculate the evaluation value, the total CVSS value and the average CVSS value of a path in the target network. Result for a sample test network is presented to demonstrate the capabilities of the proposed algorithm to generate optimal attack paths in one single run. The results obtained by IMCGS show good performance and are compared with Ant Colony Optimization Algorithm (ACO) and k-zero attack graph.
This paper describes the work done to design a SoC platform for real-time on-line pattern search in TCP packets for Deep Packet Inspection (DPI) applications. The platform is based on a Xilinx Zynq programmable SoC and includes an accelerator that implements a pattern search engine that extends the original Boyer-Moore algorithm with timing and logical rules, that produces a very complex set of rules. Also, the platform implements different modes of operation, including SIMD and MISD parallelism, which can be configured on-line. The platform is scalable depending of the analysis requirement up to 8 Gbps. High-Level synthesis and platform based design methodologies have been used to reduce the time to market of the completed system.
Accurate short-term traffic flow forecasting is of great significance for real-time traffic control, guidance and management. The k-nearest neighbor (k-NN) model is a classic data-driven method which is relatively effective yet simple to implement for short-term traffic flow forecasting. For conventional prediction mechanism of k-NN model, the k nearest neighbors' outputs weighted by similarities between the current traffic flow vector and historical traffic flow vectors is directly used to generate prediction values, so that the prediction results are always not ideal. It is observed that there are always some outliers in k nearest neighbors' outputs, which may have a bad influences on the prediction value, and the local similarities between current traffic flow and historical traffic flows at the current sampling period should have a greater relevant to the prediction value. In this paper, we focus on improving the prediction mechanism of k-NN model and proposed a k-nearest neighbor locally search regression algorithm (k-LSR). The k-LSR algorithm can use locally search strategy to search for optimal nearest neighbors' outputs and use optimal nearest neighbors' outputs weighted by local similarities to forecast short-term traffic flow so as to improve the prediction mechanism of k-NN model. The proposed algorithm is tested on the actual data and compared with other algorithms in performance. We use the root mean squared error (RMSE) as the evaluation indicator. The comparison results show that the k-LSR algorithm is more successful than the k-NN and k-nearest neighbor locally weighted regression algorithm (k-LWR) in forecasting short-term traffic flow, and which prove the superiority and good practicability of the proposed algorithm.
The rapid development of Internet has resulted in massive information overloading recently. These information is usually represented by high-dimensional feature vectors in many related applications such as recognition, classification and retrieval. These applications usually need efficient indexing and search methods for such large-scale and high-dimensional database, which typically is a challenging task. Some efforts have been made and solved this problem to some extent. However, most of them are implemented in a single machine, which is not suitable to handle large-scale database.In this paper, we present a novel data index structure and nearest neighbor search algorithm implemented on Apache Spark. We impose a grid on the database and index data by non-empty grid cells. This grid-based index structure is simple and easy to be implemented in parallel. Moreover, we propose to build a scalable KNN graph on the grids, which increase the efficiency of this index structure by a low cost in parallel implementation. Finally, experiments are conducted in both public databases and synthetic databases, showing that the proposed methods achieve overall high performance in both efficiency and accuracy.
Binary embedding is an effective way for nearest neighbor (NN) search as binary code is storage efficient and fast to compute. It tries to convert real-value signatures into binary codes while preserving similarity of the original data. However, it greatly decreases the discriminability of original signatures due to the huge loss of information. In this paper, we propose a novel method double-bit quantization and weighting (DBQW) to solve the problem by mapping each dimension to double-bit binary code and assigning different weights according to their spatial relationship. The proposed method is applicable to a wide variety of embedding techniques, such as SH, PCA-ITQ and PCA-RR. Experimental comparisons on two datasets show that DBQW for NN search can achieve remarkable improvements in query accuracy compared to original binary embedding methods.
The growing volume of data and its increasing complexity require even more efficient and faster information retrieval techniques. Approximate nearest neighbor search algorithms based on hashing were proposed to query high-dimensional datasets due to its high retrieval speed and low storage cost. Recent studies promote the use of Convolutional Neural Network (CNN) with hashing techniques to improve the search accuracy. However, there are challenges to solve in order to find a practical and efficient solution to index CNN features, such as the need for a heavy training process to achieve accurate query results and the critical dependency on data-parameters. In this work we execute exhaustive experiments in order to compare recent methods that are able to produces a better representation of the data space with a less computational cost for a better accuracy by computing the best data-parameter values for optimal sub-space projection exploring the correlations among CNN feature attributes using fractal theory. We give an overview of these different techniques and present our comparative experiments for data representation and retrieval performance.
Testing and fixing Web Application Firewalls (WAFs) are two relevant and complementary challenges for security analysts. Automated testing helps to cost-effectively detect vulnerabilities in a WAF by generating effective test cases, i.e., attacks. Once vulnerabilities have been identified, the WAF needs to be fixed by augmenting its rule set to filter attacks without blocking legitimate requests. However, existing research suggests that rule sets are very difficult to understand and too complex to be manually fixed. In this paper, we formalise the problem of fixing vulnerable WAFs as a combinatorial optimisation problem. To solve it, we propose an automated approach that combines machine learning with multi-objective genetic algorithms. Given a set of legitimate requests and bypassing SQL injection attacks, our approach automatically infers regular expressions that, when added to the WAF's rule set, prevent many attacks while letting legitimate requests go through. Our empirical evaluation based on both open-source and proprietary WAFs shows that the generated filter rules are effective at blocking previously identified and successful SQL injection attacks (recall between 54.6% and 98.3%), while triggering in most cases no or few false positives (false positive rate between 0% and 2%).
Swarm Intelligence (SI) algorithms, such as Fish School Search (FSS), are well known as useful tools that can be used to achieve a good solution in a reasonable amount of time for complex optimization problems. And when problems increase in size and complexity, some increase in population size or number of iterations might be needed in order to achieve a good solution. In extreme cases, the execution time can be huge and other approaches, such as parallel implementations, might help to reduce it. This paper investigates the relation and trade off involving these three aspects in SI algorithms, namely population size, number of iterations, and problem complexity. The results with a parallel implementations of FSS show that increasing the population size is beneficial for finding good solutions. However, we observed an asymptotic behavior of the results, i.e. increasing the population over a certain threshold only leads to slight improvements.