Biblio
Nowadays, an increasing number of IoT vendors have complied and deployed third-party code bases across different architectures. Therefore, to avoid the firmware from being affected by the same known vulnerabilities, searching known vulnerabilities in binary firmware across different architectures is more crucial than ever. However, most of existing vulnerability search methods are limited to the same architecture, there are only a few researches on cross-architecture cases, of which the accuracy is not high. In this paper, to promote the accuracy of existing cross-architecture vulnerability search methods, we propose a new approach based on Support Vector Machine (SVM) and Attributed Control Flow Graph (ACFG) to search known vulnerability in firmware across different architectures at function level. We employ a known vulnerability function to recognize suspicious functions in other binary firmware. First, considering from the internal and external characteristics of the functions, we extract the function level features and basic-block level features of the functions to be inspected. Second, we employ SVM to recognize a little part of suspicious functions based on function level features. After the preliminary screening, we compute the graph similarity between the vulnerability function and suspicious functions based on their ACFGs. We have implemented our approach CVSSA, and employed the training samples to train the model with previous knowledge to improve the accuracy. We also search several vulnerabilities in the real-world firmware images, the experimental results show that CVSSA can be applied to the realistic scenarios.
In this paper, we propose a new regularization scheme for the well-known Support Vector Machine (SVM) classifier that operates on the training sample level. The proposed approach is motivated by the fact that Maximum Margin-based classification defines decision functions as a linear combination of the selected training data and, thus, the variations on training sample selection directly affect generalization performance. We show that the exploitation of the proposed regularization scheme is well motivated and intuitive. Experimental results show that the proposed regularization scheme outperforms standard SVM in human action recognition tasks as well as classical recognition problems.
Code smells may be introduced in software due to market rivalry, work pressure deadline, improper functioning, skills or inexperience of software developers. Code smells indicate problems in design or code which makes software hard to change and maintain. Detecting code smells could reduce the effort of developers, resources and cost of the software. Many researchers have proposed different techniques like DETEX for detecting code smells which have limited precision and recall. To overcome these limitations, a new technique named as SVMCSD has been proposed for the detection of code smells, based on support vector machine learning technique. Four code smells are specified namely God Class, Feature Envy, Data Class and Long Method and the proposed technique is validated on two open source systems namely ArgoUML and Xerces. The accuracy of SVMCSD is found to be better than DETEX in terms of two metrics, precision and recall, when applied on a subset of a system. While considering the entire system, SVMCSD detect more occurrences of code smells than DETEX.
Deep learning model has been widely studied and proven to achieve high accuracy in various pattern recognition tasks, especially in image recognition. However, due to its non-linear architecture and high-dimensional inputs, its ill-posedness [1] towards adversarial perturbations-small deliberately crafted perturbations on the input will lead to completely different outputs, has also attracted researchers' attention. This work takes the traffic sign recognition system on the self-driving car as an example, and aims at designing an additional mechanism to improve the robustness of the recognition system. It uses a machine learning model which learns the results of the deep learning model's predictions, with human feedback as labels and provides the credibility of current prediction. The mechanism makes use of both the input image and the recognition result as sample space, querying a human user the True/False of current classification result the least number of times, and completing the task of detecting adversarial attacks.
Cyber-security threats are a growing concern in networked environments. The development of Intrusion Detection Systems (IDSs) is fundamental in order to provide extra level of security. We have developed an unsupervised anomaly-based IDS that uses statistical techniques to conduct the detection process. Despite providing many advantages, anomaly-based IDSs tend to generate a high number of false alarms. Machine Learning (ML) techniques have gained wide interest in tasks of intrusion detection. In this work, Support Vector Machine (SVM) is deemed as an ML technique that could complement the performance of our IDS, providing a second line of detection to reduce the number of false alarms, or as an alternative detection technique. We assess the performance of our IDS against one-class and two-class SVMs, using linear and non- linear forms. The results that we present show that linear two-class SVM generates highly accurate results, and the accuracy of the linear one-class SVM is very comparable, and it does not need training datasets associated with malicious data. Similarly, the results evidence that our IDS could benefit from the use of ML techniques to increase its accuracy when analysing datasets comprising of non- homogeneous features.
Explosive naval mines pose a threat to ocean and sea faring vessels, both military and civilian. This work applies deep neural network (DNN) methods to the problem of detecting minelike objects (MLO) on the seafloor in side-scan sonar imagery. We explored how the DNN depth, memory requirements, calculation requirements, and training data distribution affect detection efficacy. A visualization technique (class activation map) was incorporated that aids a user in interpreting the model's behavior. We found that modest DNN model sizes yielded better accuracy (98%) than very simple DNN models (93%) and a support vector machine (78%). The largest DNN models achieved textless;1% efficacy increase at a cost of a 17x increase of trainable parameter count and computation requirements. In contrast to DNNs popularized for many-class image recognition tasks, the models for this task require far fewer computational resources (0.3% of parameters), and are suitable for embedded use within an autonomous unmanned underwater vehicle.
Vehicular ad hoc networks (VANETs) are taking more attention from both the academia and the automotive industry due to a rapid development of wireless communication technologies. And with this development, vehicles called connected cars are increasingly being equipped with more sensors, processors, storages, and communication devices as they start to provide both infotainment and safety services through V2X communication. Such increase of vehicles is also related to the rise of security attacks and potential security threats. In a vehicular environment, security is one of the most important issues and it must be addressed before VANETs can be widely deployed. Conventional VANETs have some unique characteristics such as high mobility, dynamic topology, and a short connection time. Since an attacker can launch any unexpected attacks, it is difficult to predict these attacks in advance. To handle this problem, we propose collaborative security attack detection mechanism in a software-defined vehicular networks that uses multi-class support vector machine (SVM) to detect various types of attacks dynamically. We compare our security mechanism to existing distributed approach and present simulation results. The results demonstrate that the proposed security mechanism can effectively identify the types of attacks and achieve a good performance regarding high precision, recall, and accuracy.
Malware damages computers and the threat is a serious problem. Malware can be detected by pattern matching method or dynamic heuristic method. However, it is difficult to detect all new malware subspecies perfectly by existing methods. In this paper, we propose a new method which automatically detects new malware subspecies by static analysis of execution files and machine learning. The method can distinguish malware from benignware and it can also classify malware subspecies into malware families. We combine static analysis of execution files with machine learning classifier and natural language processing by machine learning. Information of DLL Import, assembly code and hexdump are acquired by static analysis of execution files of malware and benignware to create feature vectors. Paragraph vectors of information by static analysis of execution files are created by machine learning of PV-DBOW model for natural language processing. Support vector machine and classifier of k-nearest neighbor algorithm are used in our method, and the classifier learns paragraph vectors of information by static analysis. Unknown execution files are classified into malware or benignware by pre-learned SVM. Moreover, malware subspecies are also classified into malware families by pre-learned k-nearest. We evaluate the accuracy of the classification by experiments. We think that new malware subspecies can be effectively detected by our method without existing methods for malware analysis such as generic method and dynamic heuristic method.
Software components, which are vulnerable to being exploited, need to be identified and patched. Employing any prevention techniques designed for the purpose of detecting vulnerable software components in early stages can reduce the expenses associated with the software testing process significantly and thus help building a more reliable and robust software system. Although previous studies have demonstrated the effectiveness of adapting prediction techniques in vulnerability detection, the feasibility of those techniques is limited mainly because of insufficient training data sets. This paper proposes a prediction technique targeting at early identification of potentially vulnerable software components. In the proposed scheme, the potentially vulnerable components are viewed as mislabeled data that may contain true but not yet observed vulnerabilities. The proposed hybrid technique combines the supports vector machine algorithm and ensemble learning strategy to better identify potential vulnerable components. The proposed vulnerability detection scheme is evaluated using some Java Android applications. The results demonstrated that the proposed hybrid technique could identify potentially vulnerable classes with high precision and relatively acceptable accuracy and recall.
Imposters gain unauthorized access to biometric recognition systems using fake biometric data of the legitimate user termed as spoofing. Spoofing of face recognition systems is done by photographs, 3D models and videos of the user. Attack video contains noise from the acquisition process. In this work, we use noise residual content of the video in order to detect spoofed videos. We take advantage of wavelet transform for representing the noise video. Samples of the noise video, termed as visual rhythm image is created for each video. Local Binary Pattern (LBP) and uniform Local Binary Pattern (LBPu2) are extracted from the visual rhythm image followed by classification using Support Vector Machine (SVM). Large size of video from which a number of frames are used for analysis results in huge execution timing. In this work the spoof detection algorithm is applied on various levels of subsections of the video frames resulting in reduced execution timing with reasonable detection accuracies.
In software quality estimation research, software defect prediction is a key topic. A defect prediction model is generally constructed using a variety of software attributes and each attribute may have positive, negative or neutral effect on a specific model. Selection of an optimal set of attributes for model development remains a vital yet unexplored issue. In this paper, we have introduced a new feature space transformation process with a normalization technique to improve the defect prediction accuracy. We proposed a feature space transformation technique and classify the instances using Support Vector Machine (SVM) with its histogram intersection kernel. The proposed method is evaluated using the data sets from NASA metric data repository and its application demonstrates acceptable accuracy.
The Domain Name System (DNS) is a critically fundamental element in the internet technology as it translates domain names into corresponding IP addresses. The DNS queries and responses are UDP (User Datagram Protocol) based. DNS name servers are constantly facing threats of DNS amplification attacks. DNS amplification attack is one of the major Distributed Denial of Service (DDoS) attacks, in DNS. The DNS amplification attack victimized huge business and financial companies and organizations by giving disturbance to the customers. In this paper, a mechanism is proposed to detect such attacks coming from the compromised machines. We analysed DNS traffic packet comparatively based on the Machine Learning Classification algorithms such as Decision Tree (TREE), Multi Layer Perceptron (MLP), Naïve Bayes (NB) and Support Vector Machine (SVM) to classify the DNS traffics into normal and abnormal. In this approach attribute selection algorithms such as Information Gain, Gain Ratio and Chi Square are used to achieve optimal feature subset. In the experimental result it shows that the Decision Tree achieved 99.3% accuracy. This model gives highest accuracy and performance as compared to other Machine Learning algorithms.
Nowadays, sentiment analysis methods become more and more popular especially with the proliferation of social media platform users number. In the same context, this paper presents a sentiment analysis approach which can faithfully translate the sentimental orientation of Arabic Twitter posts, based on a novel data representation and machine learning techniques. The proposed approach applied a wide range of features: lexical, surface-form, syntactic, etc. We also made use of lexicon features inferred from two Arabic sentiment words lexicons. To build our supervised sentiment analysis system, we use several standard classification methods (Support Vector Machines, K-Nearest Neighbour, Naïve Bayes, Decision Trees, Random Forest) known by their effectiveness over such classification issues. In our study, Support Vector Machines classifier outperforms other supervised algorithms in Arabic Twitter sentiment analysis. Via an ablation experiments, we show the positive impact of lexicon based features on providing higher prediction performance.
Most of the supervised classification algorithms are proposed to classify newly seen instances based on their learned label space. However, in the case of data streams, concept-evolution is inevitable. In this paper we propose a support vector based approach for classification beyond the learned label space in data streams with regard to other challenges in data streams like concept-drift and infinite-length. We maintain the boundaries of observed classes through the stream by utilizing a support vector based method (SVDD). Newly arrived instances located outside these boundaries will be analyzed by constructing neighborhood graph to detect the emergence of a class beyond the learned label space (novel class). Our method is more accurate to model intricate-shape class boundaries than existing method since it utilizes support vector data description method. Dynamically maintaining boundaries by shrinking, enlarging and merging spheres in the kernel space, helps our method to adapt both dramatic and gradual changes of underlying distribution of data, and also be more memory efficient than the existing methods. Conducted experiments on both real and synthetic benchmark data sets show the superiority of the proposed method over the state-of-the-art methods in this area.
Software defects will lead to software running error and system crashes. In order to detect software defect as early as possible at early stage of software development, a series of machine learning approaches have been studied and applied to predict defects in software modules. Unfortunately, the imbalanceof software defect datasets brings great challenge to software defect prediction model training. In this paper, a new manifold learning based subspace learning algorithm, Discriminative Locality Alignment(DLA), is introduced into software defects prediction. Experimental results demonstrate that DLA is consistently superior to LDA (Linear Discriminant Analysis) and PCA (Principal Component Analysis) in terms of discriminate information extraction and prediction performance. In addition, DLA reveals some attractive intrinsic properties for numeric calculation, e.g. it can overcome the matrix singular problem and small sample size problem in software defect prediction.
This paper proposes a novel deep two-view approach to learn features from both visible and thermal images and leverage the commonality among visible and thermal images for facial expression recognition from visible images. The thermal images are used as privileged information, which is required only during training to help visible images learn better features and classifier. Specifically, we first learn a deep model for visible images and thermal images respectively, and use the learned feature representations to train SVM classifiers for expression classification. We then jointly refine the deep models as well as the SVM classifiers for both thermal images and visible images by imposing the constraint that the outputs of the SVM classifiers from two views are similar. Therefore, the resulting representations and classifiers capture the inherent connections among visible facial image, infrared facial image and target expression labels, and hence improve the recognition performance for facial expression recognition from visible images during testing. Experimental results on the benchmark expression database demonstrate the effectiveness of our proposed method.
Explicit non-linear transformations of existing steganalysis features are shown to boost their ability to detect steganography in combination with existing simple classifiers, such as the FLD-ensemble. The non-linear transformations are learned from a small number of cover features using Nyström approximation on pilot vectors obtained with kernelized PCA. The best performance is achieved with the exponential form of the Hellinger kernel, which improves the detection accuracy by up to 2-3% for spatial-domain contentadaptive steganography. Since the non-linear map depends only on the cover source and its learning has a low computational complexity, the proposed approach is a practical and low cost method for boosting the accuracy of existing detectors built as binary classifiers. The map can also be used to significantly reduce the feature dimensionality (by up to factor of ten) without performance loss with respect to the non-transformed features.
Digital information security is the field of information technology which deal with all about identification and protection of information. Whereas, identification of the threat of any Intrusion Detection System (IDS) in the most challenging phase. Threat detection become most promising because rest of the IDS system phase depends on the solely on "what is identified". In this view, a multilayered framework has been discussed which handles the underlying features for the identification of various attack (DoS, R2L, U2R, Probe). The experiments validates the use SVM with genetic approach is efficient.
The aim of this research is to advance the user active authentication using keystroke dynamics. Through this research, we assess the performance and influence of various keystroke features on keystroke dynamics authentication systems. In particular, we investigate the performance of keystroke features on a subset of most frequently used English words. The performance of four features such as i) key duration, ii) flight time latency, iii) diagraph time latency, and iv) word total time duration are analyzed. Two machine learning techniques are employed for assessing keystroke authentications. The selected classification methods are support vector machine (SVM), and k-nearest neighbor classifier (K-NN). The logged experimental data are captured for 28 users. The experimental results show that key duration time offers the best performance result among all four keystroke features, followed by word total time.
This paper presents a framework to identify the authors of Thai online messages. The identification is based on 53 writing attributes and the selected algorithms are support vector machine (SVM) and C4.5 decision tree. Experimental results indicate that the overall accuracies achieved by the SVM and the C4.5 were 79% and 75%, respectively. This difference was not statistically significant (at 95% confidence interval). As for the performance of identifying individual authors, in some cases the SVM was clearly better than the C4.5. But there were also other cases where both of them could not distinguish one author from another.
- « first
- ‹ previous
- 1
- 2
- 3
- 4