Biblio
An acoustic fingerprint is a condensed and powerful digital signature of an audio signal which is used for audio sample identification. A fingerprint is the pattern of a voice or audio sample. A large number of algorithms have been developed for generating such acoustic fingerprints. These algorithms facilitate systems that perform song searching, song identification, and song duplication detection. In this study, a comprehensive and powerful survey of already developed algorithms is conducted. Four major music fingerprinting algorithms are evaluated for identifying and analyzing the potential hurdles that can affect their results. Since the background and environmental noise reduces the efficiency of music fingerprinting algorithms, behavioral analysis of fingerprinting algorithms is performed using audio samples of different languages and under different environmental conditions. The results of music fingerprint classification are more successful when deep learning techniques for classification are used. The testing of the acoustic feature modeling and music fingerprinting algorithms is performed using the standard dataset of iKala, MusicBrainz and MIR-1K.
In this paper we present techniques based on machine learning techniques on monitoring data for analysis of cybersecurity threats in cloud environments that incorporate enterprise applications from the fields of telecommunications and IoT. Cybersecurity is a term describing techniques for protecting computers, telecommunications equipment, applications, environments and data. In modern networks enormous volume of generated traffic can be observed. We propose several techniques such as Support Vector Machines, Neural networks and Deep Neural Networks in combination for analysis of monitoring data. An approach for combining classifier results based on performance weights is proposed. The proposed approach delivers promising results comparable to existing algorithms and is suitable for enterprise grade security applications.
In recent years, there is a surge of interest in approaches pertaining to security issues of Internet of Things deployments and applications that leverage machine learning and deep learning techniques. A key prerequisite for enabling such approaches is the development of scalable infrastructures for collecting and processing security-related datasets from IoT systems and devices. This paper introduces such a scalable and configurable data collection infrastructure for data-driven IoT security. It emphasizes the collection of (security) data from different elements of IoT systems, including individual devices and smart objects, edge nodes, IoT platforms, and entire clouds. The scalability of the introduced infrastructure stems from the integration of state of the art technologies for large scale data collection, streaming and storage, while its configurability relies on an extensible approach to modelling security data from a variety of IoT systems and devices. The approach enables the instantiation and deployment of security data collection systems over complex IoT deployments, which is a foundation for applying effective security analytics algorithms towards identifying threats, vulnerabilities and related attack patterns.
In recent years, deep convolution neural networks (DCNNs) have won many contests in machine learning, object detection, and pattern recognition. Furthermore, deep learning techniques achieved exceptional performance in image classification, reaching accuracy levels beyond human capability. Malware variants from similar categories often contain similarities due to code reuse. Converting malware samples into images can cause these patterns to manifest as image features, which can be exploited for DCNN classification. Techniques for converting malware binaries into images for visualization and classification have been reported in the literature, and while these methods do reach a high level of classification accuracy on training datasets, they tend to be vulnerable to overfitting and perform poorly on previously unseen samples. In this paper, we explore and document a variety of techniques for representing malware binaries as images with the goal of discovering a format best suited for deep learning. We implement a database for malware binaries from several families, stored in hexadecimal format. These malware samples are converted into images using various approaches and are used to train a neural network to recognize visual patterns in the input and classify malware based on the feature vectors. Each image type is assessed using a variety of learning models, such as transfer learning with existing DCNN architectures and feature extraction for support vector machine classifier training. Each technique is evaluated in terms of classification accuracy, result consistency, and time per trial. Our preliminary results indicate that improved image representation has the potential to enable more effective classification of new malware.
Deep learning techniques have demonstrated the ability to perform a variety of object recognition tasks using visible imager data; however, deep learning has not been implemented as a means to autonomously detect and assess targets of interest in a physical security system. We demonstrate the use of transfer learning on a convolutional neural network (CNN) to significantly reduce training time while keeping detection accuracy of physical security relevant targets high. Unlike many detection algorithms employed by video analytics within physical security systems, this method does not rely on temporal data to construct a background scene; targets of interest can halt motion indefinitely and still be detected by the implemented CNN. A key advantage of using deep learning is the ability for a network to improve over time. Periodic retraining can lead to better detection and higher confidence rates. We investigate training data size versus CNN test accuracy using physical security video data. Due to the large number of visible imagers, significant volume of data collected daily, and currently deployed human in the loop ground truth data, physical security systems present a unique environment that is well suited for analysis via CNNs. This could lead to the creation of algorithmic element that reduces human burden and decreases human analyzed nuisance alarms.