Biblio
The Internet-of-Things (IoT) paradigm at large continues to be compromised, hindering the privacy, dependability, security, and safety of our nations. While the operational security communities (i.e., CERTS, SOCs, CSIRT, etc.) continue to develop capabilities for monitoring cyberspace, tools which are IoT-centric remain at its infancy. To this end, we address this gap by innovating an actionable Cyber Threat Intelligence (CTI) feed related to Internet-scale infected IoT devices. The feed analyzes, in near real-time, 3.6TB of daily streaming passive measurements ( ≈ 1M pps) by applying a custom-developed learning methodology to distinguish between compromised IoT devices and non-IoT nodes, in addition to labeling the type and vendor. The feed is augmented with third party information to provide contextual information. We report on the operation, analysis, and shortcomings of the feed executed during an initial deployment period. We make the CTI feed available for ingestion through a public, authenticated API and a front-end platform.
Authorship attribution is the problem of studying an anonymous text and finding the corresponding author in a set of candidate authors. In this paper, we propose a method based on N-grams model for the problem of authorship attribution. Several measures are used to assign an anonymous text to an author. The different variants of the proposed method are implemented and validated on PAN benchmarks. The numerical results are encouraging and demonstrate the benefit of the proposed idea.
Research in combating misinformation reports many negative results: facts may not change minds, especially if they come from sources that are not trusted. Individuals can disregard and justify lies told by trusted sources. This problem is made even worse by social recommendation algorithms which help amplify conspiracy theories and information confirming one's own biases due to companies' efforts to optimize for clicks and watch time over individuals' own values and public good. As a result, more nuanced voices and facts are drowned out by a continuous erosion of trust in better information sources. Most misinformation mitigation techniques assume that discrediting, filtering, or demoting low veracity information will help news consumers make better information decisions. However, these negative results indicate that some news consumers, particularly extreme or conspiracy news consumers will not be helped. We argue that, given this background, technology solutions to combating misinformation should not simply seek facts or discredit bad news sources, but instead use more subtle nudges towards better information consumption. Repeated exposure to such nudges can help promote trust in better information sources and also improve societal outcomes in the long run. In this article, we will talk about technological solutions that can help us in developing such an approach, and introduce one such model called Trust Nudging.
Spams are unsolicited and unnecessary messages which may contain harmful codes or links for activation of malicious viruses and spywares. Increasing popularity of social networks attracts the spammers to perform malicious activities in social networks. So an efficient spam detection method is necessary for social networks. In this paper, feed forward neural network with back propagation based spam detection model is proposed. The quality of the learning process is improved by tuning initial weights of feed forward neural network using proposed enhanced step size firefly algorithm which reduces the time for finding optimal weights during the learning process. The model is applied for twitter dataset and the experimental results show that, the proposed model performs well in terms of accuracy and detection rate and has lower false positive rate.
Modern industrial control systems (ICS) act as victims of cyber attacks more often in last years. These attacks are hard to detect and their consequences can be catastrophic. Cyber attacks can cause anomalies in the work of the ICS and its technological equipment. The presence of mutual interference and noises in this equipment significantly complicates anomaly detection. Moreover, the traditional means of protection, which used in corporate solutions, require updating with each change in the structure of the industrial process. An approach based on the machine learning for anomaly detection was used to overcome these problems. It complements traditional methods and allows one to detect signal correlations and use them for anomaly detection. Additional Tennessee Eastman Process Simulation Data for Anomaly Detection Evaluation dataset was analyzed as example of industrial process. In the course of the research, correlations between the signals of the sensors were detected and preliminary data processing was carried out. Algorithms from the most common techniques of machine learning (decision trees, linear algorithms, support vector machines) and deep learning models (neural networks) were investigated for industrial process anomaly detection task. It's shown that linear algorithms are least demanding on computational resources, but they don't achieve an acceptable result and allow a significant number of errors. Decision tree-based algorithms provided an acceptable accuracy, but the amount of RAM, required for their operations, relates polynomially with the training sample volume. The deep neural networks provided the greatest accuracy, but they require considerable computing power for internal calculations.
Over the years cybercriminals have misused the Domain Name System (DNS) - a critical component of the Internet - to gain profit. Despite this persisting trend, little empirical information about the security of Top-Level Domains (TLDs) and of the overall 'health' of the DNS ecosystem exists. In this paper, we present security metrics for this ecosystem and measure the operational values of such metrics using three representative phishing and malware datasets. We benchmark entire TLDs against the rest of the market. We explicitly distinguish these metrics from the idea of measuring security performance, because the measured values are driven by multiple factors, not just by the performance of the particular market player. We consider two types of security metrics: occurrence of abuse and persistence of abuse. In conjunction, they provide a good understanding of the overall health of a TLD. We demonstrate that attackers abuse a variety of free services with good reputation, affecting not only the reputation of those services, but of entire TLDs. We find that, when normalized by size, old TLDs like .com host more bad content than new generic TLDs. We propose a statistical regression model to analyze how the different properties of TLD intermediaries relate to abuse counts. We find that next to TLD size, abuse is positively associated with domain pricing (i.e. registries who provide free domain registrations witness more abuse). Last but not least, we observe a negative relation between the DNSSEC deployment rate and the count of phishing domains.
Over the years cybercriminals have misused the Domain Name System (DNS) - a critical component of the Internet - to gain profit. Despite this persisting trend, little empirical information about the security of Top-Level Domains (TLDs) and of the overall 'health' of the DNS ecosystem exists. In this paper, we present security metrics for this ecosystem and measure the operational values of such metrics using three representative phishing and malware datasets. We benchmark entire TLDs against the rest of the market. We explicitly distinguish these metrics from the idea of measuring security performance, because the measured values are driven by multiple factors, not just by the performance of the particular market player. We consider two types of security metrics: occurrence of abuse and persistence of abuse. In conjunction, they provide a good understanding of the overall health of a TLD. We demonstrate that attackers abuse a variety of free services with good reputation, affecting not only the reputation of those services, but of entire TLDs. We find that, when normalized by size, old TLDs like .com host more bad content than new generic TLDs. We propose a statistical regression model to analyze how the different properties of TLD intermediaries relate to abuse counts. We find that next to TLD size, abuse is positively associated with domain pricing (i.e. registries who provide free domain registrations witness more abuse). Last but not least, we observe a negative relation between the DNSSEC deployment rate and the count of phishing domains.
We address security and trust in the context of a commercial IP camera. We take a hands-on approach, as we not only define abstract vulnerabilities, but we actually implement the attacks on a real camera. We then discuss the nature of the attacks and the root cause; we propose a formal model of trust that can be used to address the vulnerabilities by explicitly constraining compositionality for trust relationships.
This paper introduces SONA (Spatiotemporal system Organized for Natural Analysis), a tabletop and tangible controller system for exploring geotagged information, and more specifically, CCTV. SONA's goal is to support a more natural method of interacting with data. Our new interactions are placed in the context of a physical security environment, closed circuit television (CCTV). We present a three-layered detail on demand set of view filters for CCTV feeds on a digital map. These filters are controlled with a novel tangible device for direct interaction. We validate SONA's tangible controller approach with a user study comparing SONA with the existing CCTV multi-screen method. The results of the study show that SONA's tangible interaction method is superior to the multi-screen approach, both in terms of quantitative results, and is preferred by users.
The dazzling emergence of cyber-threats exert today's cyberspace, which needs practical and efficient capabilities for malware traffic detection. In this paper, we propose an extension to an initial research effort, namely, towards fingerprinting malicious traffic by putting an emphasis on the attribution of maliciousness to malware families. The proposed technique in the previous work establishes a synergy between automatic dynamic analysis of malware and machine learning to fingerprint badness in network traffic. Machine learning algorithms are used with features that exploit only high-level properties of traffic packets (e.g. packet headers). Besides, the detection of malicious packets, we want to enhance fingerprinting capability with the identification of malware families responsible in the generation of malicious packets. The identification of the underlying malware family is derived from a sequence of application protocols, which is used as a signature to the family in question. Furthermore, our results show that our technique achieves promising malware family identification rate with low false positives.
The dazzling emergence of cyber-threats exert today's cyberspace, which needs practical and efficient capabilities for malware traffic detection. In this paper, we propose an extension to an initial research effort, namely, towards fingerprinting malicious traffic by putting an emphasis on the attribution of maliciousness to malware families. The proposed technique in the previous work establishes a synergy between automatic dynamic analysis of malware and machine learning to fingerprint badness in network traffic. Machine learning algorithms are used with features that exploit only high-level properties of traffic packets (e.g. packet headers). Besides, the detection of malicious packets, we want to enhance fingerprinting capability with the identification of malware families responsible in the generation of malicious packets. The identification of the underlying malware family is derived from a sequence of application protocols, which is used as a signature to the family in question. Furthermore, our results show that our technique achieves promising malware family identification rate with low false positives.
Enhanced situational awareness is integral to risk management and response evaluation. Dynamic systems that incorporate both hard and soft data sources allow for comprehensive situational frameworks which can supplement physical models with conceptual notions of risk. The processing of widely available semi-structured textual data sources can produce soft information that is readily consumable by such a framework. In this paper, we augment the situational awareness capabilities of a recently proposed risk management framework (RMF) with the incorporation of soft data. We illustrate the beneficial role of the hard-soft data fusion in the characterization and evaluation of potential vessels in distress within Maritime Domain Awareness (MDA) scenarios. Risk features pertaining to maritime vessels are defined a priori and then quantified in real time using both hard (e.g., Automatic Identification System, Douglas Sea Scale) as well as soft (e.g., historical records of worldwide maritime incidents) data sources. A risk-aware metric to quantify the effectiveness of the hard-soft fusion process is also proposed. Though illustrated with MDA scenarios, the proposed hard-soft fusion methodology within the RMF can be readily applied to other domains.
The dazzling emergence of cyber-threats exert today's cyberspace, which needs practical and efficient capabilities for malware traffic detection. In this paper, we propose an extension to an initial research effort, namely, towards fingerprinting malicious traffic by putting an emphasis on the attribution of maliciousness to malware families. The proposed technique in the previous work establishes a synergy between automatic dynamic analysis of malware and machine learning to fingerprint badness in network traffic. Machine learning algorithms are used with features that exploit only high-level properties of traffic packets (e.g. packet headers). Besides, the detection of malicious packets, we want to enhance fingerprinting capability with the identification of malware families responsible in the generation of malicious packets. The identification of the underlying malware family is derived from a sequence of application protocols, which is used as a signature to the family in question. Furthermore, our results show that our technique achieves promising malware family identification rate with low false positives.