Biblio
With the rise in the underground Internet economy, automated malicious programs popularly known as malwares have become a major threat to computers and information systems connected to the internet. Properties such as self healing, self hiding and ability to deceive the security devices make these software hard to detect and mitigate. Therefore, the detection and the mitigation of such malicious software is a major challenge for researchers and security personals. The conventional systems for the detection and mitigation of such threats are mostly signature based systems. Major drawback of such systems are their inability to detect malware samples for which there is no signature available in their signature database. Such malwares are known as zero day malware. Moreover, more and more malware writers uses obfuscation technology such as polymorphic and metamorphic, packing, encryption, to avoid being detected by antivirus. Therefore, the traditional signature based detection system is neither effective nor efficient for the detection of zero-day malware. Hence to improve the effectiveness and efficiency of malware detection system we are using classification method based on structural information and behavioral specifications. In this paper we have used both static and dynamic analysis approaches. In static analysis we are extracting the features of an executable file followed by classification. In dynamic analysis we are taking the traces of executable files using NtTrace within controlled atmosphere. Experimental results obtained from our algorithm indicate that our proposed algorithm is effective in extracting malicious behavior of executables. Further it can also be used to detect malware variants.
Cyber Threat Intelligence (CTI) sharing facilitates a comprehensive understanding of adversary activity and enables enterprise networks to prioritize their cyber defense technologies. To that end, we introduce HogMap, a novel software-defined infrastructure that simplifies and incentivizes collaborative measurement and monitoring of cyber-threat activity. HogMap proposes to transform the cyber-threat monitoring landscape by integrating several novel SDN-enabled capabilities: (i) intelligent in-place filtering of malicious traffic, (ii) dynamic migration of interesting and extraordinary traffic and (iii) a software-defined marketplace where various parties can opportunistically subscribe to and publish cyber-threat intelligence services in a flexible manner. We present the architectural vision and summarize our preliminary experience in developing and operating an SDN-based HoneyGrid, which spans three enterprises and implements several of the enabling capabilities (e.g., traffic filtering, traffic forwarding and connection migration). We find that SDN technologies greatly simplify the design and deployment of such globally distributed and elastic HoneyGrids.
SMS (Short Messaging Service) is a text messaging service for mobile users to exchange short text messages. It is also widely used to provide SMS-powered services (e.g., mobile banking). With the rapid deployment of all-IP 4G mobile networks, the underlying technology of SMS evolves from the legacy circuit-switched network to the IMS (IP Multimedia Subsystem) system over packet-switched network. In this work, we study the insecurity of the IMS-based SMS. We uncover its security vulnerabilities and exploit them to devise four SMS attacks: silent SMS abuse, SMS spoofing, SMS client DoS, and SMS spamming. We further discover that those SMS threats can propagate towards SMS-powered services, thereby leading to three malicious attacks: social network account hijacking, unauthorized donation, and unauthorized subscription. Our analysis reveals that the problems stem from the loose security regulations among mobile phones, carrier networks, and SMS-powered services. We finally propose remedies to the identified security issues.
To adapt to the rapidly evolving landscape of cyber threats, security professionals are actively exchanging Indicators of Compromise (IOC) (e.g., malware signatures, botnet IPs) through public sources (e.g. blogs, forums, tweets, etc.). Such information, often presented in articles, posts, white papers etc., can be converted into a machine-readable OpenIOC format for automatic analysis and quick deployment to various security mechanisms like an intrusion detection system. With hundreds of thousands of sources in the wild, the IOC data are produced at a high volume and velocity today, which becomes increasingly hard to manage by humans. Efforts to automatically gather such information from unstructured text, however, is impeded by the limitations of today's Natural Language Processing (NLP) techniques, which cannot meet the high standard (in terms of accuracy and coverage) expected from the IOCs that could serve as direct input to a defense system. In this paper, we present iACE, an innovation solution for fully automated IOC extraction. Our approach is based upon the observation that the IOCs in technical articles are often described in a predictable way: being connected to a set of context terms (e.g., "download") through stable grammatical relations. Leveraging this observation, iACE is designed to automatically locate a putative IOC token (e.g., a zip file) and its context (e.g., "malware", "download") within the sentences in a technical article, and further analyze their relations through a novel application of graph mining techniques. Once the grammatical connection between the tokens is found to be in line with the way that the IOC is commonly presented, these tokens are extracted to generate an OpenIOC item that describes not only the indicator (e.g., a malicious zip file) but also its context (e.g., download from an external source). Running on 71,000 articles collected from 45 leading technical blogs, this new approach demonstrates a remarkable performance: it generated 900K OpenIOC items with a precision of 95% and a coverage over 90%, which is way beyond what the state-of-the-art NLP technique and industry IOC tool can achieve, at a speed of thousands of articles per hour. Further, by correlating the IOCs mined from the articles published over a 13-year span, our study sheds new light on the links across hundreds of seemingly unrelated attack instances, particularly their shared infrastructure resources, as well as the impacts of such open-source threat intelligence on security protection and evolution of attack strategies.
The popularity of cloud hosting services also brings in new security challenges: it has been reported that these services are increasingly utilized by miscreants for their malicious online activities. Mitigating this emerging threat, posed by such "bad repositories" (simply Bar), is challenging due to the different hosting strategy to traditional hosting service, the lack of direct observations of the repositories by those outside the cloud, the reluctance of the cloud provider to scan its customers' repositories without their consent, and the unique evasion strategies employed by the adversary. In this paper, we took the first step toward understanding and detecting this emerging threat. Using a small set of "seeds" (i.e., confirmed Bars), we identified a set of collective features from the websites they serve (e.g., attempts to hide Bars), which uniquely characterize the Bars. These features were utilized to build a scanner that detected over 600 Bars on leading cloud platforms like Amazon, Google, and 150K sites, including popular ones like groupon.com, using them. Highlights of our study include the pivotal roles played by these repositories on malicious infrastructures and other important discoveries include how the adversary exploited legitimate cloud repositories and why the adversary uses Bars in the first place that has never been reported. These findings bring such malicious services to the spotlight and contribute to a better understanding and ultimately eliminating this new threat.
The insider threat has been subject of extensive study and many approaches from technical perspective to behavioral perspective and psychological perspective have been proposed to detect or mitigate it. However, it still remains one of the most difficult security issues to combat. In this paper, we propose an ongoing effort on developing a systematic framework to address insider threat challenges by laying a scientific foundation for defensive deception,leveraging moving target defense (MTD), an emerging technique for providing proactive security measurements, and integrating deception and MTD into attribute-based access control (ABAC).
Understanding and fending off attack campaigns against organizations, companies and individuals, has become a global struggle. As today's threat actors become more determined and organized, isolated efforts to detect and reveal threats are no longer effective. Although challenging, this situation can be significantly changed if information about security incidents is collected, shared and analyzed across organizations. To this end, different exchange data formats such as STIX, CyBOX, or IODEF have been recently proposed and numerous CERTs are adopting these threat intelligence standards to share tactical and technical threat insights. However, managing, analyzing and correlating the vast amount of data available from different sources to identify relevant attack patterns still remains an open problem. In this paper we present Mantis, a platform for threat intelligence that enables the unified analysis of different standards and the correlation of threat data trough a novel type-agnostic similarity algorithm based on attributed graphs. Its unified representation allows the security analyst to discover similar and related threats by linking patterns shared between seemingly unrelated attack campaigns through queries of different complexity. We evaluate the performance of Mantis as an information retrieval system for threat intelligence in different experiments. In an evaluation with over 14,000 CyBOX objects, the platform enables retrieving relevant threat reports with a mean average precision of 80%, given only a single object from an incident, such as a file or an HTTP request. We further illustrate the performance of this analysis in two case studies with the attack campaigns Stuxnet and Regin.
In this demo, we address the problem of detecting anomalies on the Internet backbone in near real-time. Many of today's incidents may only become visible from inspecting multiple data sources and by considering multiple vantage points simultaneously. We present a setup based on the distributed forensic platform VAST that was extended to import various data streams from passive measurements and incident reporting at multiple locations, and perform an effective correlation analysis shortly after the data becomes exposed to our queries.
A Distributed Denial of Service (DDoS) attack is an attempt to make an online service, a network, or even an entire organization, unavailable by saturating it with traffic from multiple sources. DDoS attacks are among the most common and most devastating threats that network defenders have to watch out for. DDoS attacks are becoming bigger, more frequent, and more sophisticated. Volumetric attacks are the most common types of DDoS attacks. A DDoS attack is considered volumetric, or high-rate, when within a short period of time it generates a large amount of packets or a high volume of traffic. High-rate attacks are well-known and have received much attention in the past decade; however, despite several detection and mitigation strategies have been designed and implemented, high-rate attacks are still halting the normal operation of information technology infrastructures across the Internet when the protection mechanisms are not able to cope with the aggregated capacity that the perpetrators have put together. With this in mind, the present paper aims to propose and test a distributed and collaborative architecture for online high-rate DDoS attack detection and mitigation based on an in-memory distributed graph data structure and unsupervised machine learning algorithms that leverage real-time streaming data and analytics. We have successfully tested our proposed mechanism using a real-world DDoS attack dataset at its original rate in pursuance of reproducing the conditions of an actual large scale attack.
In the context of the emerging Internet of Things (IoT), a proliferation of wireless connectivity can be expected. That ubiquitous wireless communication will be hard to centrally manage and control, and can be expected to be opaque to end users. As a result, owners and users of physical space are threatened to lose control over their digital environments. In this work, we propose the idea of an IoTScanner. The IoTScanner integrates a range of radios to allow local reconnaissance of existing wireless infrastructure and participating nodes. It enumerates such devices, identifies connection patterns, and provides valuable insights for technical support and home users alike. Using our IoTScanner, we investigate metrics that could be used to classify devices and identify privacy threats in an IoT neighborhood.
Though the telecommunication protocol ARP provides the most prominent service for data transmission in the network by providing the physical layer address for any host's network layer address, its stateless nature remains one of the most well-known opportunities for the attacker community and ultimate threat for the hosts in the network. ARP cache poisoning results in numerous attacks, of which the most noteworthy ones MITM, host impersonation and DoS attacks. This paper presents various recent mitigation methods and proposes a novel mitigation system for ARP cache Poisoning Attacks. The proposed system works as follows: for any ARP Request or Reply messages a time stamp is generated. When it is received or sent by a host, the host will make cross layer inspection and IP-MAC pair matching with ARP table Entry. If ARP table entry matches and cross layer consistency is ensured then ARP reply with Time Stamp is sent. If in both the cases evaluated to be bogus packet, then the IP-MAC pair is added to the untrusted list and further packet inspection is done to ensure no attack has been deployed onto the network. The time is also noted for each entry made into the ARP table which makes ARP stateful. The system is evaluated based on criteria specified by the researchers.
Distributed Denial of Service (DDoS) attacks on web applications have been a persistent threat. Existing approaches for mitigating application layer DDoS attacks have limitations such low detection rate and inability to detect attacks targeting resource files. In this work, we propose Application layer DDoS (App-DDoS) attack detection framework by leveraging the concepts of Term Frequency (TF)-Inverse Document Frequency (IDF) and Latent Semantic Indexing (LSI). The approach involves analyzing web server logs to identify popular pages using TF-IDF; building normal resource access profile; generating query of accessed resources; and applying LSI technique to determine the similarity between a given session and known good sessions. A high-level of dissimilarity triggers a DDoS attack warning. We apply the proposed approach to traffics generated from three PHP applications. The initial results suggest that the proposed approach can identify ongoing DDoS attacks against web applications.
Vehicular Ad-Hoc Network (VANET) is a form of Peer-to-Peer (P2P) wireless communication between vehicles, which is characterized by the high mobility. In practice, VANET can be utilized to cater connections via multi-hop communication between vehicles to provide traffic information seamlessly, such as traffic jam and traffic accident, without the need of dedicated centralized infrastructure. Although dedicated infrastructures may also be involved in VANET, such as Road Side Units (RSUs), most of the time VANET relies solely on Vehicle-to-Vehicle (V2V) communication, which makes it vulnerable to several potential attacks in P2P based communication, as there are no trusted authorities that provide authentication and security. One of the potential threats is a Sybil attack, wherein an adversary uses a considerable number of forged identities to illegitimately infuse false or biased information which may mislead a system into making decisions benefiting the adversary. Avoiding Sybil attacks in VANET is a difficult problem, as there are typically no trusted authorities that provide cryptographic assurance of Sybil resilience. This paper presents a technique to detect and mitigate Sybil attacks, which requires no dedicated infrastructure, by utilizing just V2V communication. The proposed method work based on underlying assumption that says the mobility of vehicles in high vehicle density and the limited transmission power of the adversary creates unique groups of vehicle neighbors at a certain time point, which can be calculated in a statistical fashion providing a temporal and spatial analysis to verify real and impersonated vehicle identities. The proposed method also covers the mitigation procedures to create a trust model and announce neighboring vehicles regarding the detected tempered identities in a secure way utilizing Diffie-Hellman key distribution. This paper also presents discussions concerning the proposed approach with regard to benefits and drawbacks of sparse road condition and other potential threats.
Embedded systems are prone to security attacks from their limited resources available for self-protection and unsafe language typically used for application programming. Attacks targeting control flow is one of the most common exploitations for embedded systems. We propose a hardware-level, effective, and low overhead countermeasure to mitigate these types of attacks. In the proposed method, a Built-in Secure Register Bank (BSRB) is introduced to the processor micro-architecture to store the return addresses of subroutines. The inconsistency on the return addresses will direct the processor to select a clean copy to resume the normal control flow and mitigate the security threat. This proposed countermeasure is inaccessible for the programmer and does not require any compiler support, thus achieving better flexibility than software-based countermeasures. Experimental results show that the proposed method only increases the area and power by 3.8% and 4.4%, respectively, over the baseline OpenRISC processor.
Data deduplication [3] is able to effectively identify and eliminate redundant data and only maintain a single copy of files and chunks. Hence, it is widely used in cloud storage systems to save the users' network bandwidth for uploading data. However, the occurrence of deduplication can be easily identified by monitoring and analyzing network traffic, which leads to the risk of user privacy leakage. The attacker can carry out a very dangerous side channel attack, i.e., learn-the-remaining-information (LRI) attack, to reveal users' privacy information by exploiting the side channel of network traffic in deduplication [1]. In the LRI attack, the attacker knows a large part of the target file in the cloud and tries to learn the remaining unknown parts via uploading all possible versions of the file's content. For example, the attacker knows all the contents of the target file X except the sensitive information \texttheta. To learn the sensitive information, the attacker needs to upload m files with all possible values of \texttheta, respectively. If a file Xd with the value \textthetad is deduplicated and other files are not, the attacker knows that the information \texttheta = \textthetad. In the threat model of the LRI attack, we consider a general cloud storage service model that includes two entities, i.e., the user and cloud storage server. The attack is launched by the users who aim to steal the privacy information of other users [1]. The attacker can act as a user via its own account or use multiple accounts to disguise as multiple users. The cloud storage server communicates with the users through Internet. The connections from the clients to the cloud storage server are encrypted by SSL or TLS protocol. Hence, the attacker can monitor and measure the amount of network traffic between the client and server but cannot intercept and analyze the contents of the transmitted data due to the encryption. The attacker can then perform the sophisticated traffic analysis with sufficient computing resources. We propose a simple yet effective scheme, called randomized redundant chunk scheme (RRCS), to significantly mitigate the risk of the LRI attack while maintaining the high bandwidth efficiency of deduplication. The basic idea behind RRCS is to add randomized redundant chunks to mix up the real deduplication states of files used for the LRI attack, which effectively obfuscates the view of the attacker, who attempts to exploit the side channel of network traffic for the LRI attack. RRCS includes three key function modules, range generation (RG), secure bounds setting (SBS), and security-irrelevant redundancy elimination (SRE). When uploading the random-number redundant chunks, RRCS first uses RG to generate a fixed range [0,$łambda$N] ($łambda$ $ε$ (0,1]), in which the number of added redundant chunks is randomly chosen, where N is the total number of chunks in a file and $łambda$ is a system parameter. However, the fixed range may cause a security issue. SBS is used to deal with the bounds of the fixed range to avoid the security issue. There may exist security-irrelevant redundant chunks in RRCS. SRE reduces the security-irrelevant redundant chunks to improve the deduplication efficiency. The design details are presented in our technical report [5]. Our security analysis demonstrates RRCS can significantly reduce the risk of the LRI attack [5]. We examine the performance of RRCS using three real-world trace-based datasets, i.e., Fslhomes [2], MacOS [2], and Onefull [4], and compare RRCS with the randomized threshold scheme (RTS) [1]. Our experimental results show that source-based deduplication eliminates 100% data redundancy which however has no security guarantee. File-level (chunk-level) RTS only eliminates 8.1% – 16.8% (9.8% – 20.3%) redundancy, due to only eliminating the redundancy of the files (chunks) that have many copies. RRCS with $łambda$ = 0.5 eliminates 76.1% – 78.0% redundancy and RRCS with $łambda$ = 1 eliminates 47.9% – 53.6% redundancy.
With smart phones being indispensable in people's everyday life, Android malware has posed serious threats to their security, making its detection of utmost concern. To protect legitimate users from the evolving Android malware attacks, machine learning-based systems have been successfully deployed and offer unparalleled flexibility in automatic Android malware detection. In these systems, based on different feature representations, various kinds of classifiers are constructed to detect Android malware. Unfortunately, as classifiers become more widely deployed, the incentive for defeating them increases. In this paper, we explore the security of machine learning in Android malware detection on the basis of a learning-based classifier with the input of a set of features extracted from the Android applications (apps). We consider different importances of the features associated with their contributions to the classification problem as well as their manipulation costs, and present a novel feature selection method (named SecCLS) to make the classifier harder to be evaded. To improve the system security while not compromising the detection accuracy, we further propose an ensemble learning approach (named SecENS) by aggregating the individual classifiers that are constructed using our proposed feature selection method SecCLS. Accordingly, we develop a system called SecureDroid which integrates our proposed methods (i.e., SecCLS and SecENS) to enhance security of machine learning-based Android malware detection. Comprehensive experiments on the real sample collections from Comodo Cloud Security Center are conducted to validate the effectiveness of SecureDroid against adversarial Android malware attacks by comparisons with other alternative defense methods. Our proposed secure-learning paradigm can also be readily applied to other malware detection tasks.
A number of online services nowadays rely upon machine learning to extract valuable information from data collected in the wild. This exposes learning algorithms to the threat of data poisoning, i.e., a coordinate attack in which a fraction of the training data is controlled by the attacker and manipulated to subvert the learning process. To date, these attacks have been devised only against a limited class of binary learning algorithms, due to the inherent complexity of the gradient-based procedure used to optimize the poisoning points (a.k.a. adversarial training examples). In this work, we first extend the definition of poisoning attacks to multiclass problems. We then propose a novel poisoning algorithm based on the idea of back-gradient optimization, i.e., to compute the gradient of interest through automatic differentiation, while also reversing the learning procedure to drastically reduce the attack complexity. Compared to current poisoning strategies, our approach is able to target a wider class of learning algorithms, trained with gradient-based procedures, including neural networks and deep learning architectures. We empirically evaluate its effectiveness on several application examples, including spam filtering, malware detection, and handwritten digit recognition. We finally show that, similarly to adversarial test examples, adversarial training examples can also be transferred across different learning algorithms.