Biblio
Authorship attribution is a classical classification problem. We use it here to illustrate the performance of a compression-based measure that relies on the notion of relative compression. Besides comparing with recent approaches that use multiple discriminant analysis and support vector machines, we compare it with the Normalized Conditional Compression Distance (a direct approximation of the Normalized Information Distance) and the popular Normalized Compression Distance. The Normalized Relative Compression (NRC) attained 100% correct classification in the data set used, showing consistency between the compression ratio and the classification performance, a characteristic not always present in other compression-based measures.
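As a rough illustration of how such compression-based measures can be computed in practice, the sketch below approximates the NCD and an NRC-style relative measure with an off-the-shelf compressor (Python's lzma). It is not the authors' implementation: a faithful NRC compresses the target using a model built exclusively from the reference string, whereas here the conditional cost is approximated by C(yx) - C(y).

```python
# A minimal sketch, assuming an off-the-shelf compressor (lzma) as a stand-in;
# a faithful NRC needs a compressor whose model is built only from the
# reference y, so nrc_proxy below is only a rough approximation.
import lzma

def c(data: bytes) -> int:
    """Compressed size of `data`, in bytes."""
    return len(lzma.compress(data))

def ncd(x: bytes, y: bytes) -> float:
    """Normalized Compression Distance."""
    cx, cy, cxy = c(x), c(y), c(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

def nrc_proxy(x: bytes, y: bytes) -> float:
    """Rough proxy for relative compression C(x||y)/|x|: extra bytes needed
    to describe x once y has been seen, per byte of x."""
    return max(c(y + x) - c(y), 0) / max(len(x), 1)

def attribute(unknown: bytes, references: dict[str, bytes]) -> str:
    """Assign the unknown text to the reference author with the smallest measure."""
    return min(references, key=lambda author: nrc_proxy(unknown, references[author]))

# Hypothetical usage: `references` maps author name -> concatenated known texts.
# author = attribute(open("disputed.txt", "rb").read(), references)
```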
Authorship attribution is a stylometric technique that associates texts with authors based on their writing styles. Researchers have looked for ways to analyze the context of these texts, in some cases with limited results. Most approaches view information at the syntactic and physical levels and tend to ignore information from the semantic level. In this paper, we present a technique that incorporates semantic frames as a method for authorship attribution. We hypothesize that they provide a deeper view into the semantic level of texts, which is an influencing factor in a writer's style. We use a variety of online resources in a pipeline fashion to extract information about frames within the text. The results show that our “bag of frames” approach can be used successfully for stylometry.
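To make the “bag of frames” idea concrete, here is a minimal sketch, not the authors' pipeline: it assumes frame extraction (e.g. with a FrameNet-style parser) has already been run upstream, so each document is simply a sequence of evoked frame names, which are counted and fed to an off-the-shelf classifier. The toy data and the choice of LinearSVC are assumptions made for the example.

```python
# A minimal "bag of frames" sketch: documents are assumed to already be
# strings of evoked frame names; frame counts feed a linear classifier.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Hypothetical training data: frame sequences and their known authors.
docs = [
    "Motion Arriving Statement Perception_active",
    "Commerce_buy Judgment_communication Statement Emotion_directed",
    "Motion Departing Statement Perception_active",
    "Commerce_sell Judgment_communication Emotion_directed Statement",
]
authors = ["A", "B", "A", "B"]

# Each document becomes a vector of frame counts (the "bag of frames").
model = make_pipeline(
    CountVectorizer(analyzer=str.split),  # tokens are frame names, not words
    LinearSVC(),
)
model.fit(docs, authors)

print(model.predict(["Arriving Motion Statement"]))  # -> predicted author
```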
Security in Mobile Ad Hoc Networks is still an ongoing research topic in the scientific community, and it is difficult to provide an overall security solution. In this paper we assess the feasibility of distributed firewall solutions in Mobile Ad Hoc Networks. Attention is also given to other security solutions for ad hoc networks. We propose a security architecture that secures the network on several layers and is the most secure solution among the approaches analyzed. For this purpose we use a distributed public key infrastructure, a distributed firewall and an intrusion detection system. Our architecture uses both symmetric and asymmetric cryptography, and in this paper we present performance measurements and a security analysis of our solution.
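As a small, hedged illustration of combining asymmetric and symmetric cryptography (not the architecture evaluated in the paper), the sketch below wraps a symmetric session key with RSA-OAEP and then uses it to protect a message such as a firewall rule update; the node names and payload are hypothetical.

```python
# A minimal hybrid-cryptography sketch using the `cryptography` package; it
# only illustrates the key-handling pattern, not the paper's architecture.
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.fernet import Fernet

oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)

# Node key pair (in a MANET this would be issued by the distributed PKI).
node_private = rsa.generate_private_key(public_exponent=65537, key_size=2048)
node_public = node_private.public_key()

# Sender: generate a symmetric session key, wrap it with the node's public
# key, and encrypt the payload (e.g. a firewall rule update) symmetrically.
session_key = Fernet.generate_key()
wrapped_key = node_public.encrypt(session_key, oaep)
ciphertext = Fernet(session_key).encrypt(b"hypothetical firewall rule update")

# Receiver: unwrap the session key with the private key, then decrypt.
plaintext = Fernet(node_private.decrypt(wrapped_key, oaep)).decrypt(ciphertext)
assert plaintext == b"hypothetical firewall rule update"
```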
Cyber-attacks are cheap, easy to conduct and often pose little risk in terms of attribution, but their impact can be lasting. Attribution is low because tracing cyber-attacks is primitive in the current network architecture. Moreover, even when attribution is known, the absence of enforcement provisions in international law makes cyber-attacks difficult to litigate, and hence attribution is hardly a deterrent. Rather than attributing attacks, we can view cyber-attacks as societal events associated with social, political, economic and cultural (SPEC) motivations. Because it is possible to observe SPEC motives on the internet, social media data could be valuable in understanding cyber-attacks. In this research, we use sentiment in Twitter posts to observe country-to-country perceptions, and Arbor Networks data to build a ground truth of country-to-country DDoS cyber-attacks. Using this dataset, this research makes three important contributions: a) We evaluate the impact of heightened sentiment towards a country on the trend of cyber-attacks received by that country. We find that, for some countries, the probability of attacks increases by up to 27% while they experience negative sentiment from other nations. b) Using cyber-attack trends and sentiment trends, we build a decision tree model to find attacks that could be related to extreme sentiments. c) To verify our model, we describe three examples in which cyber-attacks follow increased tension between nations, as perceived in social media.
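A minimal sketch of the decision-tree step follows, trained on synthetic data under assumed features (weekly sentiment statistics for a country pair and a synthetic label for elevated attack activity); it is not the authors' model or dataset, only an illustration of how such a classifier is fit.

```python
# A minimal decision-tree sketch on synthetic sentiment/attack features.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)

# Hypothetical per country-pair, per-week features:
# [mean sentiment, share of negative posts, change in sentiment vs. prior week]
X = rng.normal(size=(200, 3))
# Synthetic label: 1 if the target country saw elevated DDoS counts next week.
y = (X[:, 1] - X[:, 0] + rng.normal(scale=0.5, size=200) > 0.8).astype(int)

clf = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X, y)
print(clf.predict([[-1.2, 2.0, -0.5]]))  # strongly negative week -> likely 1
```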
Malware is an ever-increasing threat to personal, corporate, and government computing systems alike. Particularly in the corporate and government sectors, the attribution of malware—including the identification of the authorship of malware as well as potentially the malefactor responsible for an attack—is of growing interest. Such malware attribution is often enabled by the fact that malware authors build on the work of others through the use of generators, libraries, and borrowed code. Determining malware phylogeny—the evolutionary history of and the derivative relations between malware—is consequently an endeavor of increasing importance, with a growing focus on the dynamic analysis of malware, which involves executing a malware sample and determining the actions it takes after some period of operation. In most cases, such dynamic analysis occurs in a virtual machine, or "sandbox," in order to confine the malware to an environment in which it can do no harm to real systems. In sandbox-driven dynamic analysis of malware, a virtual machine is typically run starting from some known, malware-free baseline state. The malware is injected into the virtual machine, and the machine is allowed to run for some period of time during which the malware presumably activates. The machine is then suspended, and the current machine memory is dumped to disk. The process may then be repeated for other malware samples, each time starting from the baseline state. Stored in raw form on disk, the dumped memory file is the same size as the virtual-machine memory; for virtual machines running modern operating systems, such memory would likely be no less than 512 MB and could be up to several GB. If the corresponding memory dumps are to be retained for repeated analysis—as is likely to be required in order to determine a phylogeny for a large database of malware samples—lossless compression of the memory dumps is necessary to prevent explosive disk usage. For example, the VirusShare project maintains a database of over 19 million malware samples; running these in a virtual machine with 512 MB of memory would require in excess of 9 petabytes (PB) of storage to retain the memory dumps. In this paper, we develop a scheme for the lossless compression of memory dumps resulting from the repeated execution of malware samples in a virtual-machine sandbox. Rather than compress each memory dump individually, we capitalize on the fact that memory dumps stem from a known baseline virtual-machine state and code with respect to this baseline memory. Additionally, to further improve compression efficiency, we exploit the fact that a significant portion of the difference between the baseline memory and that of the currently running machine is the result of the loading of known executable programs and shared libraries. Consequently, we propose delta coding to compress the current virtual-machine memory dump by coding its differences with respect to a predicted memory image, with the latter formed by duplicating the loading of the executables and libraries into the baseline memory, resulting in a significant improvement in compression performance over straightforward delta coding alone. In experimental results for a body of malware samples, the proposed approach outperformed the widely used xdelta3 delta coder by approximately 20% and the popular generic gzip coder by 79%.
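The core idea (delta coding a dump against a predicted image built by replaying known loads into the baseline) can be sketched as follows. This is only an illustration under simplifying assumptions: equal-length images, zlib in place of an xdelta3-class coder, and hypothetical offsets and helper names; the real prediction step is far more involved.

```python
# A minimal sketch of baseline-predicted delta coding, not the paper's coder.
import zlib

def predict_image(baseline: bytes, loads: list[tuple[int, bytes]]) -> bytes:
    """Overlay known executable/library bytes onto the baseline memory image."""
    image = bytearray(baseline)
    for offset, blob in loads:
        image[offset:offset + len(blob)] = blob
    return bytes(image)

def delta_encode(dump: bytes, predicted: bytes) -> bytes:
    """XOR the dump against the prediction (zeros wherever they agree), then compress."""
    diff = bytes(a ^ b for a, b in zip(dump, predicted))
    return zlib.compress(diff, level=9)

def delta_decode(code: bytes, predicted: bytes) -> bytes:
    diff = zlib.decompress(code)
    return bytes(a ^ b for a, b in zip(diff, predicted))

# Hypothetical usage:
# predicted  = predict_image(baseline_dump, [(0x00400000, loaded_exe_bytes)])
# compressed = delta_encode(malware_dump, predicted)
# assert delta_decode(compressed, predicted) == malware_dump
```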
A major challenge in cyber-threat analysis is combining information from different sources to find the person or group responsible for a cyber-attack. It is one of the most important technical and policy challenges in cybersecurity. The lack of ground truth about the individual responsible for an attack has limited previous studies. In this paper, we take a first step towards overcoming this limitation by building a dataset from the capture-the-flag event held at DEFCON, and we propose an argumentation model based on a formal reasoning framework called DeLP (Defeasible Logic Programming) designed to aid an analyst in attributing a cyber-attack. We build models from latent variables to reduce the search space of culprits (attackers), and show that this reduction significantly improves the accuracy of classification-based approaches in identifying the attacker, from 37% to 62%.
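The "reduce the search space with latent variables, then classify" step can be illustrated roughly as below on synthetic data; the feature counts, the truncated-SVD latent model and the nearest-neighbour classifier are assumptions for demonstration and are not the paper's DeLP-based model.

```python
# An illustrative latent-variable reduction plus classification sketch.
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(1)
n_attacks, n_features, n_teams = 400, 50, 20

# Hypothetical per-attack feature counts (techniques, tools, artifacts) and
# ground-truth team labels, loosely mimicking a capture-the-flag dataset.
team_profiles = rng.gamma(shape=2.0, scale=1.0, size=(n_teams, n_features))
y = rng.integers(0, n_teams, size=n_attacks)
X = rng.poisson(team_profiles[y])

# Latent variables: project attacks into a low-dimensional "style" space,
# shrinking the space in which candidate culprits must be compared.
Z = TruncatedSVD(n_components=8, random_state=0).fit_transform(X)

# Classify in the latent space and report held-out attribution accuracy.
clf = KNeighborsClassifier(n_neighbors=5).fit(Z[:300], y[:300])
print("held-out attribution accuracy:", clf.score(Z[300:], y[300:]))
```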
The microelectronics industry seeks screening tools that can be used to verify the origin of integrated circuits (ICs) and track them throughout their lifecycle. Embedded circuits that measure process variation of an IC are well known. This paper adds to previous work using these circuits to study manufacturer characteristics on final-product ICs, particularly for the purpose of developing and verifying a signature for a microelectronics manufacturing facility (fab). We present the design, measurements and analysis of 159 silicon ICs, which were built as a proof of concept for this purpose. 80 copies of our proof-of-concept IC were built at one fab, and 80 more copies were built across two lots at a second fab. Using these ICs, our prototype circuits allowed us to distinguish the two fabs with up to 98.7% accuracy and also distinguish the two lots from the second fab with up to 98.8% accuracy.
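For illustration only, here is a sketch of the final classification step: given per-chip sensor measurements that capture process variation (synthetic in this example), a simple discriminant classifier separates the two fabs. The measurement model and classifier choice are assumptions, not the paper's analysis.

```python
# An illustrative fab-identification sketch on synthetic measurements.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)

# Hypothetical data: 80 chips per fab, 32 on-die process-variation readings
# each, with a small fab-dependent shift on top of per-chip variation.
fab_a = rng.normal(loc=1.00, scale=0.05, size=(80, 32))
fab_b = rng.normal(loc=1.03, scale=0.05, size=(80, 32))
X = np.vstack([fab_a, fab_b])
y = np.array([0] * 80 + [1] * 80)

clf = LinearDiscriminantAnalysis()
print("cross-validated fab-identification accuracy:",
      cross_val_score(clf, X, y, cv=5).mean())
```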
The capability to reliably and accurately identify the attacker has long been believed to be one of the most effective deterrents to an attack. Ideally, the attribution of a cyber attack should be automated from the attack target all the way back to the attack source on the Internet in real time. Real-time, network-wide attack attribution, however, is very challenging, and many people have doubted whether practical attack attribution on the Internet is feasible. In this paper, we look into the problem and challenges of real-time attack attribution on the Internet, and analyze what it takes to achieve it. We show that it is indeed feasible and practical to attribute certain cyber attacks on the Internet in real time. We build such a real-time attack attribution system upon the malware immunization and packet flow watermarking techniques we have developed. We demonstrate the unprecedented real-time attack attribution capability via live experiments on the Internet and with Tor nodes all over the world.
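As a toy illustration of packet-flow watermarking in general (not the authors' scheme), the sketch below embeds watermark bits by nudging inter-packet delays onto even or odd multiples of a step size, so the bits can later be read back from observed timing; real systems must additionally survive network jitter, repacketization and relays such as Tor.

```python
# A toy interval-quantization flow-watermarking sketch.

def embed(ipds: list[float], bits: list[int], step: float = 0.02) -> list[float]:
    """Return watermarked inter-packet delays (delays are only ever lengthened)."""
    marked = []
    for ipd, bit in zip(ipds, bits):
        q = round(ipd / step)
        if q % 2 != bit:
            q += 1            # move to the next multiple with the right parity
        marked.append(q * step)
    return marked

def extract(ipds: list[float], step: float = 0.02) -> list[int]:
    """Recover the embedded bits from observed inter-packet delays."""
    return [round(ipd / step) % 2 for ipd in ipds]

ipds = [0.137, 0.052, 0.091, 0.210]
bits = [1, 0, 1, 1]
assert extract(embed(ipds, bits)) == bits
```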
In interconnected power systems, dynamic model reduction can be applied to generators outside the area of interest (i.e., the study area) to reduce the computational cost associated with transient stability studies. This paper presents a method of deriving the reduced dynamic model of the external area based on dynamic response measurements. The method consists of three steps, namely dynamic-feature extraction, attribution, and reconstruction (DEAR). In this method, a feature extraction technique, such as singular value decomposition (SVD), is applied to the measured generator dynamics after a disturbance. Characteristic generators are then identified in the feature attribution step as those matching the extracted dynamic features with the highest similarity, forming a suboptimal “basis” of system dynamics. In the reconstruction step, generator state variables such as rotor angles and voltage magnitudes are approximated by a linear combination of the characteristic generators, resulting in a quasi-nonlinear reduced model of the original system. The network model is unchanged in the DEAR method. Tests on several IEEE standard systems show that the proposed method yields a better reduction ratio and smaller response errors than traditional coherency-based reduction methods.
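A small numerical sketch of the DEAR steps on synthetic response data follows (not the paper's implementation): SVD extracts dominant temporal features, the generators projecting most strongly onto those features are taken as the characteristic set, and the remaining trajectories are reconstructed by least squares. The mode shapes, noise level and number of retained features are assumptions.

```python
# A minimal DEAR-style sketch: feature extraction, attribution, reconstruction.
import numpy as np

rng = np.random.default_rng(0)
n_gen, n_samples = 30, 500
t = np.linspace(0.0, 10.0, n_samples)

# Hypothetical post-disturbance rotor-angle deviations: each generator is a
# random mix of a few damped oscillatory modes plus measurement noise.
modes = np.vstack([np.sin(2 * np.pi * f * t) * np.exp(-d * t)
                   for f, d in [(0.3, 0.1), (0.7, 0.2), (1.2, 0.3)]])
X = rng.normal(size=(n_gen, 3)) @ modes + 0.01 * rng.normal(size=(n_gen, n_samples))

# 1) Feature extraction: dominant temporal features via SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
features = Vt[:3]

# 2) Attribution: the generator projecting most strongly onto each feature
#    becomes a characteristic generator (a suboptimal "basis" of the dynamics).
scores = np.abs(X @ features.T)
characteristic = np.unique(scores.argmax(axis=0))

# 3) Reconstruction: approximate every generator as a linear combination of
#    the characteristic ones (least squares), leaving the network model intact.
basis = X[characteristic]
coeff, *_ = np.linalg.lstsq(basis.T, X.T, rcond=None)
X_hat = (basis.T @ coeff).T

print("characteristic generators:", characteristic)
print("max reconstruction error:", np.abs(X - X_hat).max())
```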