Biblio

Found 262 results

Filters: Keyword is Semantics
2017-12-12
Zhou, G., Huang, J. X..  2017.  Modeling and Learning Distributed Word Representation with Metadata for Question Retrieval. IEEE Transactions on Knowledge and Data Engineering. 29:1226–1239.

Community question answering (cQA) has become an important issue due to the popularity of cQA archives on the Web. This paper focuses on addressing the lexical gap problem in question retrieval. Question retrieval in cQA archives aims to find existing questions that are semantically equivalent or relevant to the queried question; the lexical gap, however, poses a new challenge for this task. In this paper, we propose to model and learn distributed word representations with the metadata of category information within cQA pages for question retrieval, using two novel category-powered models: a basic model called MB-NET and an enhanced model called ME-NET, which can better learn the distributed word representations and alleviate the lexical gap problem. To deal with the variable size of word representation vectors, we employ the Fisher kernel framework to transform them into fixed-length vectors. Experimental results on large-scale English and Chinese cQA data sets show that our proposed approaches significantly outperform state-of-the-art retrieval models for question retrieval in cQA. Moreover, large-scale automatic evaluation experiments show that promising and significant performance improvements can be achieved.
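The Fisher-kernel step described above can be approximated as follows: fit a Gaussian mixture over word vectors and encode a variable-length sentence by the gradient of its log-likelihood with respect to the component means, which has fixed length. A minimal sketch, assuming scikit-learn and NumPy, with random vectors standing in for the learned MB-NET/ME-NET embeddings:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(word_vecs, gmm):
    """Encode a variable-length set of word vectors as a fixed-length
    Fisher vector (gradients w.r.t. the GMM component means only)."""
    gamma = gmm.predict_proba(word_vecs)           # (N, K) soft assignments
    mu, w = gmm.means_, gmm.weights_               # (K, D), (K,)
    sigma = np.sqrt(gmm.covariances_)              # (K, D), diagonal model
    N = word_vecs.shape[0]
    fv = []
    for k in range(gmm.n_components):
        diff = (word_vecs - mu[k]) / sigma[k]      # (N, D) normalized residuals
        fv.append(gamma[:, k] @ diff / (N * np.sqrt(w[k])))
    return np.concatenate(fv)                      # fixed length K*D

# Toy usage: 200 "word vectors" of dimension 50, a 4-component GMM.
rng = np.random.default_rng(0)
vocab_vecs = rng.normal(size=(200, 50))
gmm = GaussianMixture(n_components=4, covariance_type="diag").fit(vocab_vecs)
sentence = vocab_vecs[rng.integers(0, 200, size=7)]  # a 7-word "sentence"
print(fisher_vector(sentence, gmm).shape)            # -> (200,)
```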

Ktob, A., Li, Z..  2017.  The Arabic Knowledge Graph: Opportunities and Challenges. 2017 IEEE 11th International Conference on Semantic Computing (ICSC). :48–52.

The Semantic Web has brought forth the idea of computing with knowledge, thereby attributing the ability of thinking to machines. Knowledge graphs represent a major advancement in the construction of the Web of Data, in which machines are context-aware when answering users' queries. The English Knowledge Graph, realized by Google in 2012, was a milestone. Although it is a useful source of information for English users and applications, it offers little for Arabic users and applications. In this paper, we investigate the challenges and opportunities that arise across the life-cycle of constructing the Arabic Knowledge Graph (AKG), following established best practices and techniques, and we suggest potential solutions to these challenges. The proprietary nature of data is a major obstacle to harvesting it. Moreover, even when Arabic data is openly available, it is generally unstructured and requires further processing. The complexity of the Arabic language itself further complicates any automatic or semi-automatic extraction process, so the use of NLP techniques is a feasible solution. Some preliminary results are presented later in this paper. The AKG has very promising outcomes for the Semantic Web in general and the Arabic community in particular. Its main goal is the integration of the different isolated datasets available on the Web; it can later be used in both the academic sector (by providing a large dataset for many research fields and enhancing discovery) and the commercial sector (by improving search engines, providing metadata, and interlinking businesses).

2017-11-20
Zhengqiu, H., Fangxia, X., Wenfu, L., Rongmao, H., Zhongfu, X..  2016.  Research of Secure Service Composition Based on Semantic Security Policy. 2016 IEEE International Conference on Internet of Things (iThings) and IEEE Green Computing and Communications (GreenCom) and IEEE Cyber, Physical and Social Computing (CPSCom) and IEEE Smart Data (SmartData). :246–251.

Accurately expressing and matching the security policy of each participant is the precondition for constructing a secure service composition. Most existing schemes use syntactic approaches to represent and match security policies during service composition, which are prone to false negatives because they lack semantics. In this paper, a novel semantics-based approach is proposed to express and match security policies in service composition. By constructing a general security ontology, we present a definition method and a matching algorithm for semantic security policies in service composition, translating the policy-matching problem into a subsumption-reasoning problem over semantic concepts. Both theoretical analysis and experimental evaluation show that the proposed approach captures the necessary semantic information in policy representations and effectively improves the accuracy of matching, thus overcoming the deficiencies of syntactic approaches. It also simplifies the definition and management of policies, thereby providing a more effective solution for building secure service compositions based on security policies.
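Reducing policy matching to concept subsumption can be illustrated with a toy ontology: a requested concept matches an advertised one if the advertised concept is the same as, or a descendant of, the requested one. A minimal sketch, with an invented ontology and policy names purely for illustration (a real system would reason over an OWL ontology with a description-logic reasoner):

```python
# Toy security ontology: child concept -> parent concept.
ONTOLOGY = {
    "AES-256": "SymmetricEncryption",
    "SymmetricEncryption": "Encryption",
    "RSA-2048": "AsymmetricEncryption",
    "AsymmetricEncryption": "Encryption",
}

def subsumes(general, specific):
    """True if `specific` is `general` or a descendant of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = ONTOLOGY.get(specific)
    return False

def policy_matches(required, offered):
    """Every required concept must subsume some offered concept."""
    return all(any(subsumes(req, off) for off in offered) for req in required)

print(policy_matches({"Encryption"}, {"AES-256"}))            # True
print(policy_matches({"AsymmetricEncryption"}, {"AES-256"}))  # False
```

A purely syntactic matcher would reject the first query because the strings "Encryption" and "AES-256" differ; subsumption reasoning recovers the match, which is exactly the false-negative problem the paper addresses.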

You, L., Li, Y., Wang, Y., Zhang, J., Yang, Y..  2016.  A deep learning-based RNNs model for automatic security audit of short messages. 2016 16th International Symposium on Communications and Information Technologies (ISCIT). :225–229.

Traditional text classification methods usually follow this process: a sentence is first treated as a bag of words (BOW) and transformed into a sentence feature vector, which is then classified by methods such as maximum entropy (ME), Naive Bayes (NB), or support vector machines (SVM). However, these methods often do not achieve ideal results on text classification, chiefly because the semantic relations between words are very important for text categorization and the traditional methods cannot capture them. Sentiment classification, a special case of text classification, is binary classification (positive or negative). Inspired by sentiment analysis, we use a novel deep learning-based recurrent neural network (RNN) model for the automatic security audit of short messages from prisons, classifying each message as secure or insecure. In this paper, short-message features are extracted by word2vec, which captures word-order information, and each sentence is mapped to a feature vector; in particular, words with similar meanings are mapped to nearby positions in the vector space. The vectors are then classified by RNNs, which are now widely used and whose network structure makes them well suited to processing sequence data. We preprocess the short messages, extract typical features from existing secure and insecure short messages via word2vec, and classify the messages with RNNs, which accept a fixed-size vector as input and produce a fixed-size vector as output. The experimental results show that the RNN model achieves an average accuracy of 92.7%, which is higher than SVM.
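The pipeline above (embed each token, feed the sequence to an RNN, classify from the final hidden state) can be sketched in PyTorch as follows. This is a generic sketch, not the paper's model: the vocabulary size, dimensions, and two-class output are illustrative assumptions, and word2vec pretraining is replaced by a trainable embedding layer:

```python
import torch
import torch.nn as nn

class ShortMessageRNN(nn.Module):
    """Embed token ids, run an RNN, classify from the last hidden state."""
    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.RNN(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, 2)    # secure vs. insecure

    def forward(self, token_ids):             # (batch, seq_len)
        x = self.embed(token_ids)             # (batch, seq_len, embed_dim)
        _, h_n = self.rnn(x)                  # h_n: (1, batch, hidden_dim)
        return self.fc(h_n.squeeze(0))        # (batch, 2) class logits

model = ShortMessageRNN()
batch = torch.randint(0, 5000, (8, 20))       # 8 messages of 20 tokens each
logits = model(batch)
print(logits.shape)                           # torch.Size([8, 2])
```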

2017-08-02
Dolz, Manuel F., del Rio Astorga, David, Fernández, Javier, García, J. Daniel, García-Carballeira, Félix, Danelutto, Marco, Torquati, Massimo.  2016.  Embedding Semantics of the Single-Producer/Single-Consumer Lock-Free Queue into a Race Detection Tool. Proceedings of the 7th International Workshop on Programming Models and Applications for Multicores and Manycores. :20–29.

The rapid progress of multi-/many-core architectures has left many data-intensive parallel applications unable to extract maximum performance. The advent of parallel programming frameworks offering structured patterns has alleviated developers' burden in adapting such applications to parallel platforms. For example, the use of synchronization mechanisms in multithreaded applications is essential on shared-cache multi-core architectures. However, ensuring an appropriate use of their interfaces can be challenging, since different memory models plus instruction reordering at the compiler/processor level may influence the occurrence of data races. Race detectors are highly valuable in this respect; nevertheless, when lock-free data structures without high-level atomics are used, they may emit false positives. In this paper, we extend the ThreadSanitizer race detection tool to support the semantics of the general Single-Producer/Single-Consumer (SPSC) lock-free parallel queue and to detect benign data races where the queue is used correctly. For our analysis, we leverage the FastFlow SPSC bounded lock-free queue implementation to test our extensions over a set of μ-benchmarks and real applications on a dual-socket Intel Xeon CPU E5-2695 platform. We demonstrate that this approach reduces the number of data-race warning messages by 30% on average.
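The SPSC queue semantics being taught to the race detector can be illustrated with a bounded ring buffer in which the producer writes only the head index and the consumer writes only the tail index, so the two threads never write the same variable. A simplified Python sketch of the algorithm (the actual FastFlow queue is C++ and depends on memory-fence details that Python cannot express):

```python
class SPSCQueue:
    """Bounded single-producer/single-consumer ring buffer.
    The producer advances `head`, the consumer advances `tail`;
    neither index is ever written by the other thread."""
    def __init__(self, capacity):
        self.buf = [None] * (capacity + 1)  # one slot kept empty
        self.head = 0   # written by the producer only
        self.tail = 0   # written by the consumer only

    def push(self, item):                   # producer thread
        nxt = (self.head + 1) % len(self.buf)
        if nxt == self.tail:
            return False                    # queue full
        self.buf[self.head] = item
        self.head = nxt                     # publish after the write
        return True

    def pop(self):                          # consumer thread
        if self.tail == self.head:
            return None                     # queue empty
        item = self.buf[self.tail]
        self.tail = (self.tail + 1) % len(self.buf)
        return item

q = SPSCQueue(4)
q.push("pkt-1"); q.push("pkt-2")
print(q.pop(), q.pop())                     # pkt-1 pkt-2
```

A conventional race detector sees the unsynchronized reads of `head` and `tail` across threads as races; encoding the queue's discipline lets it classify them as benign.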

2017-03-08
Shen, M., Liu, F..  2015.  Query of Uncertain QoS of Web Service. 2015 IEEE 12th Intl Conf on Ubiquitous Intelligence and Computing and 2015 IEEE 12th Intl Conf on Autonomic and Trusted Computing and 2015 IEEE 15th Intl Conf on Scalable Computing and Communications and Its Associate. :1780–1785.

Quality of service (QoS) has been considered a significant criterion for querying among functionally similar web services. Most research focuses on searching QoS under certain (deterministic) data, which may not cover some practical scenarios. Recent approaches for uncertain QoS of web services deal with discrete data domains. In this paper, we build the search of QoS under continuous probability distributions. We define two kinds of queries under uncertain QoS and form optimization approaches for specific distributions; based on these, the search is extended to general cases. Experiments show the feasibility of the proposed methods.
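A query over continuously distributed QoS can be phrased as "which services meet a latency bound with at least probability p?". A minimal sketch assuming normally distributed response times, with invented service names and parameters (SciPy supplies the CDF):

```python
from scipy.stats import norm

# Hypothetical services: response time ~ Normal(mean_ms, std_ms).
services = {"svc-A": (100, 10), "svc-B": (90, 40), "svc-C": (150, 5)}

def meets_sla(mean, std, bound_ms, min_prob):
    """P(response_time <= bound_ms) must be at least min_prob."""
    return norm.cdf(bound_ms, loc=mean, scale=std) >= min_prob

qualified = [name for name, (m, s) in services.items()
             if meets_sla(m, s, bound_ms=130, min_prob=0.9)]
print(qualified)   # ['svc-A']
```

Note how uncertainty changes the answer: svc-B has the lowest mean latency, but its high variance makes it fail the 90% guarantee, which a deterministic (mean-only) query would miss.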

2017-02-23
Jia, L., Sen, S., Garg, D., Datta, A..  2015.  "A Logic of Programs with Interface-Confined Code". 2015 IEEE 28th Computer Security Foundations Symposium. :512–525.

Interface-confinement is a common mechanism that secures untrusted code by executing it inside a sandbox. The sandbox limits (confines) the code's interaction with key system resources to a restricted set of interfaces. This practice is seen in web browsers, hypervisors, and other security-critical systems. Motivated by these systems, we present a program logic, called System M, for modeling and proving safety properties of systems that execute adversary-supplied code via interface-confinement. In addition to using computation types to specify the effects of computations, System M includes a novel invariant type to specify the properties of interface-confined code. The interpretation of the invariant type includes terms whose effects satisfy an invariant. We construct a step-indexed model over traces and prove the soundness of System M relative to this model. System M is the first program logic that allows proofs of safety for programs that execute adversary-supplied code without forcing the adversarial code to be available for deep static analysis. System M can be used to model and verify protocols as well as system designs. We demonstrate the reasoning principles of System M by verifying the state-integrity property of the design of Memoir, a previously proposed trusted computing system.

J. Zhang.  2015.  "Semantic-Based Searchable Encryption in Cloud: Issues and Challenges". 2015 First International Conference on Computational Intelligence Theory, Systems and Applications (CCITSA). :163-165.

Searchable encryption is a developing information-security technique that enables users to search over encrypted data through keywords without first decrypting it. In the last decade, many researchers have engaged in the field of searchable encryption and proposed a series of efficient search schemes over encrypted cloud data. It is time to survey this field and derive a comprehensive framework by analyzing the individual contributions. This paper focuses on searchable encryption schemes in the cloud. We first summarize the general model and threat model of searchable encryption schemes, and then present the privacy-preserving issues in these schemes. In addition, we compare the efficiency and security of semantic search and preferred search in detail. Finally, some open issues and research challenges for the future are proposed.
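The core idea of keyword search over encrypted data can be illustrated with a toy symmetric scheme: the client builds an index keyed by deterministic HMAC "trapdoors" of keywords, so the server can match a queried trapdoor without ever seeing the plaintext keyword. A deliberately simplified standard-library sketch (real schemes also encrypt the document identifiers and hide access patterns, which this toy omits):

```python
import hmac, hashlib
from collections import defaultdict

KEY = b"client-secret-key"           # never leaves the client

def trapdoor(keyword):
    """Deterministic token for a keyword; unforgeable without KEY."""
    return hmac.new(KEY, keyword.encode(), hashlib.sha256).hexdigest()

# Client side: build the searchable index before uploading.
docs = {"doc1": ["cloud", "audit"], "doc2": ["cloud", "privacy"]}
index = defaultdict(list)
for doc_id, keywords in docs.items():
    for kw in keywords:
        index[trapdoor(kw)].append(doc_id)

# Server side: match a query trapdoor against the index, no decryption.
query = trapdoor("cloud")            # computed by the client
print(index.get(query, []))          # ['doc1', 'doc2']
```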

2017-02-14
B. C. M. Cappers, J. J. van Wijk.  2015.  "SNAPS: Semantic network traffic analysis through projection and selection". 2015 IEEE Symposium on Visualization for Cyber Security (VizSec). :1-8.

Most network traffic analysis applications are designed to discover malicious activity by relying only on high-level flow-based message properties. However, to detect security breaches that are specifically designed to target one network (e.g., Advanced Persistent Threats), deep packet inspection and anomaly detection are indispensable. In this paper, we focus on how we can support experts in discovering whether anomalies at the message level imply a security risk at the network level. In SNAPS (Semantic Network traffic Analysis through Projection and Selection), we provide a bottom-up, pixel-oriented approach for network traffic analysis in which the expert starts with low-level anomalies and iteratively gains insight into higher-level events by creating multiple selections of interest in parallel. The tight integration between visualization and machine learning enables the expert to iteratively refine anomaly scores, making the approach suitable for both post-traffic analysis and online monitoring tasks. To illustrate the effectiveness of this approach, we present example explorations on two real-world data sets for detecting and understanding potential Advanced Persistent Threats in progress.

A. K. M. A., J. C. D..  2015.  "Execution Time Measurement of Virtual Machine Volatile Artifacts Analyzers". 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS). :314-319.

Due to the rapid growth of virtualization environments, Virtual Machines (VMs) are a target for attackers seeking privileged access to the virtual infrastructure. Advanced Persistent Threats (APTs) such as malware, rootkits, and spyware are potent enough to bypass the existing defense mechanisms designed for VMs. To address this issue, Virtual Machine Introspection (VMI) has emerged as a promising approach that monitors the run state of a VM externally, from the hypervisor. However, the limitation of VMI lies in the semantic gap. An open-source tool called LibVMI addresses the semantic gap. Memory Forensic Analysis (MFA) tools such as Volatility can also be used to address it, but they need a captured memory dump (RAM) as input. Memory-dump acquisition time and analysis time are highly crucial if an Intrusion Detection System (IDS) depends on the data supplied by the MFA or VMI tool. In this work, the live virtual machine RAM-dump acquisition time of LibVMI is measured. In addition, the memory-dump analysis time consumed by Volatility is measured and compared with another memory analyzer, Rekall. Experimental results show that Rekall takes more execution time than Volatility for most plugins. Further, Volatility and Rekall are compared with LibVMI; examining volatile data through LibVMI is faster because it eliminates the memory-dump acquisition time.
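Such measurements reduce to timing each tool's plugin runs over the same dump. A minimal harness sketch, assuming both tools are on the PATH; the dump path and plugin list are placeholders, and the command lines are simplified (Volatility 2, for instance, also needs a --profile argument):

```python
import subprocess
import time

DUMP = "mem.dump"                      # placeholder memory image
PLUGINS = ["pslist", "netscan"]        # plugins measured in both tools

def timed_run(cmd):
    """Run a command and return its wall-clock execution time."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.perf_counter() - start

for plugin in PLUGINS:
    t_vol = timed_run(["vol.py", "-f", DUMP, plugin])
    t_rek = timed_run(["rekall", "-f", DUMP, plugin])
    print(f"{plugin}: volatility {t_vol:.2f}s, rekall {t_rek:.2f}s")
```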

2015-05-06
Ravindran, K., Mukhopadhyay, S., Sidhanta, S., Sabbir, A..  2014.  Managing shared contexts in distributed multi-player game systems. Communication Systems and Networks (COMSNETS), 2014 Sixth International Conference on. :1-8.

In this paper, we consider the impact of a weaker model of eventual consistency on distributed multi-player games. This model is suitable for networks in which hosts can leave and join at any time, e.g., in an intermittently connected environment. Such a consistency model is provided by the Secure Infrastructure for Networked Systems (SINS) [24], a reliable middleware framework. SINS allows agents to communicate asynchronously through a distributed transactional key-value store using anonymous publish-subscribe, and uses Lamport's Paxos protocol [17] to replicate state. We consider a multi-player maze game as an example to illustrate our consistency model and the impact of network losses and delays. The SINS-based framework presented herein provides a vehicle for studying the effect of human elements participating in collaborative simulation of a physical world, as in war games.
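Eventual consistency of replicated game state can be illustrated with a last-writer-wins key-value replica ordered by Lamport timestamps: replicas may apply updates in any order, yet converge once they have seen the same set. A generic toy sketch, not the SINS/Paxos implementation:

```python
class Replica:
    """Last-writer-wins key-value replica ordered by (clock, node_id)."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = 0
        self.store = {}   # key -> (timestamp, value)

    def local_put(self, key, value):
        self.clock += 1
        update = (key, ((self.clock, self.node_id), value))
        self.apply(update)
        return update     # to be broadcast to the other replicas

    def apply(self, update):
        key, (ts, value) = update
        self.clock = max(self.clock, ts[0])          # Lamport clock merge
        if key not in self.store or ts > self.store[key][0]:
            self.store[key] = (ts, value)            # later writer wins

a, b = Replica("A"), Replica("B")
u1 = a.local_put("player1.pos", (3, 4))
u2 = b.local_put("player1.pos", (5, 6))
a.apply(u2); b.apply(u1)                             # deliver in any order
print(a.store == b.store)                            # True: converged
```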

Sayed, B., Traore, I..  2014.  Protection against Web 2.0 Client-Side Web Attacks Using Information Flow Control. Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. :261-268.

The dynamic nature of Web 2.0 and the heavy obfuscation of web-based attacks complicate the job of traditional protection systems such as firewalls, anti-virus solutions, and IDSs. Using ready-made toolkits, cyber-criminals can launch sophisticated attacks such as cross-site scripting (XSS), cross-site request forgery (CSRF), and botnets, to name a few. In recent years, cyber-criminals have targeted legitimate websites and social networks to inject malicious scripts that compromise the security of their visitors, performing actions through the victim's browser without his or her permission. This creates the need for effective mechanisms to protect against Web 2.0 attacks that mainly target the end-user. In this paper, we address these challenges from an information flow control perspective by developing a framework that restricts the flow of information on the client side to legitimate channels. The proposed model tracks sensitive information flow and prevents information leakage. When applied in the context of client-side web-based attacks, it is expected to provide a more secure browsing environment for the end-user.

Mokhtar, B., Eltoweissy, M..  2014.  Towards a Data Semantics Management System for Internet Traffic. New Technologies, Mobility and Security (NTMS), 2014 6th International Conference on. :1-5.

Although current Internet operations generate voluminous data, they remain largely oblivious of traffic data semantics. This poses many inefficiencies and challenges due to emergent or anomalous behavior impacting the vast array of Internet elements such as services and protocols. In this paper, we propose a Data Semantics Management System (DSMS) for learning Internet traffic data semantics to enable smarter semantics-driven networking operations. We extract networking semantics and build and utilize a dynamic ontology of network concepts to better recognize and act upon emergent or abnormal behavior. Our DSMS utilizes: (1) the Latent Dirichlet Allocation (LDA) algorithm for latent feature extraction and semantics reasoning; (2) big tables as a cloud-like data storage technique to maintain large-scale data; and (3) the Locality Sensitive Hashing (LSH) algorithm for reducing data dimensionality. Our preliminary evaluation using real Internet traffic shows the efficacy of DSMS for learning the behavior of normal and abnormal traffic data and for accurately detecting anomalies at low cost.
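The dimensionality-reduction step via LSH can be sketched with random-hyperplane signatures: each traffic feature vector is reduced to a short bit signature whose Hamming distance approximates the angular distance between the original vectors, so similar records land in similar buckets. A minimal NumPy sketch with illustrative dimensions (the paper's actual feature vectors would come from its LDA stage):

```python
import numpy as np

rng = np.random.default_rng(42)
DIM, BITS = 128, 16                        # feature dim -> 16-bit signature
planes = rng.normal(size=(BITS, DIM))      # random hyperplanes, drawn once

def lsh_signature(x):
    """One bit per hyperplane: which side of the plane x falls on."""
    return tuple((planes @ x) > 0)

base = rng.normal(size=DIM)
similar = base + 0.05 * rng.normal(size=DIM)      # near-duplicate record
different = rng.normal(size=DIM)                  # unrelated record

sig = lsh_signature(base)
ham = lambda a, b: sum(i != j for i, j in zip(a, b))
print(ham(sig, lsh_signature(similar)))    # small Hamming distance
print(ham(sig, lsh_signature(different)))  # about BITS/2 on average
```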
 

Schaefer, J..  2014.  A semantic self-management approach for service platforms. Network Operations and Management Symposium (NOMS), 2014 IEEE. :1-4.

Future personal living environments feature an increasing number of convenience-, health-, and security-related applications provided by distributed services, which do not only support users but also require tasks such as installation, configuration, and continuous administration. These tasks are becoming tiresome, complex, and error-prone. One way to escape this situation is to enable service platforms to configure and manage themselves. The approach presented here extends services with semantic descriptions to enable platform-independent autonomous service-level management using model-driven architecture and autonomic computing concepts. It has been implemented as an OSGi-based semantic autonomic manager, whose concept, prototypical implementation, and evaluation are presented.
 

Boruah, A., Hazarika, S.M..  2014.  An MEBN framework as a dynamic firewall's knowledge flow architecture. Signal Processing and Integrated Networks (SPIN), 2014 International Conference on. :249-254.

Dynamic firewalls with stateful inspection have added many security features over stateless traditional static filters, but they need to be adaptive. In this paper, we design a framework for dynamic firewalls based on a probabilistic ontology using Multi-Entity Bayesian Network (MEBN) logic. MEBN extends ordinary Bayesian networks to allow the representation of graphical models with repeated substructures, and can express a probability distribution over models of any consistent first-order theory. Our work is motivated by preventing novel attacks, i.e., attacks for which no signatures have yet been generated. The proposed framework has two major parts: a data-flow architecture, which extracts important connection-based features with the prime goal of explicit rule inclusion into the firewall's rule base; and a knowledge-flow architecture, which uses a semantic threat graph as well as reasoning under uncertainty to provide a futuristic threat-prevention technique in dynamic firewalls.
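The reasoning-under-uncertainty step can be shown in miniature: given a prior over "threat" and per-feature likelihoods for observed connection features, Bayes' rule yields the posterior probability that drives a rule decision. A deliberately simplified sketch with invented probabilities and feature names (full MEBN reasoning over first-order model fragments is far richer than this single-node example):

```python
# Invented parameters for a two-hypothesis model: threat vs. benign.
PRIOR = {"threat": 0.05, "benign": 0.95}
LIKELIHOOD = {           # P(feature | hypothesis)
    "threat": {"many_ports": 0.8, "odd_hours": 0.6},
    "benign": {"many_ports": 0.1, "odd_hours": 0.3},
}

def posterior(observed):
    """P(hypothesis | observed features), features assumed independent."""
    scores = {}
    for h, prior in PRIOR.items():
        p = prior
        for f in observed:
            p *= LIKELIHOOD[h][f]
        scores[h] = p
    z = sum(scores.values())
    return {h: p / z for h, p in scores.items()}

post = posterior(["many_ports", "odd_hours"])
print(post)                        # threat probability rises from 5% to ~46%
if post["threat"] > 0.4:           # threshold for explicit rule inclusion
    print("add blocking rule to the firewall rule base")
```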

2015-05-05
Lomotey, R.K., Deters, R..  2014.  Terms Mining in Document-Based NoSQL: Response to Unstructured Data. Big Data (BigData Congress), 2014 IEEE International Congress on. :661-668.

Unstructured data mining has become topical recently due to the availability of high-dimensional and voluminous digital content (known as "Big Data") across the enterprise spectrum. Relational Database Management Systems (RDBMS) have been employed over the past decades for content storage and management, but the ever-growing heterogeneity in today's data calls for a new storage approach. Thus, the NoSQL database has emerged as the preferred storage facility because it supports unstructured data storage. This creates the need for efficient data-mining techniques for such NoSQL systems, since the available tools and frameworks designed for RDBMS are often not directly applicable. In this paper, we focus on topic and term mining, based on clustering, in document-based NoSQL. This is achieved by adapting the architectural design of an analytics-as-a-service framework and applying the Viterbi algorithm to enhance the accuracy of term classification in the system. Results from pilot testing of our work show higher accuracy in comparison to some previously proposed techniques, such as parallel search.
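The Viterbi step finds the most likely sequence of hidden tags for a sequence of terms under a hidden Markov model. A compact generic implementation; the tag set and probabilities below are invented placeholders, not the paper's trained model:

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely hidden-state path for an observation sequence."""
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for o in obs[1:]:
        V.append({s: max(((V[-1][p][0] * trans_p[p][s] * emit_p[s][o], p)
                          for p in states), key=lambda t: t[0])
                  for s in states})
    # Backtrack from the best final state.
    state = max(states, key=lambda s: V[-1][s][0])
    path = [state]
    for col in reversed(V[1:]):
        state = col[state][1]
        path.append(state)
    return list(reversed(path))

states = ("TOPIC", "NOISE")
start = {"TOPIC": 0.4, "NOISE": 0.6}
trans = {"TOPIC": {"TOPIC": 0.7, "NOISE": 0.3},
         "NOISE": {"TOPIC": 0.2, "NOISE": 0.8}}
emit = {"TOPIC": {"nosql": 0.5, "the": 0.1, "mining": 0.4},
        "NOISE": {"nosql": 0.05, "the": 0.9, "mining": 0.05}}
print(viterbi(["the", "nosql", "mining"], states, start, trans, emit))
# ['NOISE', 'TOPIC', 'TOPIC']
```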

Babour, A., Khan, J.I..  2014.  Tweet Sentiment Analytics with Context Sensitive Tone-Word Lexicon. Web Intelligence (WI) and Intelligent Agent Technologies (IAT), 2014 IEEE/WIC/ACM International Joint Conferences on. 1:392-399.

In this paper we propose a Twitter sentiment analytics approach that mines opinion polarity about a given topic. Most current semantic sentiment analytics depends on polarity lexicons; however, many key tone words are frequently bipolar. We demonstrate a technique that accommodates the bipolarity of tone words through a context-sensitive tone-lexicon learning mechanism, where the context is modeled by the semantic neighborhood of the main target. Performance analysis shows that the ability to contextualize tone-word polarity significantly improves accuracy.

Eun Hee Ko, Klabjan, D..  2014.  Semantic Properties of Customer Sentiment in Tweets. Advanced Information Networking and Applications Workshops (WAINA), 2014 28th International Conference on. :657-663.

An increasing number of people are using online social networking services (SNSs), and a significant amount of information related to consumption experiences is shared in this new media form. Text mining is an emerging technique for mining useful information from the web. We aim at discovering semantic patterns in consumers' discussions on social media, in particular in tweets. Specifically, the purposes of this study are twofold: 1) finding the similarity and dissimilarity between two sets of textual documents that reflect consumers' sentiment polarities, i.e., positive vs. negative opinions, and 2) deriving actual content with a semantic trend from the textual data. The considered tweets include consumers' opinions on US retail companies (e.g., Amazon, Walmart). Cosine similarity and K-means clustering are used to achieve the former goal, and Latent Dirichlet Allocation (LDA), a popular topic modeling algorithm, is used for the latter. This is the first study to discover semantic properties of textual data in a consumption context beyond sentiment analysis. In addition to the major findings, we apply LDA to the same data and draw latent topics that represent consumers' positive and negative opinions on social media.
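The first goal (similarity between opinion corpora) reduces to vectorizing the documents and comparing them with cosine similarity, with K-means grouping the documents. A minimal scikit-learn sketch; the four stand-in tweets are invented for illustration:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity
from sklearn.cluster import KMeans

tweets = [
    "love the fast shipping from amazon",      # positive-leaning
    "amazon delivery was great and fast",      # positive-leaning
    "walmart checkout line was terrible",      # negative-leaning
    "terrible service at walmart today",       # negative-leaning
]
X = TfidfVectorizer().fit_transform(tweets)

# Pairwise cosine similarity between tweet vectors.
print(cosine_similarity(X[0], X[1]))   # high: shared vocabulary/polarity
print(cosine_similarity(X[0], X[2]))   # low: different vocabulary/polarity

# K-means separates the two polarities into two clusters.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(labels)                          # e.g., [0 0 1 1]
```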

Wenmin Xiao, Jianhua Sun, Hao Chen, Xianghua Xu.  2014.  Preventing Client Side XSS with Rewrite Based Dynamic Information Flow. Parallel Architectures, Algorithms and Programming (PAAP), 2014 Sixth International Symposium on. :238-243.

This paper presents the design and implementation of an information flow tracking framework, based on code rewriting, to prevent sensitive information leaks in browsers, combining the ideas of taint analysis and information flow analysis. Our system has two main processes. First, it abstracts the semantics of JavaScript code and converts it to a general intermediate representation based on the JavaScript abstract syntax tree. Second, the abstract intermediate representation is processed by a special taint engine to analyze tainted information flow. Our approach ensures fine-grained isolation for both the confidentiality and integrity of information. We have implemented a proof-of-concept prototype, named JSTFlow, and deployed it as a browser proxy that rewrites web applications at runtime. The experimental results show that JSTFlow can guarantee the security of sensitive data and detect XSS attacks with about 3x performance overhead. Because it does not require any modifications to the target system, our system is readily deployable in practice.
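The taint-propagation idea can be shown with a tiny interpreter-level sketch: values read from sensitive sources carry a taint flag, the flag survives string operations, and sinks refuse tainted data. A toy Python analogue of what the JavaScript rewriting achieves, with invented source and sink names:

```python
class Tainted(str):
    """A string subclass that marks and propagates sensitivity."""
    def __add__(self, other):
        return Tainted(str(self) + str(other))    # taint survives concat
    def __radd__(self, other):
        return Tainted(str(other) + str(self))

def read_cookie():
    return Tainted("session=abc123")              # sensitive source

def send_to_network(value, url):
    if isinstance(value, Tainted):                # sink-side policy check
        raise PermissionError(f"blocked tainted flow to {url}")
    print(f"sent to {url}")

payload = "stolen=" + read_cookie()               # taint propagates
send_to_network("hello", "https://example.com")   # allowed
try:
    send_to_network(payload, "https://evil.example")
except PermissionError as e:
    print(e)                                      # leak is blocked
```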
 

Petullo, W.M., Wenyuan Fei, Solworth, J.A., Gavlin, P..  2014.  Ethos' Deeply Integrated Distributed Types. Security and Privacy Workshops (SPW), 2014 IEEE. :167-180.

Programming languages have long incorporated type safety, increasing their level of abstraction and thus aiding programmers. Type safety eliminates whole classes of security-sensitive bugs, replacing the tedious and error-prone search for such bugs in each application with verifying the correctness of the type system. Despite their benefits, these protections often end at the process boundary: type safety holds within a program but usually does not extend to the file system or to communication with other programs. Existing operating system approaches to bridge this gap require the use of a single programming language or common language runtime. We describe the deep integration of type safety in Ethos, a clean-slate operating system which requires that all program input and output satisfy a recognizer before applications are permitted to further process it. Ethos types are multilingual and runtime-agnostic, and each has an automatically generated unique type identifier. Ethos bridges the type-safety gap between programs by (1) providing a convenient mechanism for specifying the types each program may produce or consume, (2) ensuring that each type has a single, distributed-system-wide recognizer implementation, and (3) inescapably enforcing these type constraints.
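The "recognize before processing" discipline can be sketched as a small runtime check: every inbound byte string must pass a type recognizer before application logic may touch it. A minimal Python analogue with an invented message type and fields (Ethos generates its recognizers automatically from type declarations rather than hand-writing them like this):

```python
import json

def recognize_chat_message(raw: bytes) -> dict:
    """Accept input only if it is exactly a {user: str, text: str} record."""
    try:
        obj = json.loads(raw)
    except ValueError:
        raise TypeError("input rejected: not well-formed")
    if (not isinstance(obj, dict) or set(obj) != {"user", "text"}
            or not all(isinstance(obj[k], str) for k in ("user", "text"))):
        raise TypeError("input rejected: does not match ChatMessage type")
    return obj                        # application code only ever sees this

print(recognize_chat_message(b'{"user": "alice", "text": "hi"}'))
try:
    recognize_chat_message(b'{"user": "alice", "exec": "rm -rf /"}')
except TypeError as e:
    print(e)                          # rejected before the app can see it
```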
 

2015-05-04
Gimenez, A., Gamblin, T., Rountree, B., Bhatele, A., Jusufi, I., Bremer, P.-T., Hamann, B..  2014.  Dissecting On-Node Memory Access Performance: A Semantic Approach. High Performance Computing, Networking, Storage and Analysis, SC14: International Conference for. :166-176.

Optimizing memory access is critical for performance and power efficiency. CPU manufacturers have developed sampling-based performance measurement units (PMUs) that report precise costs of memory accesses at specific addresses. However, this data is too low-level to be meaningfully interpreted and contains an excessive amount of irrelevant or uninteresting information. We have developed a method to gather fine-grained memory access performance data for specific data objects and regions of code with low overhead and attribute semantic information to the sampled memory accesses. This information provides the context necessary to more effectively interpret the data. We have developed a tool that performs this sampling and attribution and used the tool to discover and diagnose performance problems in real-world applications. Our techniques provide useful insight into the memory behaviour of applications and allow programmers to understand the performance ramifications of key design decisions: domain decomposition, multi-threading, and data motion within distributed memory systems.
 

Khosmood, F., Nico, P.L., Woolery, J..  2014.  User identification through command history analysis. Computational Intelligence in Cyber Security (CICS), 2014 IEEE Symposium on. :1-7.

As any veteran of the editor wars can attest, Unix users can be fiercely and irrationally attached to the commands they use and the manner in which they use them. In this work, we investigate the problem of identifying users out of a large set of candidates (25-97) through their command-line histories. Using standard algorithms and feature sets inspired by the natural language authorship attribution literature, we demonstrate conclusively that individual users can be identified with a high degree of accuracy through their command-line behavior. Further, we report on the best-performing feature combinations, from the many thousands that are possible, in terms of both accuracy and generality. We validate our work by experimenting on three user corpora comprising data gathered over three decades at three distinct locations: the Greenberg user profile corpus (168 users), the Schonlau masquerading corpus (50 users), and the Cal Poly command history corpus (97 users). The first two are well-known corpora published in 1991 and 2001, respectively; the last was developed by the authors in a year-long study in 2014 and represents the most recent corpus of its kind. For a 50-user configuration, we find feature sets that can successfully identify users with over 90% accuracy on the Cal Poly, Greenberg, and one variant of the Schonlau corpus, and over 87% on the other Schonlau variant.
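The attribution pipeline is standard supervised text classification over command-history features. A minimal sketch using token n-grams of command lines and Naive Bayes; the training histories below are stand-ins, and the paper's actual feature sets are considerably richer:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Stand-in training data: one "document" of shell history per user.
histories = [
    "ls -la ; git status ; git commit -m fix ; vim main.c",
    "ls ; cd src ; emacs app.py ; python app.py ; ls",
]
users = ["alice", "bob"]

# Uni- and bi-grams of whitespace tokens approximate command-usage habits.
clf = make_pipeline(
    CountVectorizer(ngram_range=(1, 2), token_pattern=r"\S+"),
    MultinomialNB(),
).fit(histories, users)

print(clf.predict(["git status ; vim notes.c ; ls -la"]))  # ['alice']
```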

Jantsch, A., Tammemae, K..  2014.  A framework of awareness for artificial subjects. Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2014 International Conference on. :1-3.

A small battery-driven bio-patch, attached to the human body and monitoring various vital signals such as temperature, humidity, heart activity, and muscle and brain activity, is an example of a highly resource-constrained system with the demanding task of correctly assessing both the state of the monitored subject (healthy, normal, weak, ill, improving, worsening, etc.) and its own capabilities (attached to subject, working sensors, sufficient energy supply, etc.). These and many other systems would benefit from a sense of themselves and their environment to improve the robustness and sensibility of their behavior. Although we can draw inspiration from fields like neuroscience, robotics, AI, and control theory, the tight resource and energy constraints imply that we have to understand accurately which technique leads to a particular feature of awareness, how it contributes to improved behavior, and how it can be implemented cost-efficiently in hardware or software. We review the concepts of environment- and self-models, semantic interpretation, semantic attribution, history, goals and expectations, prediction, and self-inspection, and discuss how they contribute to awareness and self-awareness and to improved robustness and sensibility of behavior.