Biblio

Found 165 results

Filters: Keyword is natural language processing
2018-06-11
Saleh, C., Mohsen, M..  2017.  FBG security fence for intrusion detection. 2017 International Conference on Engineering MIS (ICEMIS). :1–5.

The following topics are dealt with: feature extraction; data mining; support vector machines; mobile computing; photovoltaic power systems; mean square error methods; fault diagnosis; natural language processing; control system synthesis; and Internet of Things.

2018-06-07
Jiao, X., Luo, M., Lin, J. H., Gupta, R. K..  2017.  An assessment of vulnerability of hardware neural networks to dynamic voltage and temperature variations. 2017 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). :945–950.

As a problem solving method, neural networks have shown broad applicability across medical applications, speech recognition, and natural language processing. This success has even led to implementation of neural network algorithms in hardware. In this paper, we explore two questions: (a) to what extent microelectronic variations affect the quality of results produced by neural networks; and (b) whether the answer to the first question represents an opportunity to optimize the implementation of neural network algorithms. Regarding the first question, variations are now increasingly common in aggressive process nodes and typically manifest as an increased frequency of timing errors. Combating variations, whether due to process or operating conditions, usually results in increased guardbands in circuit and architectural design, thus reducing the gains from process technology advances. Given the inherent resilience of neural networks due to adaptation of their learning parameters, one would expect the quality of results produced by neural networks to be relatively insensitive to the rising timing error rates caused by increased variations. On the contrary, using two frequently used neural networks (MLP and CNN), our results show that variations can significantly affect the inference accuracy. This paper outlines our assessment methodology and use of a cross-layer evaluation approach that extracts hardware-level errors from twenty different operating conditions and then injects such errors back into the software layer in an attempt to answer the second question posed above.

2018-02-27
Bours, P., Brahmanpally, S..  2017.  Language Dependent Challenge-Based Keystroke Dynamics. 2017 International Carnahan Conference on Security Technology (ICCST). :1–6.

Keystroke Dynamics can be used as an unobtrusive method to enhance password authentication, by checking the typing rhythm of the user. Fixed passwords give an attacker the possibility to try to learn to mimic the typing behaviour of a victim. In this paper we investigate the performance of a keystroke dynamics (KD) system when the users have to type given (English) words. Under the assumption that it is easy to type words in your native language and difficult in a foreign language, we also test the performance of such a challenge-based KD system when the challenges are not common English words, but words in the native language of the user. We collected data from participants with 6 different native language backgrounds and had them type random 8-12 character words in each of the 6 languages. The participants also typed random English words and random French words. English was assumed to be a language familiar to all participants, while French was not a native language of any participant and most participants were most likely not fluent in French. Analysis showed that using language dependent words gave a better performance of the challenge-based KD system compared to an all-English challenge-based system. When words in a native language were used, participants whose mother tongue was that language performed similarly to the all-English challenge-based system, but the non-native speakers had an FMR that was significantly lower than that of the native speakers. We found that native Telugu speakers had an FMR of less than 1% when writing Spanish or Slovak words. We also found that duration features were best to recognize genuine users, but latency features performed best to recognize non-native impostor users.

2018-02-06
Pappu, Aasish, Blanco, Roi, Mehdad, Yashar, Stent, Amanda, Thadani, Kapil.  2017.  Lightweight Multilingual Entity Extraction and Linking. Proceedings of the Tenth ACM International Conference on Web Search and Data Mining. :365–374.

Text analytics systems often rely heavily on detecting and linking entity mentions in documents to knowledge bases for downstream applications such as sentiment analysis, question answering and recommender systems. A major challenge for this task is to be able to accurately detect entities in new languages with limited labeled resources. In this paper we present an accurate and lightweight, multilingual named entity recognition (NER) and linking (NEL) system. The contributions of this paper are three-fold: 1) Lightweight named entity recognition with competitive accuracy; 2) Candidate entity retrieval that uses search click-log data and entity embeddings to achieve high precision with a low memory footprint; and 3) efficient entity disambiguation. Our system achieves state-of-the-art performance on TAC KBP 2013 multilingual data and on English AIDA CONLL data.

2018-01-10
Wang, S., Yan, Q., Chen, Z., Yang, B., Zhao, C., Conti, M..  2017.  TextDroid: Semantics-based detection of mobile malware using network flows. 2017 IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS). :18–23.

The wide-spreading mobile malware has become a dreadful issue in the increasingly popular mobile networks. Most of the mobile malware relies on network interface to coordinate operations, steal users' private information, and launch attack activities. In this paper, we propose TextDroid, an effective and automated malware detection method combining natural language processing and machine learning. TextDroid can extract distinguishable features (n-gram sequences) to characterize malware samples. A malware detection model is then developed to detect mobile malware using a Support Vector Machine (SVM) classifier. The trained SVM model presents a superior performance on two different data sets, with the malware detection rate reaching 96.36% in the test set and 76.99% in an app set captured in the wild, respectively. In addition, we also design a flow header visualization method to visualize the highlighted texts generated during the apps' network interactions, which assists security researchers in understanding the apps' complex network activities.
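To illustrate the kind of n-gram features the TextDroid abstract mentions, here is a minimal sketch. It assumes word-level n-grams over the text of a single network flow; the abstract does not say how the flows are tokenized, and the function name `word_ngrams` and the sample flow are hypothetical. The resulting counts could then be fed to a classifier such as an SVM.

```python
from collections import Counter


def word_ngrams(text, n=2):
    """Count word-level n-grams in the text of one network flow.

    Assumption: whitespace tokenization and lowercasing; the paper's
    actual preprocessing is not described in the abstract.
    """
    tokens = text.lower().split()
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


# Hypothetical flow text, for illustration only.
flow_text = "GET /ads/track?id=42 HTTP/1.1 Host: example.com"
features = word_ngrams(flow_text, n=2)
```

Each distinct n-gram becomes one feature dimension, with its count as the value.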

Buber, E., Dırı, B., Sahingoz, O. K..  2017.  Detecting phishing attacks from URL by using NLP techniques. 2017 International Conference on Computer Science and Engineering (UBMK). :337–342.

Nowadays, cyber attacks affect many institutions and individuals and cause them serious financial loss. Phishing is one of the most common types of cyber attack; it aims to exploit people's weaknesses to obtain confidential information about them. This type of cyber attack threatens almost all internet users and institutions. To reduce the financial loss caused by this type of attack, there is a need for user awareness as well as applications able to detect it. In the last quarter of 2016, Turkey appears to be second behind China with an impact rate of approximately 43% in the Phishing Attack Analysis report among 45 countries. In this study, the characteristics of this type of attack are first explained, and then a machine learning based system is proposed to detect it. In the proposed system, some features were extracted by using Natural Language Processing (NLP) techniques. The system examines URLs used in phishing attacks before they are opened, using the extracted features. Many tests have been applied to the created system, and the best algorithm among the tested ones is the Random Forest algorithm, with a success rate of 89.9%.

Thaler, S., Menkonvski, V., Petkovic, M..  2017.  Towards a neural language model for signature extraction from forensic logs. 2017 5th International Symposium on Digital Forensic and Security (ISDFS). :1–6.

Signature extraction is a critical preprocessing step in forensic log analysis because it enables sophisticated analysis techniques to be applied to logs. Currently, most signature extraction frameworks use either rule-based approaches or hand-crafted algorithms. Rule-based systems are error-prone and require high maintenance effort. Hand-crafted algorithms use heuristics and tend to work well only for specialized use cases. In this paper we present a novel approach to extract signatures from forensic logs that is based on a neural language model. This language model learns to identify mutable and non-mutable parts in a log message. We use this information to extract signatures. Neural language models have been shown to work extremely well for learning complex relationships in natural language text. We experimentally demonstrate that our model can detect which parts are mutable with an accuracy of 86.4%. We also show how extracted signatures can be used for clustering log lines.

Barreira, R., Pinheiro, V., Furtado, V..  2017.  A framework for digital forensics analysis based on semantic role labeling. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :66–71.

This article describes a framework for semantic annotation of texts that are submitted for forensic analysis, based on Frame Semantics, and a knowledge base of Forensic Frames, FrameFOR. We demonstrate through experimental evaluations that the application of Semantic Role Labeling (SRL) techniques and Natural Language Processing (NLP) in digital forensics increases the performance of forensic experts in terms of agility, precision and recall.

Devyatkin, D., Smirnov, I., Ananyeva, M., Kobozeva, M., Chepovskiy, A., Solovyev, F..  2017.  Exploring linguistic features for extremist texts detection (on the material of Russian-speaking illegal texts). 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :188–190.

In this paper we present results of research on automatic extremist text detection. For this purpose an experimental dataset in the Russian language was created. According to Russian legislation we cannot make it publicly available. We compared various classification methods (multinomial naive Bayes, logistic regression, linear SVM, random forest, and gradient boosting) and evaluated the contribution of differentiating features (lexical, semantic and psycholinguistic) to classification quality. The results of experiments show that psycholinguistic and semantic features are promising for extremist text detection.

Bhattacharjee, S. Das, Talukder, A., Al-Shaer, E., Doshi, P..  2017.  Prioritized active learning for malicious URL detection using weighted text-based features. 2017 IEEE International Conference on Intelligence and Security Informatics (ISI). :107–112.

Data analytics is being increasingly used in cyber-security problems, and has been found useful in cases where data volumes and heterogeneity make manual assessment by security experts cumbersome. In practical cyber-security scenarios involving data-driven analytics, obtaining data with annotations (i.e. ground-truth labels) is a challenging and known limiting factor for many supervised security analytics tasks. Significant portions of the large datasets typically remain unlabelled, as the task of annotation is extensively manual and requires a huge amount of expert intervention. In this paper, we propose an effective active learning approach that can efficiently address this limitation in a practical cyber-security problem of Phishing categorization, whereby we use a human-machine collaborative approach to design a semi-supervised solution. An initial classifier is learnt on a small amount of the annotated data and is then gradually updated in an iterative manner by shortlisting, from the large pool of unlabelled data, only the relevant samples that are most likely to influence the classifier performance quickly. Prioritized Active Learning shows significant promise for achieving faster convergence in classification performance in a batch learning framework, thus requiring even less effort for human annotation. A useful feature weight update technique combined with active learning shows promising classification performance for categorizing Phishing/malicious URLs without requiring a large amount of annotated training samples to be available during training. In experiments with several collections of PhishMonger's Targeted Brand dataset, the proposed method shows significant improvement over the baseline by as much as 12%.

Zheng, Y., Shi, Y., Guo, K., Li, W., Zhu, L..  2017.  Enhanced word embedding with multiple prototypes. 2017 4th International Conference on Industrial Economics System and Industrial Security Engineering (IEIS). :1–5.

Word embedding is one of the basic word representation methods in natural language processing; it maps a word into a dense real-valued vector space based on a hypothesis: words with similar contexts have similar meanings. Models like NNLM, C&W, CBOW, and Skip-gram have been designed for learning word embeddings and are widely used in many NLP tasks. However, these models assume that each word has only one meaning, which is contrary to real language rules. In this paper we propose a new word unit with multiple meanings and an algorithm to distinguish them by their context. This new unit can be embedded in most language models and obtain a series of efficient representations by learning variable embeddings. We evaluate a new model, MCBOW, that integrates CBOW with our word unit on a word similarity evaluation task and some downstream experiments; the results indicate that our new model can learn different meanings of a word and achieve better results on some other tasks.

Meltsov, V. Y., Lesnikov, V. A., Dolzhenkova, M. L..  2017.  Intelligent system of knowledge control with the natural language user interface. 2017 International Conference "Quality Management,Transport and Information Security, Information Technologies" (IT QM IS). :671–675.

The paper considers the possibility and necessity of using, in modern control and training systems with a natural language interface, methods and mechanisms characteristic of knowledge processing systems. This symbiosis assumes the introduction of specialized inference machines into the testing systems. For the effective operation of such an intelligent interpreter, it is necessary to “translate” the user's answers into one of the known forms of knowledge representation, for example, into the expressions (rules) of the first-order predicate calculus. A lexical processor, performing morphological, syntactic and semantic analysis, solves this task. To simplify further work with the rules, the Skolem transformation is used, which makes it possible to eliminate quantifiers and to present semantic structures in the form of sequents (clauses, disjuncts). The basic principles of operation of the inference machine are described, which is the main component of the developed intellectual subsystem. To improve the performance of the machine, one of the fastest methods was chosen: a parallel method of deductive inference based on the division of clauses. The parallelism inherent in the method, and the use of the dataflow architecture, allow parallel computations in the output machine to be implemented without additional effort on the part of the programmer. All this makes it possible to reduce the time for comparing the sequences stored in the knowledge base by several times as compared to traditional inference mechanisms that implement various versions of the principle of resolutions. Formulas and features of the technique of numerical estimation of the user's answers are given.
In general, the development of human-computer dialogue capabilities in test systems, through the development of a specialized module for processing knowledge, will increase the intelligence of such systems and allow us to directly consider the semantics of sentences, more accurately determine the relevance of the user's response to standard knowledge and, ultimately, get rid of the skeptical attitude of many managers to machine testing systems.

Alzhrani, K., Rudd, E. M., Chow, C. E., Boult, T. E..  2017.  Automated U.S diplomatic cables security classification: Topic model pruning vs. classification based on clusters. 2017 IEEE International Symposium on Technologies for Homeland Security (HST). :1–6.

The U.S. Government has been the target of cyberattacks from all over the world. Just recently, former President Obama accused the Russian government of leaking emails to Wikileaks and declared that the U.S. might be forced to respond. While Russia denied involvement, it is clear that the U.S. has to take some defensive measures to protect its data infrastructure. Insider threats have been the cause of other sensitive information leaks too, including the infamous Edward Snowden incident. Most of the recent leaks were in the form of text. Due to the nature of text data, security classifications are assigned manually. In an adversarial environment, insiders can leak texts through e-mail, printers, or any untrusted channels. The optimal defense is to automatically detect the unstructured text security class and enforce the appropriate protection mechanism without degrading services or daily tasks. Unfortunately, existing Data Leak Prevention (DLP) systems are not well suited for detecting unstructured texts. In this paper, we compare two recent approaches in the literature for text security classification, evaluating them on actual sensitive text data from the WikiLeaks dataset.

Gupta, P., Goswami, A., Koul, S., Sartape, K..  2017.  IQS-intelligent querying system using natural language processing. 2017 International conference of Electronics, Communication and Aerospace Technology (ICECA). 2:410–413.

Modern databases contain an enormous amount of information stored in a structured format. This information is processed to acquire knowledge. However, the process of information extraction from a Database System is cumbersome for non-expert users as it requires an extensive knowledge of DBMS languages. Therefore, an inevitable need arises to bridge the gap between user requirements and the provision of a simple information retrieval system whereby the role of a specialized Database Administrator is annulled. In this paper, we propose a methodology for building an Intelligent Querying System (IQS) by which users can fire queries in their own (natural) language. The system first parses the input sentences and then generates SQL queries from the natural language expressions of the input. These queries are in turn mapped with the desired information to generate the required output. Hence, it makes the information retrieval process simple, effective and reliable.

2017-12-12
Zhou, G., Huang, J. X..  2017.  Modeling and Learning Distributed Word Representation with Metadata for Question Retrieval. IEEE Transactions on Knowledge and Data Engineering. 29:1226–1239.

Community question answering (cQA) has become an important issue due to the popularity of cQA archives on the Web. This paper focuses on addressing the lexical gap problem in question retrieval. Question retrieval in cQA archives aims to find the existing questions that are semantically equivalent or relevant to the queried questions. However, the lexical gap problem brings a new challenge for question retrieval in cQA. In this paper, we propose to model and learn distributed word representations with metadata of category information within cQA pages for question retrieval using two novel category powered models. One is a basic category powered model called MB-NET and the other is an enhanced category powered model called ME-NET, which can better learn the distributed word representations and alleviate the lexical gap problem. To deal with the variable size of word representation vectors, we employ the Fisher kernel framework to transform them into fixed-length vectors. Experimental results on large-scale English and Chinese cQA data sets show that our proposed approaches can significantly outperform state-of-the-art retrieval models for question retrieval in cQA. Moreover, we further evaluate our approaches in large-scale automatic evaluation experiments. The evaluation results show that promising and significant performance improvements can be achieved.

Ktob, A., Li, Z..  2017.  The Arabic Knowledge Graph: Opportunities and Challenges. 2017 IEEE 11th International Conference on Semantic Computing (ICSC). :48–52.

The Semantic Web has brought forth the idea of computing with knowledge, hence attributing the ability of thinking to machines. Knowledge Graphs represent a major advancement in the construction of the Web of Data, where machines are context-aware when answering users' queries. The English Knowledge Graph was a milestone realized by Google in 2012. Even though it is a useful source of information for English users and applications, it does not offer much for Arabic users and applications. In this paper, we investigate the different challenges and opportunities arising in the life cycle of the construction of the Arabic Knowledge Graph (AKG) while following some best practices and techniques. Additionally, this work suggests some potential solutions to these challenges. The proprietary nature of much data is a major obstacle to harvesting it. Moreover, when Arabic data is openly available, it is generally in an unstructured form which requires further processing. The complexity of the Arabic language itself creates a further problem for any automatic or semi-automatic extraction process. Therefore, the usage of NLP techniques is a feasible solution. Some preliminary results are presented later in this paper. The AKG has very promising outcomes for the Semantic Web in general and the Arabic community in particular. The goal of the Arabic Knowledge Graph is mainly the integration of the different isolated datasets available on the Web. Later, it can be used in both the academic sector (by providing a large dataset for many different research fields and enhancing discovery) and the commercial sector (by improving search engines, providing metadata, interlinking businesses).

2017-11-03
Park, A. J., Beck, B., Fletche, D., Lam, P., Tsang, H. H..  2016.  Temporal analysis of radical dark web forum users. 2016 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM). :880–883.

Extremist groups have turned to the Internet and social media sites as a means of sharing information amongst one another. This research study analyzes forum posts and finds people who show radical tendencies through the use of natural language processing and sentiment analysis. The forum data being used are from six Islamic forums on the Dark Web which are made available for security research. This research project uses a POS tagger to isolate keywords and nouns that can be utilized with the sentiment analysis program. Then the sentiment analysis program determines the polarity of the post. The post is scored as either positive or negative. These scores are then divided into monthly radical scores for each user. Once these time clusters are mapped, the change in opinions of the users over time may be interpreted as rising or falling levels of radicalism. Each user is then compared on a timeline to other radical users and events to determine possible connections or relationships. The ability to analyze a forum for an overall change in attitude can be an indicator of unrest and possible radical actions or terrorism.
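
The monthly scoring step described in this abstract could be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes each post already carries a polarity of +1 or -1 from a sentiment analyzer, and that a user's monthly score is the sum of those polarities. The function name `monthly_scores` and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def monthly_scores(posts):
    """Sum post polarities per (user, year, month).

    posts: iterable of (user, posting_date, polarity) tuples, where
    polarity is assumed to be +1 (positive) or -1 (negative).
    """
    totals = defaultdict(int)
    for user, day, polarity in posts:
        totals[(user, day.year, day.month)] += polarity
    return dict(totals)


# Hypothetical posts, for illustration only.
posts = [
    ("u1", date(2016, 1, 3), -1),
    ("u1", date(2016, 1, 20), -1),
    ("u1", date(2016, 2, 1), 1),
    ("u2", date(2016, 1, 5), 1),
]
scores = monthly_scores(posts)
```

Plotting each user's scores in month order would give the kind of timeline the abstract describes.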
2017-09-19
Sun, Bo, Fujino, Akinori, Mori, Tatsuya.  2016.  POSTER: Toward Automating the Generation of Malware Analysis Reports Using the Sandbox Logs. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :1814–1816.

In recent years, the number of new examples of malware has continued to increase. To create effective countermeasures, security specialists often must manually inspect vast sandbox logs produced by the dynamic analysis method. Conversely, antivirus vendors usually publish malware analysis reports on their website. Because malware analysis reports and sandbox logs do not have direct connections, when analyzing sandbox logs, security specialists can not benefit from the information described in such expert reports. To address this issue, we developed a system called ReGenerator that automates the generation of reports related to sandbox logs by making use of existing reports published by antivirus vendors. Our system combines several techniques, including the Jaccard similarity, Natural Language Processing (NLP), and Generation (NLG), to produce concise human-readable reports describing malicious behavior for security specialists.
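
The Jaccard similarity named in this abstract is the size of the intersection of two sets divided by the size of their union. A minimal sketch follows; the idea of comparing sets of behavior tokens from a sandbox log and a vendor report is an assumption made for illustration, as are the token names.

```python
def jaccard(a, b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| of two collections.

    Two empty collections are treated as identical (similarity 1.0),
    a convention chosen here to avoid dividing by zero.
    """
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Hypothetical token sets, for illustration only.
log_tokens = {"CreateFile", "RegSetValue", "connect"}
report_tokens = {"CreateFile", "connect", "WriteFile"}
similarity = jaccard(log_tokens, report_tokens)
```

Here the two sets share two of four distinct tokens, so the similarity is 0.5.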

Plachkov, Alex, Abielmona, Rami, Harb, Moufid, Falcon, Rafael, Inkpen, Diana, Groza, Voicu, Petriu, Emil.  2016.  Automatic Course of Action Generation Using Soft Data for Maritime Domain Awareness. Proceedings of the 2016 on Genetic and Evolutionary Computation Conference Companion. :1071–1078.

Information Fusion (IF) systems have long exploited data provided by hard (physics-based) sensors with the aspiration of making sense of the environment they are monitoring. In recent times, the IF community has recognized the potential of utilizing data generated by people, also known as soft data. In this study, we demonstrate how course of action (CoA) generation, one of the key elements of Level 3 High-Level Information Fusion and a vital component for security and defense decision support systems, can be augmented using soft (human-derived) data for improved mission effectiveness. This conceptualization is validated through an elaborate experiment situated in the maritime world. To the best of the authors' knowledge, this is the first study to apply soft data to automatic CoA generation in the maritime domain.

Mercado, Iván Tactuk, Munaiah, Nuthan, Meneely, Andrew.  2016.  The Impact of Cross-platform Development Approaches for Mobile Applications from the User's Perspective. Proceedings of the International Workshop on App Market Analytics. :43–49.

Mobile app developers today have a hard decision to make: to independently develop native apps for different operating systems or to develop an app that is cross-platform compatible. The availability of different tools and approaches to support cross-platform app development only makes the decision harder. In this study, we used user reviews of apps to empirically understand the relationship (if any) between the approach used in the development of an app and its perceived quality. We used Natural Language Processing (NLP) models to classify 787,228 user reviews of the Android version and iOS version of 50 apps as complaints in one of four quality concerns: performance, usability, security, and reliability. We found that hybrid apps (on both Android and iOS platforms) tend to be more prone to user complaints than interpreted/generated apps. In a study of Facebook, an app that underwent a change in development approach from hybrid to native, we found that change in the development approach was accompanied by a reduction in user complaints about performance and reliability.

Su, Jiawei, Yoshioka, Katsunari, Shikata, Junji, Matsumoto, Tsutomu.  2016.  An Efficient Method for Detecting Obfuscated Suspicious JavaScript Based on Text Pattern Analysis. Proceedings of the 2016 ACM International on Workshop on Traffic Measurements for Cybersecurity. :3–11.

Malicious JavaScript is a common springboard for attackers to launch several types of network attacks, such as Drive-by-Download and malicious PDF delivery attacks. In order to elude detection by signature matching, malicious JavaScript is often packed (so-called "obfuscation") with diversified algorithms; the occurrence of obfuscation is therefore always a good pointer to potential maliciousness. In this investigation, we propose a lightweight approach for quickly filtering obfuscated JavaScript by a novel method of tokenizing JavaScript text at letter level and applying information-theoretic measures, based on previous work in the domain of detecting obfuscated malicious code as well as the pattern analysis of natural languages. The new approach is time efficient compared to existing systems since it processes far fewer objects, and it is also shown to reach acceptable detection accuracy.
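
One information-theoretic measure that can be computed over letter-level tokens is Shannon entropy. The sketch below is an assumption made for illustration: the abstract does not say which measures the authors use, and the function name `char_entropy` is hypothetical.

```python
import math
from collections import Counter


def char_entropy(text):
    """Shannon entropy, in bits per character, of a string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


uniform = char_entropy("aaaa")   # a single repeated symbol
balanced = char_entropy("abab")  # two equally frequent symbols
```

Packed or encoded scripts often have a different character distribution from hand-written code, so a measure like this can serve as one input to a filter; any threshold would have to be calibrated on real data.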

Zhu, Ziyun, Dumitras, Tudor.  2016.  FeatureSmith: Automatically Engineering Features for Malware Detection by Mining the Security Literature. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security. :767–778.

Malware detection increasingly relies on machine learning techniques, which utilize multiple features to separate the malware from the benign apps. The effectiveness of these techniques primarily depends on the manual feature engineering process, based on human knowledge and intuition. However, given the adversaries' efforts to evade detection and the growing volume of publications on malware behaviors, the feature engineering process likely draws from a fraction of the relevant knowledge. We propose an end-to-end approach for automatic feature engineering. We describe techniques for mining documents written in natural language (e.g. scientific papers) and for representing and querying the knowledge about malware in a way that mirrors the human feature engineering process. Specifically, we first identify abstract behaviors that are associated with malware, and then we map these behaviors to concrete features that can be tested experimentally. We implement these ideas in a system called FeatureSmith, which generates a feature set for detecting Android malware. We train a classifier using these features on a large data set of benign and malicious apps. This classifier achieves a 92.5% true positive rate with only 1% false positives, which is comparable to the performance of a state-of-the-art Android malware detector that relies on manually engineered features. In addition, FeatureSmith is able to suggest informative features that are absent from the manually engineered set and to link the features generated to abstract concepts that describe malware behaviors.

Xie, Tao, Enck, William.  2016.  Text Analytics for Security: Tutorial. Proceedings of the Symposium and Bootcamp on the Science of Security. :124–125.

Computing systems that make security decisions often fail to take into account human expectations. This failure occurs because human expectations are typically drawn from textual sources (e.g., mobile application descriptions and requirements documents) and are hard to extract and codify. Recently, researchers in security and software engineering have begun using text analytics to create initial models of human expectation. In this tutorial, we provide an introduction to popular techniques and tools of natural language processing (NLP) and text mining, and share our experiences in applying text analytics to security problems. We also highlight the current challenges of applying these techniques and tools for addressing security problems. We conclude the tutorial with a discussion of future research directions.

Tromer, Eran, Schuster, Roei.  2016.  DroidDisintegrator: Intra-Application Information Flow Control in Android Apps. Proceedings of the 11th ACM on Asia Conference on Computer and Communications Security. :401–412.

In mobile platforms and their app markets, controlling app permissions and preventing abuse of private information are crucial challenges. Information Flow Control (IFC) is a powerful approach for formalizing and answering user concerns such as: "Does this app send my geolocation to the Internet?" Yet despite intensive research efforts, IFC has not been widely adopted in mainstream programming practice. We observe that the typical structure of Android apps offers an opportunity for a novel and effective application of IFC. In Android, an app consists of a collection of a few dozen "components", each in charge of some high-level functionality. Most components do not require access to most resources. These components are a natural and effective granularity at which to apply IFC (as opposed to the typical process-level or language-level granularity). By assigning different permission labels to each component, and limiting information flow between components, it is possible to express and enforce IFC constraints. Yet nuances of the Android platform, such as its multitude of discretionary (and somewhat arcane) communication channels, raise challenges in defining and enforcing component boundaries. We build a system, DroidDisintegrator, which demonstrates the viability of component-level IFC for expressing and controlling app behavior. DroidDisintegrator uses dynamic analysis to generate IFC policies for Android apps, repackages apps to embed these policies, and enforces the policies at runtime. We evaluate DroidDisintegrator on dozens of apps.

Rauter, Tobias, Höller, Andrea, Iber, Johannes, Kreiner, Christian.  2016.  Static and Dynamic Integrity Properties Patterns. Proceedings of the 21st European Conference on Pattern Languages of Programs. :14:1–14:11.

Integrity is a crucial property in current computing systems. Due to natural or human-made (malicious and non-malicious) faults this property can be violated. Therefore, many methodologies and patterns that check or verify the integrity of systems or data have been introduced. However, integrity as a property cannot be identified directly. Existing methodologies tackle this problem by identifying other, computable, properties of the system and using a policy that describes how these properties reflect the integrity of the overall system. It is thus a critical task to select the right properties that reflect the integrity of a system in such a way that given integrity requirements are met. To ease this process, we introduce two new patterns, Static Integrity Properties and Dynamic Integrity Properties, to classify the properties. Static Integrity Properties are used to ensure the integrity of a component prior to its use (e.g., the integrity of an executable binary), while Dynamic Integrity Properties are used to ensure the integrity of a component during run-time (e.g., properties that reflect the component's behavior or state transitions). Based on an exemplary embedded control system, we show typical use cases to help the system or software architect to choose the right class of integrity properties for the targeted system.