Visible to the public Biblio

Found 164 results

Filters: Keyword is CMU  [Clear All Filters]
2017-10-12
Ryan Wagner, Matthew Fredrikson, David Garlan.  2017.  An Advanced Persistent Threat Exemplar.

Security researchers do not have sufficient example systems for conducting research on advanced persistent threats, and companies and agencies that experience attacks in the wild are reluctant to release detailed information that can be examined. In this paper, we describe an Advanced Persistent Threat Exemplar that is intended to provide a real-world attack scenario with sufficient complexity for reasoning about defensive system adaptation, while not containing so much information as to be too complex. It draws from actual published attacks and experiences as a security engineer by the authors.

2017-08-01
Daniel M. Best, Jaspreet Bhatia, Elena Peterson, Travis Breaux.  2017.  Improved cyber threat indicator sharing by scoring privacy risk. 2017 IEEE International Symposium on Technologies for Homeland Security (HST).

Information security can benefit from real-time cyber threat indicator sharing, in which companies and government agencies share their knowledge of emerging cyberattacks to benefit their sector and society at large. As attacks become increasingly sophisticated by exploiting behavioral dimensions of human computer operators, there is an increased risk to systems that store personal information. In addition, risk increases as individuals blur the boundaries between workplace and home computing (e.g., using workplace computers for personal reasons). This paper describes an architecture to leverage individual perceptions of privacy risk to compute privacy risk scores over cyber threat indicator data. Unlike security risk, which is a risk to a particular system, privacy risk concerns an individual's personal information being accessed and exploited. The architecture integrates tools to extract information entities from textual threat reports expressed in the STIX format and privacy risk estimates computed using factorial vignettes to survey individual risk perceptions. The architecture aims to optimize for scalability and adaptability to achieve real-time risk scoring.

2017-07-12
Gabriel Ferreira.  2017.  Software certification in practice: how are standards being applied? ICSE-C '17 Proceedings of the 39th International Conference on Software Engineering Companion.

Certification schemes exist to regulate software systems and prevent them from being deployed before they are judged fit to use. However, practitioners are often unsatisfied with the efficiency of certification standards and processes. In this study, we analyzed two certification standards, Common Criteria and DO-178C, and collected insights from literature and from interviews with subject-matter experts to identify concepts affecting the efficiency of certification processes. Our results show that evaluation time, reusability of evaluation artifacts, and composition of systems and certified artifacts are barriers to achieve efficient certification.

2017-07-11
Alireza Sadeghi, Naeem Esfahani, Sam Malek.  2017.  Ensuring the Consistency of Adaptation through Inter- and Intra-Component Dependency Analysis. ACM Transactions on Software Engineering and Methodology (TOSEM). 26(1)

Dynamic adaptation should not leave a software system in an inconsistent state, as it could lead to failure. Prior research has used inter-component dependency models of a system to determine a safe interval for the adaptation of its components, where the most important tradeoff is between disruption in the operations of the system and reachability of safe intervals. This article presents Savasana, which automatically analyzes a software system’s code to extract both inter- and intra-component dependencies. In this way, Savasana is able to obtain more fine-grained models compared to previous approaches. Savasana then uses the detailed models to find safe adaptation intervals that cannot be determined using techniques from prior research. This allows Savasana to achieve a better tradeoff between disruption and reachability. The article demonstrates how Savasana infers safe adaptation intervals for components of a software system under various use cases and conditions.

Alireza Sadeghi, Hamid Bagheri, Joshua Garcia, Sam Malek.  2017.  A Taxonomy and Qualitative Comparison of Program Analysis Techniques for Security Assessment of Android Software. IEEE Transactions on Software Engineering. 43(6)

In parallel with the meteoric rise of mobile software, we are witnessing an alarming escalation in the number and sophistication of the security threats targeted at mobile platforms, particularly Android, as the dominant platform. While existing research has made significant progress towards detection and mitigation of Android security, gaps and challenges remain. This paper contributes a comprehensive taxonomy to classify and characterize the state-of-the-art research in this area. We have carefully followed the systematic literature review process, and analyzed the results of more than 300 research papers, resulting in the most comprehensive and elaborate investigation of the literature in this area of research. The systematic analysis of the research literature has revealed patterns, trends, and gaps in the existing literature, and underlined key challenges and opportunities that will shape the focus of future research efforts.

Mahmoud Hammad, Hamid Bagheri, Sam Malek.  2017.  DELDroid: Determination and Enforcement of Least-Privilege Architecture in Android. 2017 IEEE International Conference on Software Architecture.

Modern mobile platforms rely on a permission model to guard the system's resources and apps. In Android, since the permissions are granted at the granularity of apps, and all components belonging to an app inherit those permissions, an app's components are typically over-privileged, i.e., components are granted more privileges than they need to complete their tasks. Systematic violation of least-privilege principle in Android has shown to be the root cause of many security vulnerabilities. To mitigate this issue, we have developed DELDROID, an automated system for determination of least privilege architecture in Android and its enforcement at runtime. A key contribution of our approach is the ability to limit the privileges granted to apps without the need to modify them. DELDROID utilizes static program analysis techniques to extract the exact privileges each component needs for providing its functionality. A Multiple-Domain Matrix representation of the system's architecture is then used to automatically analyze the security posture of the system and derive its least-privilege architecture. Our experiments on hundreds of real world apps corroborate DELDROID's ability in effectively establishing the least-privilege architecture and its benefits in alleviating the security threats.

Joshua Tan, Lujo Bauer, Joseph Bonneau, Lorrie Cranor, Jeremy Thomas, Blase Ur.  2017.  Can Unicorns Help Users Compare Crypto Key Fingerprints? CHI '17 Proceedings of the 2017 CHI Conference on Human Factors in Computing Systems.

Many authentication schemes ask users to manually compare compact representations of cryptographic keys, known as fingerprints. If the fingerprints do not match, that may signal a man-in-the-middle attack. An adversary performing an attack may use a fingerprint that is similar to the target fingerprint, but not an exact match, to try to fool inattentive users. Fingerprint representations should thus be both usable and secure. We tested the usability and security of eight fingerprint representations under different configurations. In a 661-participant between-subjects experiment, participants compared fingerprints under realistic conditions and were subjected to a simulated attack. The best configuration allowed attacks to succeed 6% of the time; the worst 72%. We find the seemingly effective compare-and-select approach performs poorly for key fingerprints and that graphical fingerprint representations, while intuitive and fast, vary in performance. We identify some fingerprint representations as particularly promising.

Casey Canfield, Alex Davis, Baruch Fischhoff, Alain Forget, Sarah Pearman, Jeremy Thomas.  2017.  Replication: Challenges in Using Data Logs to Validate Phishing Detection Ability Metrics. 13th Symposium on Usable Privacy and Security (SOUPS).

The Security Behavior Observatory (SBO) is a longitudinal field-study of computer security habits that provides a novel dataset for validating computer security metrics. This paper demonstrates a new strategy for validating phishing detection ability metrics by comparing performance on a phishing signal detection task with data logs found in the SBO. We report: (1) a test of the robustness of performance on the signal detection task by replicating Canfield, Fischhoff and Davis (2016), (2) an assessment of the task's construct validity, and (3) evaluation of its predictive validity using data logs. We find that members of the SBO sample had similar signal detection ability compared to members of the previous mTurk sample and that performance on the task correlated with the Security Behavior Intentions Scale (SeBIS). However, there was no evidence of predictive validity, as the signal detection task performance was unrelated to computer security outcomes in the SBO, including the presence of malicious URLs, malware, and malicious files. We discuss the implications of these findings and the challenges of comparing behavior on structured experimental tasks to behavior in complex real-world settings.

Tingting Yu, Witawas Srisa-an, Gregg Rothermel.  2017.  An automated framework to support testing for process-level race conditions. Software: Testing, Verification, and Reliability .

Race conditions are difficult to detect because they usually occur only under specific execution interleavings. Numerous program analysis and testing techniques have been proposed to detect race conditions between threads on single applications. However, most of these techniques neglect races that occur at the process level due to complex system event interactions. This article presents a framework, SIMEXPLORER, that allows engineers to effectively test for process-level race conditions. SIMEXPLORER first uses dynamic analysis techniques to observe system execution, identify program locations of interest, and report faults related to oracles. Next, it uses virtualization to achieve the fine-grained controllability needed to exercise event interleavings that are likely to expose races. We evaluated the effectiveness of SIMEXPLORER on 24 real-world applications containing both known and unknown process-level race conditions. Our results show that SIMEXPLORER is effective at detecting these race conditions, while incurring an overhead that is acceptable given its effectiveness improvements.

Junjie Qian, Hong Jiang, Witawas Srisa-an, Sharad Seth.  2017.  Energy-efficient I/O Thread Schedulers for NVMe SSDs on NUMA. CCGrid '17 Proceedings of the 17th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing.

Non-volatile memory express (NVMe) based SSDs and the NUMA platform are widely adopted in servers to achieve faster storage speed and more powerful processing capability. As of now, very little research has been conducted to investigate the performance and energy efficiency of the stateof-the-art NUMA architecture integrated with NVMe SSDs, an emerging technology used to host parallel I/O threads. As this technology continues to be widely developed and adopted, we need to understand the runtime behaviors of such systems in order to design software runtime systems that deliver optimal performance while consuming only the necessary amount of energy. This paper characterizes the runtime behaviors of a Linuxbased NUMA system employing multiple NVMe SSDs. Our comprehensive performance and energy-efficiency study using massive numbers of parallel I/O threads shows that the penalty due to CPU contention is much smaller than that due to remote access of NVMe SSDs. Based on this insight, we develop a dynamic “lesser evil” algorithm called ESN, to minimize the impact of these two types of penalties. ESN is an energyefficient profiling-based I/O thread scheduler for managing I/O threads accessing NVMe SSDs on NUMA systems. Our empirical evaluation shows that ESN can achieve optimal I/O throughput and latency while consuming up to 50% less energy and using fewer CPUs.

Yutaka Tsutano, Shakthi Bachala, Witawas Srisa-an, Gregg Rothermel, Jackson Dinh.  2017.  An Efficient, Robust, and Scalable Approach for Analyzing Interacting Android Apps. 39th International Conference on Software Engineering.

When multiple apps on an Android platform interact, faults and security vulnerabilities can occur. Software engineers need to be able to analyze interacting apps to detect such problems. Current approaches for performing such analyses, however, do not scale to the numbers of apps that may need to be considered, and thus, are impractical for application to realworld scenarios. In this paper, we introduce JITANA, a program analysis framework designed to analyze multiple Android apps simultaneously. By using a classloader-based approach instead of a compiler-based approach such as SOOT, JITANA is able to simultaneously analyze large numbers of interacting apps, perform on-demand analysis of large libraries, and effectively analyze dynamically generated code. Empirical studies of JITANA show that it is substantially more efficient than a state-of-theart approach, and that it can effectively and efficiently analyze complex apps including Facebook, Pokemon Go, and Pandora ´ that the state-of-the-art approach cannot handle.

Michael Coblenz, Whitney Nelson, Jonathan Aldrich, Brad Myers, Joshua Sunshine.  2017.  Glacier: Transitive Class Immutability for Java. 39th International Conference on Software Engineering.

Though immutability has been long-proposed as a way to prevent bugs in software, little is known about how to make immutability support in programming languages effective for software engineers. We designed a new formalism that extends Java to support transitive class immutability, the form of immutability for which there is the strongest empirical support, and implemented that formalism in a tool called Glacier. We applied Glacier successfully to two real-world systems. We also compared Glacier to Java’s final in a user study of twenty participants. We found that even after being given instructions on how to express immutability with final, participants who used final were unable to express immutability correctly, whereas almost all participants who used Glacier succeeded. We also asked participants to make specific changes to immutable classes and found that participants who used final all incorrectly mutated immutable state, whereas almost all of the participants who used Glacier succeeded. Glacier represents a promising approach to enforcing immutability in Java and provides a model for enforcement in other languages.

Cyrus Omar, Ian Voysey, Michael Hilton, Joshua Sunshine, Claire Le Goues, Jonathan Aldrich, Matthew Hammer.  2017.  Toward Semantic Foundations for Program Editors. 2nd Summit on Advances in Programming Languages (SNAPL 2017).

Programming language definitions assign formal meaning to complete programs. Programmers, however, spend a substantial amount of time interacting with incomplete programs -- programs with holes, type inconsistencies and binding inconsistencies -- using tools like program editors and live programming environments (which interleave editing and evaluation). Semanticists have done comparatively little to formally characterize (1) the static and dynamic semantics of incomplete programs; (2) the actions available to programmers as they edit and inspect incomplete programs; and (3) the behavior of editor services that suggest likely edit actions to the programmer based on semantic information extracted from the incomplete program being edited, and from programs that the system has encountered in the past. As such, each tool designer has largely been left to develop their own ad hoc heuristics. 
This paper serves as a vision statement for a research program that seeks to develop these "missing" semantic foundations. Our hope is that these contributions, which will take the form of a series of simple formal calculi equipped with a tractable metatheory, will guide the design of a variety of current and future interactive programming tools, much as various lambda calculi have guided modern language designs. Our own research will apply these principles in the design of Hazel, an experimental live lab notebook programming environment designed for data science tasks. We plan to co-design the Hazel language with the editor so that we can explore concepts such as edit-time semantic conflict resolution mechanisms and mechanisms that allow library providers to install library-specific editor services.

Darya Melicher(Kurilova), Yangqingwei Shi, Alex Potanin, Jonathan Aldrich.  2017.  A Capability-Based Module System for Authority Control. European Conference on Object-Oriented Programming (ECOOP).

The principle of least authority states that each component of the system should be given authority to access only the information and resources that it needs for its operation. This principle is fundamental to the secure design of software systems, as it helps to limit an application’s attack surface and to isolate vulnerabilities and faults. Unfortunately, current programming languages do not provide adequate help in controlling the authority of application modules, an issue that is particularly acute in the case of untrusted third-party extensions. In this paper, we present a language design that facilitates controlling the authority granted to each application module. The key technical novelty of our approach is that modules are firstclass, statically typed capabilities. First-class modules are essentially objects, and so we formalize our module system by translation into an object calculus and prove that the core calculus is typesafe and authority-safe. Unlike prior formalizations, our work defines authority non-transitively, allowing engineers to reason about software designs that use wrappers to provide an attenuated version of a more powerful capability. Our approach allows developers to determine a module’s authority by examining the capabilities passed as module arguments when the module is created, or delegated to the module later during execution. The type system facilitates this by identifying which objects provide capabilities to sensitive resources, and by enabling security architects to examine the capabilities passed into and out of a module based only on the module’s interface, without needing to examine the module’s implementation code. An implementation of the module system and illustrative examples in the Wyvern programming language suggest that our approach can be a practical way to control module authority.

2017-04-10
Hemank Lamba, Thomas J. Glazier, Javier Camara, Bradley Schmerl, David Garlan, Jurgen Pfeffer.  2017.  Model-based Cluster Analysis for Identifying Suspicious Activity Sequences in Software. IWSPA '17 Proceedings of the 3rd ACM on International Workshop on Security And Privacy Analytics.

Large software systems have to contend with a significant number of users who interact with different components of the system in various ways. The sequences of components that are used as part of an interaction define sets of behaviors that users have with the system. These can be large in number. Among these users, it is possible that there are some who exhibit anomalous behaviors -- for example, they may have found back doors into the system and are doing something malicious. These anomalous behaviors can be hard to distinguish from normal behavior because of the number of interactions a system may have, or because traces may deviate only slightly from normal behavior. In this paper we describe a model-based approach to cluster sequences of user behaviors within a system and to find suspicious, or anomalous, sequences. We exploit the underlying software architecture of a system to define these sequences. We further show that our approach is better at detecting suspicious activities than other approaches, specifically those that use unigrams and bigrams for anomaly detection. We show this on a simulation of a large scale system based on Amazon Web application style architecture.

Ian Voysey, Cyrus Omar, Matthew Hammer.  2017.  Running Incomplete Programs. POPL 2017 Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages.

We typically only consider running programs that are completely written. Programmers end up inserting ad hoc dummy values into their incomplete programs to receive feedback about dynamic behavior. In this work we suggest an evaluation mechanism for incomplete programs, represented as terms with holes. Rather than immediately failing when a hole is encountered, evaluation propagates holes as far as possible. The result is a substantially tighter development loop. 

Cyrus Omar, Ian Voysey, Michael Hilton, Jonathan Aldrich, Matthew Hammer.  2017.  Hazelnut: a bidirectionally typed structure editor calculus. POPL 2017 Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages.

Structure editors allow programmers to edit the tree structure of a program directly. This can have cognitive benefits, particularly for novice and end-user programmers. It also simplifies matters for tool designers, because they do not need to contend with malformed program text. This paper introduces Hazelnut, a structure editor based on a small bidirectionally typed lambda calculus extended with holes and a cursor. Hazelnut goes one step beyond syntactic well-formedness: its edit actions operate over statically meaningful incomplete terms. Naïvely, this would force the programmer to construct terms in a rigid “outside-in” manner. To avoid this problem, the action semantics automatically places terms assigned a type that is inconsistent with the expected type inside a hole. This meaningfully defers the type consistency check until the term inside the hole is finished. Hazelnut is not intended as an end-user tool itself. Instead, it serves as a foundational account of typed structure editing. To that end, we describe how Hazelnut’s rich metatheory, which we have mechanized using the Agda proof assistant, serves as a guide when we extend the calculus to include binary sum types. We also discuss various interpretations of holes, and in so doing reveal connections with gradual typing and contextual modal type theory, the Curry-Howard interpretation of contextual modal logic. Finally, we discuss how Hazelnut’s semantics lends itself to implementation as an event-based functional reactive program. Our simple reference implementation is written using js_of_ocaml. 

Flavio Medeiros, Marcio Ribeiro, Rohit Gheyi, Sven Apel, Christian Kästner, Bruno Ferreira, Luiz Carvalho, Baldoino Fonseca.  2017.  Discipline Matters: Refactoring of Preprocessor Directives in the #ifdef Hell. IEEE Transactions on Software Engineering . (99)

The C preprocessor is used in many C projects to support variability and portability. However, researchers and practitioners criticize the C preprocessor because of its negative effect on code understanding and maintainability and its error proneness. More importantly, the use of the preprocessor hinders the development of tool support that is standard in other languages, such as automated refactoring. Developers aggravate these problems when using the preprocessor in undisciplined ways (e.g., conditional blocks that do not align with the syntactic structure of the code). In this article, we proposed a catalogue of refactorings and we evaluated the number of application possibilities of the refactorings in practice, the opinion of developers about the usefulness of the refactorings, and whether the refactorings preserve behavior. Overall, we found 5670 application possibilities for the refactorings in 63 real-world C projects. In addition, we performed an online survey among 246 developers, and we submitted 28 patches to convert undisciplined directives into disciplined ones. According to our results, 63% of developers prefer to use the refactored (i.e., disciplined) version of the code instead of the original code with undisciplined preprocessor usage. To verify that the refactorings are indeed behavior preserving, we applied them to more than 36 thousand programs generated automatically using a model of a subset of the C language, running the same test cases in the original and refactored programs. Furthermore, we applied the refactorings to three real-world projects: BusyBox, OpenSSL, and SQLite. This way, we detected and fixed a few behavioral changes, 62% caused by unspecified behavior in the C language.

2017-01-10
Jonathan Aldrich, Alex Potanin.  2016.  Naturally Embedded DSLs. Systems, Programming, Languages and Applications: Software for Humanity (SPLASH) .

Domain-specific languages can be embedded in a variety of ways within a host language. The choice of embedding approach entails significant tradeoffs in the usability of the embedded DSL. We argue embedding DSLs \textit{naturally} within the host language results in the best experience for end users of the DSL. A \textit{naturally embedded DSL} is one that uses natural syntax, static semantics, and dynamic semantics for the DSL, all of which may differ from the host language. Furthermore, it must be possible to use DSLs together naturally - meaning that different DSLs cannot conflict, and the programmer can easily tell which code is written in which language.

2017-01-09
Esther Wang, Jonathan Aldrich.  2016.  Capability Safe Reflection for the Wyvern Language. SPLASH 2016.

Reflection allows a program to examine and even modify itself, but its power can also lead to violations of encapsulation and even security vulnerabilities. The Wyvern language leverages static types for encapsulation and provides security through an object capability model. We present a design for reflection in Wyvern which respects capability safety and type-based encapsulation. This is accomplished through a mirror-based design, with the addition of a mechanism to constrain the visible type of a reflected object. In this way, we ensure that the programmer cannot use reflection to violate basic encapsulation and security guarantees.

Alireza Sadeghi, Hamid Bagheri, Joshua Garcia, Sam Malek.  2016.  A Taxonomy and Qualitative Comparison of Program Analysis Techniques for Security Assessment of Android Software. IEEE TRANSACTIONS ON SOFTWARE ENGINEERING. 99

In parallel with the meteoric rise of mobile software, we are witnessing an alarming escalation in the number and sophistication of the security threats targeted at mobile platforms, particularly Android, as the dominant platform. While existing research has made significant progress towards detection and mitigation of Android security, gaps and challenges remain. This paper contributes a comprehensive taxonomy to classify and characterize the state-of-the-art research in this area. We have carefully followed the systematic literature review process, and analyzed the results of more than 300 research papers, resulting in the most comprehensive and elaborate investigation of the literature in this area of research. The systematic analysis of the research literature has revealed patterns, trends, and gaps in

Jafar Al-Kofahi, Tien Nguyen, Christian Kästner.  2016.  Escaping AutoHell: a vision for automated analysis and migration of autotools build systems. RELENG 2016 Proceedings of the 4th International Workshop on Release Engineering.

GNU Autotools is a widely used build tool in the open source community. As open source projects grow more complex, maintaining their build systems becomes more challenging, due to the lack of tool support. In this paper, we propose a platform to build support tools for GNU Autotools build systems. The platform provides an abstraction of the build system to be used in different analysis techniques.

Meng Meng, Jens Meinicke, Chu-Pan Wong, Eric Walkingshaw, Christian Kästner.  2017.  A Choice of Variational Stacks: Exploring Variational Data Structures. 11th International Workshop on Variability Modelling of Software-intensive Systems (VAMOS).

Many applications require not only representing variability in software and data, but also computing with it. To do so efficiently requires variational data structures that make the variability explicit in the underlying data and the operations used to manipulate it. Variational data structures have been developed ad hoc for many applications, but there is little general understanding of how to design them or what tradeoffs exist among them. In this paper, we strive for a more systematic exploration and analysis of a variational data structure. We want to know how different design decisions affect the performance and scalability of a variational data structure, and what properties of the underlying data and operation sequences need to be considered. Specifically, we study several alternative designs of a variational stack, a data structure that supports efficiently representing and computing with multiple variants of a plain stack, and that is a common building block in many algorithms. The different variational stacks are presented as a small product line organized by three design decisions. We analyze how these design decisions affect the performance of a variational stack with different usage profiles. Finally, we evaluate how these design decisions affect the performance of the variational stack in a real-world scenario: in the interpreter VarexJ when executing real software containing variability. 

2017-01-05
Jaspreet Bhatia, Travis Breaux, Joel Reidenberg, Thomas Norton.  2016.  A Theory of Vagueness and Privacy Risk Perception. 2016 IEEE 24th International Requirements Engineering Conference (RE).

Ambiguity arises in requirements when astatement is unintentionally or otherwise incomplete, missing information, or when a word or phrase has morethan one possible meaning. For web-based and mobileinformation systems, ambiguity, and vagueness inparticular, undermines the ability of organizations to aligntheir privacy policies with their data practices, which canconfuse or mislead users thus leading to an increase inprivacy risk. In this paper, we introduce a theory ofvagueness for privacy policy statements based on ataxonomy of vague terms derived from an empiricalcontent analysis of 15 privacy policies. The taxonomy wasevaluated in a paired comparison experiment and resultswere analyzed using the Bradley-Terry model to yield arank order of vague terms in both isolation andcomposition. The theory predicts how vague modifiers toinformation actions and information types can becomposed to increase or decrease overall vagueness. Wefurther provide empirical evidence based on factorialvignette surveys to show how increases in vagueness willdecrease users' acceptance of privacy risk and thusdecrease users' willingness to share personal information.

2016-12-08
Lichao Sun, Zhiqiang Li, Qiben Yan, Witawas Srisa-an, Yu Pan.  2016.  SigPID: Significant Permission Identification for Android Malware Detection. 11th International Conference on Malicious and Unwanted Software (MALCON 2016).

A recent report indicates that a newly developed mali- cious app for Android is introduced every 11 seconds.  To combat this alarming rate of malware creation,  we need a scalable malware detection approach that is effective and efficient. In this paper, we introduce SIGPID, a malware detection system based on permission  analysis to cope with the rapid increase in the number of Android malware. In- stead of analyzing all 135 Android permissions, our ap- proach applies 3-level pruning by mining the permission data to identify only significant permissions that can be ef- fective in distinguishing benign and malicious apps. SIG- PID then utilizes classification algorithms to classify differ- ent families of malware and benign apps. Our evaluation finds that only 22 out of 135 permissions are significant. We then compare the performance of our approach, using only

22 permissions, against a baseline approach that analyzes all permissions. The results indicate that when Support Vec- tor Machine (SVM) is used as the classifier, we can achieve over 90% of precision, recall, accuracy, and F-measure, which  are about the same as those produced by the base- line approach while incurring the analysis times that are 4 to 32 times smaller that those of using all 135 permissions. When we compare the detection effectiveness of SIGPID to those of other approaches, SIGPID can detect 93.62% of malware in the data set, and 91.4% unknown malware.