Biblio

Found 431 results

Filters: Keyword is Task Analysis

2019-12-30

Kim, Sunbin, Kim, Hyeoncheol. 2019. Deep Explanation Model for Facial Expression Recognition Through Facial Action Coding Unit. 2019 IEEE International Conference on Big Data and Smart Computing (BigComp). :1–4.

Facial expression is the most powerful and natural non-verbal emotional communication method. Facial Expression Recognition (FER) has significance in machine learning tasks. Deep learning models perform well in FER tasks, but they do not provide any justification for their decisions. Based on the hypothesis that facial expression is a combination of facial muscle movements, we find that Facial Action Coding Units (AUs) and emotion labels have a relationship in the CK+ dataset. In this paper, we propose a model which utilises AUs to explain a Convolutional Neural Network (CNN) model's classification results. The CNN model is trained with the CK+ dataset and classifies emotion based on extracted features. The explanation model classifies the multiple AUs with the extracted features and emotion classes from the CNN model. Our experiment shows that, with only the features and emotion classes obtained from the CNN model, the explanation model generates AUs very well.

2019-12-16

Xue, Zijun, Ko, Ting-Yu, Yuchen, Neo, Wu, Ming-Kuang Daniel, Hsieh, Chu-Cheng. 2018. Isa: Intuit Smart Agent, A Neural-Based Agent-Assist Chatbot. 2018 IEEE International Conference on Data Mining Workshops (ICDMW). :1423–1428.

Hiring seasonal workers in call centers to provide customer service is a common practice in B2C companies. The quality of service delivered by both contracting and employee customer service agents depends heavily on the domain knowledge available to them. When observing the internal group messaging channels used by agents, we found that similar questions are often asked repetitively by different agents, especially from less experienced ones. The goal of our work is to leverage the promising advances in conversational AI to provide a chatbot-like mechanism for assisting agents in promptly resolving a customer's issue. In this paper, we develop a neural-based conversational solution that employs BiLSTM with attention mechanism and demonstrate how our system boosts the effectiveness of customer support agents. In addition, we discuss the design principles and the necessary considerations for our system. We then demonstrate how our system, named "Isa" (Intuit Smart Agent), can help customer service agents provide a high-quality customer experience by reducing customer wait time and by applying the knowledge accumulated from customer interactions in future applications.

Lopes, José, Robb, David A., Ahmad, Muneeb, Liu, Xingkun, Lohan, Katrin, Hastie, Helen. 2019. Towards a Conversational Agent for Remote Robot-Human Teaming. 2019 14th ACM/IEEE International Conference on Human-Robot Interaction (HRI). :548–549.

There are many challenges when it comes to deploying robots remotely including lack of operator situation awareness and decreased trust. Here, we present a conversational agent embodied in a Furhat robot that can help with the deployment of such remote robots by facilitating teaming with varying levels of operator control.

Lin, Jerry Chun-Wei, Zhang, Yuyu, Chen, Chun-Hao, Wu, Jimmy Ming-Tai, Chen, Chien-Ming, Hong, Tzung-Pei. 2018. A Multiple Objective PSO-Based Approach for Data Sanitization. 2018 Conference on Technologies and Applications of Artificial Intelligence (TAAI). :148–151.

In this paper, a multi-objective particle swarm optimization (MOPSO)-based framework is presented to find multiple solutions rather than a single one. The presented grid-based algorithm is used to assign the probability of each non-dominated solution for the next iteration. With the designed algorithm, it is unnecessary to pre-define the weights of the side effects for evaluation; instead, the non-dominated solutions can be discovered as alternatives for data sanitization. Extensive experiments are carried out on two datasets to show that the designed grid-based algorithm achieves better performance than the traditional single-objective evolutionary algorithms.

2019-12-11

Canetti, Ran, Stoughton, Alley, Varia, Mayank. 2019. EasyUC: Using EasyCrypt to Mechanize Proofs of Universally Composable Security. 2019 IEEE 32nd Computer Security Foundations Symposium (CSF). :167–16716.

We present a methodology for using the EasyCrypt proof assistant (originally designed for mechanizing the generation of proofs of game-based security of cryptographic schemes and protocols) to mechanize proofs of security of cryptographic protocols within the universally composable (UC) security framework. This allows, for the first time, the mechanization and formal verification of the entire sequence of steps needed for proving simulation-based security in a modular way: Specifying a protocol and the desired ideal functionality; Constructing a simulator and demonstrating its validity, via reduction to hard computational problems; Invoking the universal composition operation and demonstrating that it indeed preserves security. We demonstrate our methodology on a simple example: stating and proving the security of secure message communication via a one-time pad, where the key comes from a Diffie-Hellman key-exchange, assuming ideally authenticated communication. We first put together EasyCrypt-verified proofs that: (a) the Diffie-Hellman protocol UC-realizes an ideal key-exchange functionality, assuming hardness of the Decisional Diffie-Hellman problem, and (b) one-time-pad encryption, with a key obtained using ideal key-exchange, UC-realizes an ideal secure-communication functionality. We then mechanically combine the two proofs into an EasyCrypt-verified proof that the composed protocol realizes the same ideal secure-communication functionality. Although formulating a methodology that is both sound and workable has proven to be a complex task, we are hopeful that it will prove to be the basis for mechanized UC security analyses for significantly more complex protocols and tasks.
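
As a rough illustration of the example protocol described above (one-time-pad encryption under a key obtained from a Diffie-Hellman exchange), here is a minimal Python sketch. It is not the paper's EasyCrypt development. The group parameters `P` and `G`, the helper names, and the use of SHAKE-256 to stretch the shared value into pad bytes are all assumptions made for illustration, and the toy parameters are not secure.

```python
import hashlib
import secrets

# Toy group parameters (assumption for illustration; NOT secure choices).
P = 2**127 - 1  # a Mersenne prime
G = 3


def dh_keypair():
    """Return (private exponent, public value G^private mod P)."""
    private = secrets.randbelow(P - 2) + 1  # uniform in 1 .. P-2
    return private, pow(G, private, P)


def derive_pad(private, other_public, length):
    """Stretch the shared Diffie-Hellman value into `length` pad bytes.

    Hashing with SHAKE-256 is an assumption of this sketch, not a step
    taken from the paper.
    """
    shared = pow(other_public, private, P)
    return hashlib.shake_256(shared.to_bytes(16, "big")).digest(length)


def xor_bytes(data, pad):
    """One-time-pad step: XOR each message byte with one pad byte."""
    if len(pad) < len(data):
        raise ValueError("pad must be at least as long as the data")
    return bytes(d ^ p for d, p in zip(data, pad))


# Both parties exchange public values over an (ideally authenticated) channel.
alice_private, alice_public = dh_keypair()
bob_private, bob_public = dh_keypair()

message = b"hello, world"
ciphertext = xor_bytes(message, derive_pad(alice_private, bob_public, len(message)))
recovered = xor_bytes(ciphertext, derive_pad(bob_private, alice_public, len(message)))
```

Both sides compute the same shared value, so the receiver's pad cancels the sender's pad. Each pad must be used for a single message only.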

2019-12-10

Tian, Yun, Xu, Wenbo, Qin, Jing, Zhao, Xiaofan. 2018. Compressive Detection of Random Signals from Sparsely Corrupted Measurements. 2018 International Conference on Network Infrastructure and Digital Content (IC-NIDC). :389–393.

Compressed sensing (CS) integrates sampling and compression into a single step to reduce the processed data amount. However, the CS reconstruction generally suffers from high complexity. To solve this problem, compressive signal processing (CSP) is recently proposed to implement some signal processing tasks directly in the compressive domain without reconstruction. Among various CSP techniques, compressive detection achieves the signal detection based on the CS measurements. This paper investigates the compressive detection problem of random signals when the measurements are corrupted. Different from the current studies that only consider the dense noise, our study considers both the dense noise and sparse error. The theoretical performance is derived, and simulations are provided to verify the derived theoretical results.

2019-12-09

Li, Wenjuan, Cao, Jian, Hu, Keyong, Xu, Jie, Buyya, Rajkumar. 2019. A Trust-Based Agent Learning Model for Service Composition in Mobile Cloud Computing Environments. IEEE Access. 7:34207–34226.

Mobile cloud computing is characterized by resource constraints, openness, and uncertainty, which lead to high uncertainty in its quality of service (QoS) provision and to serious security risks. Therefore, when faced with complex service requirements, an efficient and reliable service composition approach is extremely important. In addition, preference learning is also a key factor in improving user experience. To address these issues, this paper introduces a three-layered trust-enabled service composition model for mobile cloud computing systems. Based on the fuzzy comprehensive evaluation method, we design a novel and integrated trust management model. Service brokers are equipped with a learning module enabling them to better analyze customers' service preferences, especially in cases when the details of a service request are not totally disclosed. Because traditional methods cannot fully reflect the autonomous collaboration between the mobile cloud entities, a prototype system based on the multi-agent platform JADE is implemented to evaluate the efficiency of the proposed strategies. The experimental results show that our approach improves the transaction success rate and user satisfaction.

2019-12-05

Hayashi, Masahito. 2018. Secure Physical Layer Network Coding versus Secure Network Coding. 2018 IEEE Information Theory Workshop (ITW). :1–5.

Secure network coding realizes the secrecy of the message when the message is transmitted via a noiseless network and a part of the edges or a part of the intermediate nodes are eavesdropped. In this framework, if the channels of the network have noise, we apply error correction to the noisy channels before applying secure network coding. In contrast, secure physical layer network coding is a method to securely transmit a message by a combination of coding operations on nodes when the network is given as a set of noisy channels. In this paper, we give several examples of networks in which secure physical layer network coding realizes a performance that cannot be realized by secure network coding.

2019-12-02

Yang, Shouguo, Shi, Zhiqiang, Zhang, Guodong, Li, Mingxuan, Ma, Yuan, Sun, Limin. 2019. Understand Code Style: Efficient CNN-Based Compiler Optimization Recognition System. ICC 2019 - 2019 IEEE International Conference on Communications (ICC). :1–6.

Compiler optimization level recognition can be applied to vulnerability discovery and binary analysis. Due to the existence of many different compilation optimization options, the differences in the contents of binary files are very complicated. There are thousands of compiler optimization algorithms and multiple different processor architectures, so it is very difficult to manually analyze binary files and recognize their compiler optimization levels with rules. This paper first proposes a CNN-based compiler optimization level recognition model: BinEye. The system extracts semantic and structural differences and automatically recognizes the compiler optimization levels. The model is designed to be very suitable for binary file processing and is easy to understand. We built a dataset containing 80028 binary files for model training and testing. Our proposed model achieves an accuracy of over 97%. At the same time, BinEye is a fully CNN-based system and has a faster forward calculation speed, at least 8 times faster than the normal RNN-based model. Through our analysis of the model output, we successfully found the differences in assembly code caused by different compiler optimization levels. This means that the model we propose is interpretable. Based on our model, we propose a method to analyze the code differences caused by different compiler optimization levels, which has great guiding significance for analyzing closed-source compilers and for binary security analysis.

Elfar, Mahmoud, Zhu, Haibei, Cummings, M. L., Pajic, Miroslav. 2019. Security-Aware Synthesis of Human-UAV Protocols. 2019 International Conference on Robotics and Automation (ICRA). :8011–8017.

In this work, we synthesize collaboration protocols for human-unmanned aerial vehicle (H-UAV) command and control systems, where the human operator aids in securing the UAV by intermittently performing geolocation tasks to confirm its reported location. We first present a stochastic game-based model for the system that accounts for both the operator and an adversary capable of launching stealthy false-data injection attacks, causing the UAV to deviate from its path. We also describe a synthesis challenge due to the UAV's hidden-information constraint. Next, we perform human experiments using a developed RESCHU-SA testbed to recognize the geolocation strategies that operators adopt. Furthermore, we deploy machine learning techniques on the collected experimental data to predict the correctness of a geolocation task at a given location based on its geographical features. By representing the model as a delayed-action game and formalizing the system objectives, we utilize off-the-shelf model checkers to synthesize protocols for the human-UAV coalition that satisfy these objectives. Finally, we demonstrate the usefulness of the H-UAV protocol synthesis through a case study where the protocols are experimentally analyzed and further evaluated by human operators.

2019-11-25

Kışlal, Ahmet Oguz, Pusane, Ali Emre, Tuğcu, Tuna. 2018. A comparative analysis of channel coding for molecular communication. 2018 26th Signal Processing and Communications Applications Conference (SIU). :1–4.

Networks established among nanomachines, also called nanonetworks, are crucial since a single nanomachine most likely cannot handle a task by itself. At the nano scale, electromagnetic waves lose their effectiveness. Molecular communication via diffusion (MCvD) is a new concept that aims to solve this problem. Information is carried by either the type of molecules or their concentration. The robustness of this communication method, as in the case of classical communication, is very important. Channel coding is the component that makes communication less erroneous. If the desired error performance is high, channel coding is mandatory. In this paper, the performance of Bose-Chaudhuri-Hocquenghem (BCH) and Reed-Solomon (RS) codes for MCvD is evaluated by simulation and the results are analyzed.

Zuin, Gianlucca, Chaimowicz, Luiz, Veloso, Adriano. 2018. Learning Transferable Features For Open-Domain Question Answering. 2018 International Joint Conference on Neural Networks (IJCNN). :1–8.

Corpora used to learn open-domain Question-Answering (QA) models are typically collected from a wide variety of topics or domains. Since QA requires understanding natural language, open-domain QA models generally need very large training corpora. A simple way to alleviate data demand is to restrict the domain covered by the QA model, leading thus to domain-specific QA models. While learning improved QA models for a specific domain is still challenging due to the lack of sufficient training data in the topic of interest, additional training data can be obtained from related topic domains. Thus, instead of learning a single open-domain QA model, we investigate domain adaptation approaches in order to create multiple improved domain-specific QA models. We demonstrate that this can be achieved by stratifying the source dataset, without the need of searching for complementary data unlike many other domain adaptation approaches. We propose a deep architecture that jointly exploits convolutional and recurrent networks for learning domain-specific features while transferring domain-shared features. That is, we use transferable features to enable model adaptation from multiple source domains. We consider different transference approaches designed to learn span-level and sentence-level QA models. We found that domain-adaptation greatly improves sentence-level QA performance, and span-level QA benefits from sentence information. Finally, we also show that a simple clustering algorithm may be employed when the topic domains are unknown and the resulting loss in accuracy is negligible.

2019-11-12

Zhang, Xian, Ben, Kerong, Zeng, Jie. 2018. Cross-Entropy: A New Metric for Software Defect Prediction. 2018 IEEE International Conference on Software Quality, Reliability and Security (QRS). :111–122.

Defect prediction is an active topic in software quality assurance, which can help developers find potential bugs and make better use of resources. To improve prediction performance, this paper introduces cross-entropy, one common measure for natural language, as a new code metric into defect prediction tasks and proposes a framework called DefectLearner for this process. We first build a recurrent neural network language model to learn regularities in source code from software repository. Based on the trained model, the cross-entropy of each component can be calculated. To evaluate the discrimination for defect-proneness, cross-entropy is compared with 20 widely used metrics on 12 open-source projects. The experimental results show that cross-entropy metric is more discriminative than 50% of the traditional metrics. Besides, we combine cross-entropy with traditional metric suites together for accurate defect prediction. With cross-entropy added, the performance of prediction models is improved by an average of 2.8% in F1-score.
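
To make the cross-entropy metric concrete, the sketch below computes the average negative log2-probability per token of a code fragment under a language model. The paper uses a recurrent neural network language model. As a loudly labeled simplification, this sketch swaps in an add-one-smoothed unigram model, and the token lists and function names are hypothetical.

```python
import math
from collections import Counter


def train_unigram(tokens):
    """Build an add-one-smoothed unigram model (a stand-in for the RNN model)."""
    counts = Counter(tokens)
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen tokens
    total = len(tokens)

    def prob(token):
        return (counts.get(token, 0) + 1) / (total + vocab_size)

    return prob


def cross_entropy(tokens, prob):
    """Average negative log2-probability per token, in bits per token."""
    if not tokens:
        raise ValueError("need at least one token")
    return -sum(math.log2(prob(t)) for t in tokens) / len(tokens)


# Hypothetical training corpus and two components to score.
corpus = "int i = 0 ; i = i + 1 ; return i ;".split()
model = train_unigram(corpus)

familiar = "i = i + 1 ;".split()
unusual = "goto label ; goto label ;".split()

h_familiar = cross_entropy(familiar, model)
h_unusual = cross_entropy(unusual, model)
```

Code that resembles the training corpus scores a lower cross-entropy than code made of tokens the model has rarely or never seen. That is the intuition behind using the value as a metric.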

2019-10-22

Alzahrani, Ahmed, Johnson, Chris, Altamimi, Saad. 2018. Information security policy compliance: Investigating the role of intrinsic motivation towards policy compliance in the organisation. 2018 4th International Conference on Information Management (ICIM). :125–132.

Recent behavioral research in information security has focused on increasing employees' motivation to enhance the security performance in an organization. This empirical study investigated employees' information security policy (ISP) compliance intentions using self-determination theory (SDT). Relevant hypotheses were developed to test the proposed research model. Data obtained via a survey (N=407) from a Fortune 600 organization in Saudi Arabia provides empirical support for the model. The results confirmed that autonomy, competence and the concept of relatedness all positively affect employees' intentions to comply. The variable 'perceived value congruence' had a negative effect on ISP compliance intentions, and the perceived legitimacy construct did not affect employees' intentions. In general, the findings of this study suggest that SDT has value in research into employees' ISP compliance intentions.

2019-10-14

Angelini, M., Blasilli, G., Borrello, P., Coppa, E., D’Elia, D. C., Ferracci, S., Lenti, S., Santucci, G.. 2018. ROPMate: Visually Assisting the Creation of ROP-based Exploits. 2018 IEEE Symposium on Visualization for Cyber Security (VizSec). :1–8.

Exploits based on ROP (Return-Oriented Programming) are increasingly present in advanced attack scenarios. Testing systems for ROP-based attacks can be valuable for improving the security and reliability of software. In this paper, we propose ROPMATE, the first Visual Analytics system specifically designed to assist human red team ROP exploit builders. In contrast, previous ROP tools typically require users to inspect a puzzle of hundreds or thousands of lines of textual information, making it a daunting task. ROPMATE presents builders with a clear interface of well-defined and semantically meaningful gadgets, i.e., fragments of code already present in the binary application that can be chained to form fully-functional exploits. The system supports incrementally building exploits by suggesting gadget candidates filtered according to constraints on preserved registers and accessed memory. Several visual aids are offered to identify suitable gadgets and assemble them into semantically correct chains. We report on a preliminary user study that shows how ROPMATE can assist users in building ROP chains.

Rong, Z., Xie, P., Wang, J., Xu, S., Wang, Y.. 2018. Clean the Scratch Registers: A Way to Mitigate Return-Oriented Programming Attacks. 2018 IEEE 29th International Conference on Application-specific Systems, Architectures and Processors (ASAP). :1–8.

With the implementation of the W ⊕ X security model on computer systems, Return-Oriented Programming (ROP) has become the primary exploitation technique for adversaries. Although many solutions that defend against ROP exploits have been proposed, they still suffer from various shortcomings. In this paper, we propose a new way to mitigate ROP attacks that are based on return instructions. Based on the features of ROP malicious code and the calling convention, we clean the scratch registers, which are also the parameter registers. A prototype is implemented on an x64-based Linux platform using Pin. Preliminary experimental results show that our method can efficiently mitigate conventional ROP attacks.

2019-09-23

Chen, W., Liang, X., Li, J., Qin, H., Mu, Y., Wang, J.. 2018. Blockchain Based Provenance Sharing of Scientific Workflows. 2018 IEEE International Conference on Big Data (Big Data). :3814–3820.

In a research community, the provenance sharing of scientific workflows can enhance distributed research cooperation, the verification of experiment reproducibility and the repetition of experiments. Considering that scientists in such a community are often loosely related and geographically distributed, traditional centralized provenance sharing architectures have shown their disadvantages in trustworthiness, reliability and efficiency. Additionally, it is difficult for them to protect the rights and interests of data providers. All of this has largely hindered the willingness of distributed scientists to share their workflow provenance. Considering the advantages of blockchain in decentralization, trustworthiness and high reliability, an approach to sharing scientific workflow provenance based on blockchain in a research community is proposed. To make the approach more practical, provenance is handled on-chain and original data is delivered off-chain. A block structure to support efficient provenance storing and retrieving is designed, and an algorithm for scientists to search workflow segments from provenance as well as an algorithm for experiment backtracking are provided to enhance experiment result sharing and to save computing resources and time by avoiding repeated experiments as far as possible. Analyses show that the approach is efficient and effective.

Zheng, N., Alawini, A., Ives, Z. G.. 2019. Fine-Grained Provenance for Matching ETL. 2019 IEEE 35th International Conference on Data Engineering (ICDE). :184–195.

Data provenance tools capture the steps used to produce analyses. However, scientists must choose among workflow provenance systems, which allow arbitrary code but only track provenance at the granularity of files; provenance APIs, which provide tuple-level provenance, but incur overhead in all computations; and database provenance tools, which track tuple-level provenance through relational operators and support optimization, but support a limited subset of data science tasks. None of these solutions are well suited for tracing errors introduced during common ETL, record alignment, and matching tasks - for data types such as strings, images, etc. Scientists need new capabilities to identify the sources of errors, find why different code versions produce different results, and identify which parameter values affect output. We propose PROVision, a provenance-driven troubleshooting tool that supports ETL and matching computations and traces extraction of content within data objects. PROVision extends database-style provenance techniques to capture equivalences, support optimizations, and enable selective evaluation. We formalize our extensions, implement them in the PROVision system, and validate their effectiveness and scalability for common ETL and matching tasks.

2019-08-05

Headrick, W. J., Dlugosz, A., Rajcok, P.. 2018. Information Assurance in modern ATE. 2018 IEEE AUTOTESTCON. :1–4.

For modern Automatic Test Equipment (ATE) one of the most daunting tasks is now Information Assurance (IA). What was once at most a secondary item consisting mainly of installing an Anti-Virus suite is now becoming one of the most important aspects of ATE. Given the current climate of IA it has become important to ensure ATE is kept safe from any breaches of security or loss of information. Even though most ATE are not on the Internet (or even on a network for many) they are still vulnerable to some of the same attack vectors plaguing common computers and other electronic devices. This paper will discuss some of the processes and procedures which must be used to ensure that modern ATE can continue to be used to test and detect faults in the systems they are designed to test. The common items that must be considered for ATE are as follows: The ATE system must have some form of Anti-Virus (as should all computers). The ATE system should have a minimum software footprint only providing the software needed to perform the task. The ATE system should be verified to have all the Operating System (OS) settings configured pursuant to the task it is intended to perform. The ATE OS settings should include password and password expiration settings to prevent access by anyone not expected to be on the system. The ATE system software should be written and constructed such that it in itself is not readily open to attack. The ATE system should be designed in a manner such that none of the instruments in the system can easily be attacked. The ATE system should ensure any paths to the outside world (such as Ethernet or USB devices) are limited to only those required to perform the task it was designed for. These and many other common configuration concerns will be discussed in the paper.

2019-07-01

Ferreyra, N. E. Díaz, Meis, R., Heisel, M.. 2018. At Your Own Risk: Shaping Privacy Heuristics for Online Self-Disclosure. 2018 16th Annual Conference on Privacy, Security and Trust (PST). :1–10.

Revealing private and sensitive information on Social Network Sites (SNSs) like Facebook is a common practice which sometimes results in unwanted incidents for the users. One approach for helping users to avoid regrettable scenarios is through awareness mechanisms which inform a priori about the potential privacy risks of a self-disclosure act. Privacy heuristics are instruments which describe recurrent regrettable scenarios and can support the generation of privacy awareness. One important component of a heuristic is the group of people who should not access specific private information under a certain privacy risk. However, specifying an exhaustive list of unwanted recipients for a given regrettable scenario can be a tedious task which necessarily demands the user's intervention. In this paper, we introduce an approach based on decision trees to instantiate the audience component of privacy heuristics with minor intervention from the users. We introduce Disclosure-Acceptance Trees, a data structure representative of the audience component of a heuristic and describe a method for their generation out of user-centred privacy preferences.

Li, D., Zhang, Z., Liao, W., Xu, Z.. 2018. KLRA: A Kernel Level Resource Auditing Tool For IoT Operating System Security. 2018 IEEE/ACM Symposium on Edge Computing (SEC). :427–432.

Nowadays, the rapid development of the Internet of Things facilitates human life and work, while it also brings great security risks to society due to the frequent occurrence of various security issues. IoT devices are characterized by large-scale deployment and single-responsibility applications, so a software vulnerability, once identified, can easily cause a chain reaction and result in widespread privacy leakage and system security problems. It is difficult to guarantee that there is no security hole in an IoT operating system, which is usually designed for MCUs and has no kernel mode. An alternative solution is to identify security issues as soon as the system is hijacked and suspend the suspicious task before it causes irreparable damage. This paper proposes KLRA (A Kernel Level Resource Auditing Tool) for IoT operating system security. This tool collects the resource-sensitive events in the kernel and audits the resource consumption pattern of the system at the same time. KLRA can take fine-grained event measurements at low cost and report the relevant security warning as soon as the behavior of the system is abnormal compared with the daily operations that reflect the real responsibility of the device. KLRA enables the IoT operating system for MCUs to generate security early warnings and thereby provides a self-adaptive heuristic security mechanism for the entire IoT system.

2019-06-24

Chouikhi, S., Merghem-Boulahia, L., Esseghir, M.. 2018. Energy Demand Scheduling Based on Game Theory for Microgrids. 2018 IEEE International Conference on Communications (ICC). :1–6.

The advent of smart grids offers us the opportunity to better manage the electricity grids. One of the most interesting challenges in the modern grids is the consumer demand management. Indeed, the development in Information and Communication Technologies (ICTs) encourages the development of demand-side management systems. In this paper, we propose a distributed energy demand scheduling approach that uses minimal interactions between consumers to optimize the energy demand. We formulate the consumption scheduling as a constrained optimization problem and use game theory to solve this problem. On one hand, the proposed approach aims to reduce the total energy cost of a building's consumers. This imposes the cooperation between all the consumers to achieve the collective goal. On the other hand, the privacy of each user must be protected, which means that our distributed approach must operate with a minimal information exchange. The performance evaluation shows that the proposed approach reduces the total energy cost, each consumer's individual cost, as well as the peak to average ratio.
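
The peak-to-average ratio mentioned in the evaluation is commonly computed as the maximum load in a period divided by the mean load over that period. The sketch below shows that calculation under this standard definition. The two hourly load profiles are made-up numbers for illustration, not data from the paper.

```python
def peak_to_average_ratio(load_profile):
    """Return max(load) / mean(load) for a sequence of per-slot loads."""
    if not load_profile:
        raise ValueError("load profile must not be empty")
    average = sum(load_profile) / len(load_profile)
    if average == 0:
        raise ValueError("average load is zero; ratio is undefined")
    return max(load_profile) / average


# Hypothetical loads (kW) with the same total energy in both cases.
unscheduled = [1.0, 1.0, 6.0, 6.0, 1.0, 1.0]
scheduled = [2.0, 3.0, 3.0, 3.0, 3.0, 2.0]

par_unscheduled = peak_to_average_ratio(unscheduled)  # 2.25
par_scheduled = peak_to_average_ratio(scheduled)      # 1.125
```

Both profiles consume the same total energy, but shifting demand away from the peak slots halves the ratio. A demand scheduling approach aims for this kind of flattening.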

2019-06-10

Tran, T. K., Sato, H., Kubo, M.. 2018. One-Shot Learning Approach for Unknown Malware Classification. 2018 5th Asian Conference on Defense Technology (ACDT). :8–13.

Early detection of new kinds of malware always plays an important role in defending network systems. In particular, if intelligent protection systems could themselves detect the existence of new malware types in their system, even with a very small number of malware samples, it would be a huge benefit for the organization as well as for society, since it helps prevent the spread of that kind of malware. To deal with learning from few samples, the term "one-shot learning" or "few-shot learning" was introduced, and it is mostly used in computer vision to recognize images, handwriting, etc. The approach introduced in this paper takes advantage of one-shot learning algorithms to solve the malware classification problem by using a Memory Augmented Neural Network in combination with the malware's API call sequences, which are a very valuable source of information for identifying malware behavior. In addition, it also uses advances from the Natural Language Processing field, such as word2vec, to convert those API sequences to numeric vectors before feeding them to the one-shot learning network. The results confirm very good accuracies compared to the other traditional methods.

Kim, C. H., Kabanga, E. K., Kang, S.. 2018. Classifying Malware Using Convolutional Gated Neural Network. 2018 20th International Conference on Advanced Communication Technology (ICACT). :40–44.

Malware, or malicious software, is an important threat to the information technology society. Deep neural networks have recently achieved great performance on the tasks of malware detection and classification. In this paper, we propose a convolutional gated recurrent neural network model that is capable of classifying malware into their respective families. The model is applied to a set of malware divided into 9 different families that was proposed during the Microsoft Malware Classification Challenge in 2015. The model shows an accuracy of 92.6% on the available dataset.

Xue, S., Zhang, L., Li, A., Li, X., Ruan, C., Huang, W.. 2018. AppDNA: App Behavior Profiling via Graph-Based Deep Learning. IEEE INFOCOM 2018 - IEEE Conference on Computer Communications. :1475–1483.

Better understanding of mobile applications' behaviors would lead to better malware detection/classification and better app recommendation for users. In this work, we design a framework, AppDNA, to automatically generate a compact representation for each app to comprehensively profile its behaviors. The behavior difference between two apps can be measured by the distance between their representations. As a result, the versatile representation can be generated once for each app, and then be used for a wide variety of objectives, including malware detection, app categorizing, plagiarism detection, etc. Based on a systematic and deep understanding of an app's behavior, we propose to perform function-call-graph-based app profiling. We carefully design a graph-encoding method to convert a typically extremely large call graph to a 64-dimension fixed-size vector to achieve robust app profiling. Our extensive evaluations based on 86,332 benign and malicious apps demonstrate that our system performs app profiling (thus malware detection, classification, and app recommendation) to a high accuracy with extremely low computation cost: it classifies 4024 (benign/malware) apps in around 5.06 seconds with an accuracy of about 93.07%; it classifies the families of 570 malware samples (21 families in total) in around 0.83 seconds with an accuracy of 82.3%; it classifies 9,730 apps' functionality with an accuracy of 33.3% for a total of 7 categories and an accuracy of 88.1% for 2 categories.