Biblio

Found 2356 results

Filters: Keyword is privacy  [Clear All Filters]
2020-04-17
Huang, Hua, Zhang, Yi-lai, Zhang, Min.  2019.  Research on Cloud Workflow Engine Supporting Three-Level Isolation and Privacy Protection. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :160—165.

With the development of cloud computing, cloud workflow systems are widely accepted by more and more enterprises and individuals (namely tenants). There exists mass tenant workflow instances running in cloud workflow systems. How to implement the three-level (i.e., data, performance, execution ) isolation and privacy protection among these tenant workflow instances is challenging. To address this issue, this paper presents a novel cloud workflow model supporting multi-tenants with privacy protection. With the presented model, a framework of cloud workflow engine based on the extended jBPM4 is proposed by adopting layered management thought, virtualization technology and sandbox mechanism. By extending the jBPM4 (java Business Process Management) engine, the prototype system of the proposed cloud workflow engine is implemented and applied in the ceramic cloud service platform (denoted as CCSP). The application effect demonstrates that our proposal can be used to implement the three-level isolation and privacy protection between mass various tenant workflow instances in cloud workflow systems.

2020-03-02
Dauterman, Emma, Corrigan-Gibbs, Henry, Mazières, David, Boneh, Dan, Rizzo, Dominic.  2019.  True2F: Backdoor-Resistant Authentication Tokens. 2019 IEEE Symposium on Security and Privacy (SP). :398–416.
We present True2F, a system for second-factor authentication that provides the benefits of conventional authentication tokens in the face of phishing and software compromise, while also providing strong protection against token faults and backdoors. To do so, we develop new lightweight two-party protocols for generating cryptographic keys and ECDSA signatures, and we implement new privacy defenses to prevent cross-origin token-fingerprinting attacks. To facilitate real-world deployment, our system is backwards-compatible with today's U2F-enabled web services and runs on commodity hardware tokens after a firmware modification. A True2F-protected authentication takes just 57ms to complete on the token, compared with 23ms for unprotected U2F.
2020-02-10
Simos, Dimitris E., Zivanovic, Jovan, Leithner, Manuel.  2019.  Automated Combinatorial Testing for Detecting SQL Vulnerabilities in Web Applications. 2019 IEEE/ACM 14th International Workshop on Automation of Software Test (AST). :55–61.

In this paper, we present a combinatorial testing methodology for testing web applications in regards to SQL injection vulnerabilities. We describe three attack grammars that were developed and used to generate concrete attack vectors. Furthermore, we present and evaluate two different oracles used to observe the application's behavior when subjected to such attack vectors. We also present a prototype tool called SQLInjector capable of automated SQL injection vulnerability testing for web applications. The developed methodology can be applied to any web application that uses server side scripting and HTML for handling user input and has a SQL database backend. Our approach relies on the use of a database proxy, making this a gray-box testing method. We establish the effectiveness of the proposed tool with the WAVSEP verification framework and conduct a case study on real-world web applications, where we are able to discover both known vulnerabilities and additional previously undiscovered flaws.

2020-11-04
Ajjimaporn, P., Gibbons, M., Stoick, B., Straub, J..  2019.  Automated Student Assessment for Cybersecurity Courses. 2019 14th Annual Conference System of Systems Engineering (SoSE). :93—95.

The need for cybersecurity knowledge and skills is constantly growing as our lives become more integrated with the digital world. In order to meet this demand, educational institutions must continue to innovate within the field of cybersecurity education and make this educational process as effective and efficient as possible. We seek to accomplish this goal by taking an existing cybersecurity educational technology and adding automated grading and assessment functionality to it. This will reduce costs and maximize scalability by reducing or even eliminating the need for human graders.

2020-02-18
Nasr, Milad, Shokri, Reza, Houmansadr, Amir.  2019.  Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-Box Inference Attacks against Centralized and Federated Learning. 2019 IEEE Symposium on Security and Privacy (SP). :739–753.

Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.

2020-12-02
Narang, S., Byali, M., Dayama, P., Pandit, V., Narahari, Y..  2019.  Design of Trusted B2B Market Platforms using Permissioned Blockchains and Game Theory. 2019 IEEE International Conference on Blockchain and Cryptocurrency (ICBC). :385—393.

Trusted collaboration satisfying the requirements of (a) adequate transparency and (b) preservation of privacy of business sensitive information is a key factor to ensure the success and adoption of online business-to-business (B2B) collaboration platforms. Our work proposes novel ways of stringing together game theoretic modeling, blockchain technology, and cryptographic techniques to build such a platform for B2B collaboration involving enterprise buyers and sellers who may be strategic. The B2B platform builds upon three ideas. The first is to use a permissioned blockchain with smart contracts as the technical infrastructure for building the platform. Second, the above smart contracts implement deep business logic which is derived using a rigorous analysis of a repeated game model of the strategic interactions between buyers and sellers to devise strategies to induce honest behavior from buyers and sellers. Third, we present a formal framework that captures the essential requirements for secure and private B2B collaboration, and, in this direction, we develop cryptographic regulation protocols that, in conjunction with the blockchain, help implement such a framework. We believe our work is an important first step in the direction of building a platform that enables B2B collaboration among strategic and competitive agents while maximizing social welfare and addressing the privacy concerns of the agents.

2020-07-09
Kassem, Ali, Ács, Gergely, Castelluccia, Claude, Palamidessi, Catuscia.  2019.  Differential Inference Testing: A Practical Approach to Evaluate Sanitizations of Datasets. 2019 IEEE Security and Privacy Workshops (SPW). :72—79.

In order to protect individuals' privacy, data have to be "well-sanitized" before sharing them, i.e. one has to remove any personal information before sharing data. However, it is not always clear when data shall be deemed well-sanitized. In this paper, we argue that the evaluation of sanitized data should be based on whether the data allows the inference of sensitive information that is specific to an individual, instead of being centered around the concept of re-identification. We propose a framework to evaluate the effectiveness of different sanitization techniques on a given dataset by measuring how much an individual's record from the sanitized dataset influences the inference of his/her own sensitive attribute. Our intent is not to accurately predict any sensitive attribute but rather to measure the impact of a single record on the inference of sensitive information. We demonstrate our approach by sanitizing two real datasets in different privacy models and evaluate/compare each sanitized dataset in our framework.

2020-12-11
Fan, M., Luo, X., Liu, J., Wang, M., Nong, C., Zheng, Q., Liu, T..  2019.  Graph Embedding Based Familial Analysis of Android Malware using Unsupervised Learning. 2019 IEEE/ACM 41st International Conference on Software Engineering (ICSE). :771—782.

The rapid growth of Android malware has posed severe security threats to smartphone users. On the basis of the familial trait of Android malware observed by previous work, the familial analysis is a promising way to help analysts better focus on the commonalities of malware samples within the same families, thus reducing the analytical workload and accelerating malware analysis. The majority of existing approaches rely on supervised learning and face three main challenges, i.e., low accuracy, low efficiency, and the lack of labeled dataset. To address these challenges, we first construct a fine-grained behavior model by abstracting the program semantics into a set of subgraphs. Then, we propose SRA, a novel feature that depicts the similarity relationships between the Structural Roles of sensitive API call nodes in subgraphs. An SRA is obtained based on graph embedding techniques and represented as a vector, thus we can effectively reduce the high complexity of graph matching. After that, instead of training a classifier with labeled samples, we construct malware link network based on SRAs and apply community detection algorithms on it to group the unlabeled samples into groups. We implement these ideas in a system called GefDroid that performs Graph embedding based familial analysis of AnDroid malware using unsupervised learning. Moreover, we conduct extensive experiments to evaluate GefDroid on three datasets with ground truth. The results show that GefDroid can achieve high agreements (0.707-0.883 in term of NMI) between the clustering results and the ground truth. Furthermore, GefDroid requires only linear run-time overhead and takes around 8.6s to analyze a sample on average, which is considerably faster than the previous work.

Ghose, N., Lazos, L., Rozenblit, J., Breiger, R..  2019.  Multimodal Graph Analysis of Cyber Attacks. 2019 Spring Simulation Conference (SpringSim). :1—12.

The limited information on the cyberattacks available in the unclassified regime, hardens standardizing the analysis. We address the problem of modeling and analyzing cyberattacks using a multimodal graph approach. We formulate the stages, actors, and outcomes of cyberattacks as a multimodal graph. Multimodal graph nodes include cyberattack victims, adversaries, autonomous systems, and the observed cyber events. In multimodal graphs, single-modality graphs are interconnected according to their interaction. We apply community and centrality analysis on the graph to obtain in-depth insights into the attack. In community analysis, we cluster those nodes that exhibit “strong” inter-modal ties. We further use centrality to rank the nodes according to their importance. Classifying nodes according to centrality provides the progression of the attack from the attacker to the targeted nodes. We apply our methods to two popular case studies, namely GhostNet and Putter Panda and demonstrate a clear distinction in the attack stages.

2020-01-28
Pajola, Luca, Pasa, Luca, Conti, Mauro.  2019.  Threat Is in the Air: Machine Learning for Wireless Network Applications. Proceedings of the ACM Workshop on Wireless Security and Machine Learning. :16–21.

With the spread of wireless application, huge amount of data is generated every day. Thanks to its elasticity, machine learning is becoming a fundamental brick in this field, and many of applications are developed with the use of it and the several techniques that it offers. However, machine learning suffers on different problems and people that use it often are not aware of the possible threats. Often, an adversary tries to exploit these vulnerabilities in order to obtain benefits; because of this, adversarial machine learning is becoming wide studied in the scientific community. In this paper, we show state-of-the-art adversarial techniques and possible countermeasures, with the aim of warning people regarding sensible argument related to the machine learning.

2020-04-24
Smychkova, Anna, Zhukov, Dmitry.  2019.  Complex of Description Models for Analysis and Control Group Behavior Based on Stochastic Cellular Automata with Memory and Systems of Differential Kinetic Equations. 2019 1st International Conference on Control Systems, Mathematical Modelling, Automation and Energy Efficiency (SUMMA). :218—223.

This paper considers the complex of models for the description, analysis, and modeling of group behavior by user actions in complex social systems. In particular, electoral processes can be considered in which preferences are selected from several possible ones. For example, for two candidates, the choice is made from three states: for the candidate A, for candidate B and undecided (candidate C). Thus, any of the voters can be in one of the three states, and the interaction between them leads to the transition between the states with some delay time intervals, which are one of the parameters of the proposed models. The dynamics of changes in the preferences of voters can be described graphically on diagram of possible transitions between states, on the basis of which is possible to write a system of differential kinetic equations that describes the process. The analysis of the obtained solutions shows the possibility of existence within the model, different modes of changing the preferences of voters. In the developed model of stochastic cellular automata with variable memory at each step of the interaction process between its cells, a new network of random links is established, the minimum and the maximum number of which is selected from a given range. At the initial time, a cell of each type is assigned a numeric parameter that specifies the number of steps during which will retain its type (cell memory). The transition of cells between states is determined by the total number of cells of different types with which there was interaction at the given number of memory steps. After the number of steps equal to the depth of memory, transition to the type that had the maximum value of its sum occurs. The effect of external factors (such as media) on changes in node types can set for each step using a transition probability matrix. Processing of the electoral campaign's sociological data of 2015-2016 at the choice of the President of the United States using the method of almost-periodic functions allowed to estimate the parameters of a set of models and use them to describe, analyze and model the group behavior of voters. The studies show a good correspondence between the data observed in sociology and calculations using a set of developed models. Under some sets of values of the coefficients in the differential equations and models of cellular automata are observed the oscillating and almost-periodic character of changes in the preferences of the electorate, which largely coincides with the real observations.

2020-12-07
Silva, J. L. da, Assis, M. M., Braga, A., Moraes, R..  2019.  Deploying Privacy as a Service within a Cloud-Based Framework. 2019 9th Latin-American Symposium on Dependable Computing (LADC). :1–4.
Continuous monitoring and risk assessment of privacy violations on cloud systems are needed by anyone who has business needs subject to privacy regulations. Compliance to such regulations in dynamic systems demands appropriate techniques, tools and instruments. As a Service concepts can be a good option to support this task. Previous work presented PRIVAaaS, a software toolkit that allows controlling and reducing data leakages, thus preserving privacy, by providing anonymization capabilities to query-based systems. This short paper discusses the implementation details and deployment environment of an evolution of PRIVAaaS as a MAPE-K control loop within the ATMOSPHERE Platform. ATMOSPHERE is both a framework and a platform enabling the implementation of trustworthy cloud services. By enabling PRIVAaaS within ATMOSPHERE, privacy is made one of several trustworthiness properties continuously monitored and assessed by the platform with a software-based, feedback control loop known as MAPE-K.
2020-02-10
Hasan, Musaab, Balbahaith, Zayed, Tarique, Mohammed.  2019.  Detection of SQL Injection Attacks: A Machine Learning Approach. 2019 International Conference on Electrical and Computing Technologies and Applications (ICECTA). :1–6.
With the rapid growth in online services, hacking (alternatively attacking) on online database applications has become a grave concern now. Attacks on online database application are being frequently reported. Among these attacks, the SQL injection attack is at the top of the list. The hackers alter the SQL query sent by the user and inject malicious code therein. Hence, they access the database and manipulate the data. It is reported in the literature that the traditional SQL injection detection algorithms fail to prevent this type of attack. In this paper, we propose a machine learning based heuristic algorithm to prevent the SQL injection attack. We use a dataset of 616 SQL statements to train and test 23 different machine learning classifiers. Among these classifiers, we select the best five classifiers based on their detection accuracy and develop a Graphical User Interface (GUI) application based on these five classifiers. We test our proposed algorithm and the results show that our algorithm is able to detect the SQL injection attack with a high accuracy (93.8%).
2020-08-07
Ramezanian, Sara, Niemi, Valtteri.  2019.  Privacy Preserving Cyberbullying Prevention with AI Methods in 5G Networks. 2019 25th Conference of Open Innovations Association (FRUCT). :265—271.
Children and teenagers that have been a victim of bullying can possibly suffer its psychological effects for a lifetime. With the increase of online social media, cyberbullying incidents have been increased as well. In this paper we discuss how we can detect cyberbullying with AI techniques, using term frequency-inverse document frequency. We label messages as benign or bully. We want our method of cyberbullying detection to be privacy-preserving, such that the subscribers' benign messages should not be revealed to the operator. Moreover, the operator labels subscribers as normal, bully and victim. The operator utilizes policy control in 5G networks, to protect victims of cyberbullying from harmful traffic.
2020-08-10
Quijano, Andrew, Akkaya, Kemal.  2019.  Server-Side Fingerprint-Based Indoor Localization Using Encrypted Sorting. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW). :53–57.
GPS signals, the main origin of navigation, are not functional in indoor environments. Therefore, Wi-Fi access points have started to be increasingly used for localization and tracking inside the buildings by relying on fingerprint-based approach. However, with these types of approaches, several concerns regarding the privacy of the users have arisen. Malicious individuals can determine a clients daily habits and activities by simply analyzing their wireless signals. While there are already efforts to incorporate privacy to the existing fingerprint-based approaches, they are limited to the characteristics of the homo-morphic cryptographic schemes they employed. In this paper, we propose to enhance the performance of these approaches by exploiting another homomorphic algorithm, namely DGK, with its unique encrypted sorting capability and thus pushing most of the computations to the server side. We developed an Android app and tested our system within a Columbia University dormitory. Compared to existing systems, the results indicated that more power savings can be achieved at the client side and DGK can be a viable option with more powerful server computation capabilities.
2020-10-26
Li, Huhua, Zhan, Dongyang, Liu, Tianrui, Ye, Lin.  2019.  Using Deep-Learning-Based Memory Analysis for Malware Detection in Cloud. 2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems Workshops (MASSW). :1–6.
Malware is one of the biggest threats in cloud computing. Malware running inside virtual machines or containers could steal critical information or continue to attack other cloud nodes. To detect malware in cloud, especially zero-day malware, signature-and machine-learning-based approaches are proposed to analyze the execution binary. However, malicious binary files may not permanently be stored in the file system of virtual machine or container, periodically scanner may not find the target files. Dynamic analysis approach usually introduce run-time overhead to virtual machines, which is not widely used in cloud. To solve these problems, we propose a memory analysis approach to detect malware, employing the deep learning technology. The system analyzes the memory image periodically during malware execution, which will not introduce run-time overhead. We first extract the memory snapshot from running virtual machines or containers. Then, the snapshot is converted to a grayscale image. Finally, we employ CNN to detect malware. In the learning phase, malicious and benign software are trained. In the testing phase, we test our system with real-world malwares.
2020-12-11
Slawinski, M., Wortman, A..  2019.  Applications of Graph Integration to Function Comparison and Malware Classification. 2019 4th International Conference on System Reliability and Safety (ICSRS). :16—24.

We classify .NET files as either benign or malicious by examining directed graphs derived from the set of functions comprising the given file. Each graph is viewed probabilistically as a Markov chain where each node represents a code block of the corresponding function, and by computing the PageRank vector (Perron vector with transport), a probability measure can be defined over the nodes of the given graph. Each graph is vectorized by computing Lebesgue antiderivatives of hand-engineered functions defined on the vertex set of the given graph against the PageRank measure. Files are subsequently vectorized by aggregating the set of vectors corresponding to the set of graphs resulting from decompiling the given file. The result is a fast, intuitive, and easy-to-compute glass-box vectorization scheme, which can be leveraged for training a standalone classifier or to augment an existing feature space. We refer to this vectorization technique as PageRank Measure Integration Vectorization (PMIV). We demonstrate the efficacy of PMIV by training a vanilla random forest on 2.5 million samples of decompiled. NET, evenly split between benign and malicious, from our in-house corpus and compare this model to a baseline model which leverages a text-only feature space. The median time needed for decompilation and scoring was 24ms. 11Code available at https://github.com/gtownrocks/grafuple.

2020-11-04
Stange, M., Tang, C., Tucker, C., Servine, C., Geissler, M..  2019.  Cybersecurity Associate Degree Program Curriculum. 2019 IEEE International Symposium on Technologies for Homeland Security (HST). :1—5.

The spotlight is on cybersecurity education programs to develop a qualified cybersecurity workforce to meet the demand of the professional field. The ACM CCECC (Committee for Computing Education in Community Colleges) is leading the creation of a set of guidelines for associate degree cybersecurity programs called Cyber2yr, formerly known as CSEC2Y. A task force of community college educators have created a student competency focused curriculum that will serve as a global cybersecurity guide for applied (AAS) and transfer (AS) degree programs to develop a knowledgeable and capable associate level cybersecurity workforce. Based on the importance of the Cyber2yr work; ABET a nonprofit, non-governmental agency that accredits computing programs has created accreditation criteria for two-year cybersecurity programs.

2020-07-09
Ashouri, Mohammadreza.  2019.  Detecting Input Sanitization Errors in Scala. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :313—319.

Scala programming language combines object-oriented and functional programming in one concise, high-level language, and the language supports static types that help to avoid bugs in complex programs. This paper proposes a dynamic taint analyzer called ScalaTaint for Scala applications. The analyzer traces the propagation of malicious inputs from untrusted sources to sensitive sink methods in programs that can be exploited by adversaries. In this work, we evaluated the accuracy of ScalaTaint with a security benchmark suite including 7 projects in Scala. As a result, our analyzer could report 49 vulnerabilities within 753,372 lines of code. Moreover, the result of our performance measurement on ScalaBench shows 67% runtime overhead that demonstrates the usefulness and efficiently of our technique in comparison with similar tools.

2020-11-04
Flores, P..  2019.  Digital Simulation in the Virtual World: Its Effect in the Knowledge and Attitude of Students Towards Cybersecurity. 2019 Sixth HCT Information Technology Trends (ITT). :1—5.

The search for alternative delivery modes to teaching has been one of the pressing concerns of numerous educational institutions. One key innovation to improve teaching and learning is e-learning which has undergone enormous improvements. From its focus on text-based environment, it has evolved into Virtual Learning Environments (VLEs) which provide more stimulating and immersive experiences among learners and educators. An example of VLEs is the virtual world which is an emerging educational platform among universities worldwide. One very interesting topic that can be taught using the virtual world is cybersecurity. Simulating cybersecurity in the virtual world may give a realistic experience to students which can be hardly achieved by classroom teaching. To date, there are quite a number of studies focused on cybersecurity awareness and cybersecurity behavior. But none has focused looking into the effect of digital simulation in the virtual world, as a new educational platform, in the cybersecurity attitude of the students. It is in this regard that this study has been conducted by designing simulation in the virtual world lessons that teaches the five aspects of cybersecurity namely; malware, phishing, social engineering, password usage and online scam, which are the most common cybersecurity issues. The study sought to examine the effect of this digital simulation design in the cybersecurity knowledge and attitude of the students. The result of the study ascertains that students exposed under simulation in the virtual world have a greater positive change in cybersecurity knowledge and attitude than their counterparts.

2020-10-29
Tran, Trung Kien, Sato, Hiroshi, Kubo, Masao.  2019.  Image-Based Unknown Malware Classification with Few-Shot Learning Models. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :401—407.

Knowing malware types in every malware attacks is very helpful to the administrators to have proper defense policies for their system. It must be a massive benefit for the organization as well as the social if the automatic protection systems could themselves detect, classify an existence of new malware types in the whole network system with a few malware samples. This feature helps to prevent the spreading of malware as soon as any damage is caused to the networks. An approach introduced in this paper takes advantage of One-shot/few-shot learning algorithms in solving the malware classification problems by using some well-known models such as Matching Networks, Prototypical Networks. To demonstrate an efficiency of the approach, we run the experiments on the two malware datasets (namely, MalImg and Microsoft Malware Classification Challenge), and both experiments all give us very high accuracies. We confirm that if applying models correctly from the machine learning area could bring excellent performance compared to the other traditional methods, open a new area of malware research.

2020-07-09
Feyisetan, Oluwaseyi, Diethe, Tom, Drake, Thomas.  2019.  Leveraging Hierarchical Representations for Preserving Privacy and Utility in Text. 2019 IEEE International Conference on Data Mining (ICDM). :210—219.

Guaranteeing a certain level of user privacy in an arbitrary piece of text is a challenging issue. However, with this challenge comes the potential of unlocking access to vast data stores for training machine learning models and supporting data driven decisions. We address this problem through the lens of dx-privacy, a generalization of Differential Privacy to non Hamming distance metrics. In this work, we explore word representations in Hyperbolic space as a means of preserving privacy in text. We provide a proof satisfying dx-privacy, then we define a probability distribution in Hyperbolic space and describe a way to sample from it in high dimensions. Privacy is provided by perturbing vector representations of words in high dimensional Hyperbolic space to obtain a semantic generalization. We conduct a series of experiments to demonstrate the tradeoff between privacy and utility. Our privacy experiments illustrate protections against an authorship attribution algorithm while our utility experiments highlight the minimal impact of our perturbations on several downstream machine learning models. Compared to the Euclidean baseline, we observe \textbackslashtextgreater 20x greater guarantees on expected privacy against comparable worst case statistics.

2020-02-10
Zhang, Kevin.  2019.  A Machine Learning Based Approach to Identify SQL Injection Vulnerabilities. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :1286–1288.

This paper presents a machine learning classifier designed to identify SQL injection vulnerabilities in PHP code. Both classical and deep learning based machine learning algorithms were used to train and evaluate classifier models using input validation and sanitization features extracted from source code files. On ten-fold cross validations a model trained using Convolutional Neural Network(CNN) achieved the highest precision (95.4%), while a model based on Multilayer Perceptron(MLP) achieved the highest recall (63.7%) and the highest f-measure (0.746).

2020-12-11
Wu, Y., Li, X., Zou, D., Yang, W., Zhang, X., Jin, H..  2019.  MalScan: Fast Market-Wide Mobile Malware Scanning by Social-Network Centrality Analysis. 2019 34th IEEE/ACM International Conference on Automated Software Engineering (ASE). :139—150.

Malware scanning of an app market is expected to be scalable and effective. However, existing approaches use either syntax-based features which can be evaded by transformation attacks or semantic-based features which are usually extracted by performing expensive program analysis. Therefor, in this paper, we propose a lightweight graph-based approach to perform Android malware detection. Instead of traditional heavyweight static analysis, we treat function call graphs of apps as social networks and perform social-network-based centrality analysis to represent the semantic features of the graphs. Our key insight is that centrality provides a succinct and fault-tolerant representation of graph semantics, especially for graphs with certain amount of inaccurate information (e.g., inaccurate call graphs). We implement a prototype system, MalScan, and evaluate it on datasets of 15,285 benign samples and 15,430 malicious samples. Experimental results show that MalScan is capable of detecting Android malware with up to 98% accuracy under one second which is more than 100 times faster than two state-of-the-art approaches, namely MaMaDroid and Drebin. We also demonstrate the feasibility of MalScan on market-wide malware scanning by performing a statistical study on over 3 million apps. Finally, in a corpus of dataset collected from Google-Play app market, MalScan is able to identify 18 zero-day malware including malware samples that can evade detection of existing tools.

2020-08-07
Chen, Huili, Cammarota, Rosario, Valencia, Felipe, Regazzoni, Francesco.  2019.  PlaidML-HE: Acceleration of Deep Learning Kernels to Compute on Encrypted Data. 2019 IEEE 37th International Conference on Computer Design (ICCD). :333—336.

Machine Learning as a Service (MLaaS) is becoming a popular practice where Service Consumers, e.g., end-users, send their data to a ML Service and receive the prediction outputs. However, the emerging usage of MLaaS has raised severe privacy concerns about users' proprietary data. PrivacyPreserving Machine Learning (PPML) techniques aim to incorporate cryptographic primitives such as Homomorphic Encryption (HE) and Multi-Party Computation (MPC) into ML services to address privacy concerns from a technology standpoint. Existing PPML solutions have not been widely adopted in practice due to their assumed high overhead and integration difficulty within various ML front-end frameworks as well as hardware backends. In this work, we propose PlaidML-HE, the first end-toend HE compiler for PPML inference. Leveraging the capability of Domain-Specific Languages, PlaidML-HE enables automated generation of HE kernels across diverse types of devices. We evaluate the performance of PlaidML-HE on different ML kernels and demonstrate that PlaidML-HE greatly reduces the overhead of the HE primitive compared to the existing implementations.