Biblio

Found 273 results

Filters: Keyword is Predictive models
2021-02-03
Rabby, M. K. Monir, Khan, M. Altaf, Karimoddini, A., Jiang, S. X..  2020.  Modeling of Trust Within a Human-Robot Collaboration Framework. 2020 IEEE International Conference on Systems, Man, and Cybernetics (SMC). :4267—4272.

In this paper, a time-driven, performance-aware mathematical model of trust in the robot is proposed for a Human-Robot Collaboration (HRC) framework. The proposed trust model is based on the performances of both the human operator and the robot. The human operator's performance is modeled based on both physical and cognitive performance, while the robot performance is modeled over its unpredictable, predictable, dependable, and faithful operation regions. The model is validated via different simulation scenarios. The simulation results show that trust in the robot in the HRC framework is governed by robot performance and the human operator's performance, and can be improved by enhancing the robot performance.

2021-02-01
Hou, M..  2020.  IMPACT: A Trust Model for Human-Agent Teaming. 2020 IEEE International Conference on Human-Machine Systems (ICHMS). :1–4.
A trust model, IMPACT: Intention, Measurability, Predictability, Agility, Communication, and Transparency, has been conceptualized to build human trust in autonomous agents. The six critical characteristics must be exhibited by the agents in order to gain and maintain the trust of their human partners towards an effective and collaborative team in achieving common goals. The IMPACT model guided the design of an intelligent adaptive decision aid for dynamic target engagement processes in a military context. Positive feedback from subject matter experts who participated in a large-scale joint exercise controlling multiple unmanned vehicles indicated the effectiveness of the decision aid. It also demonstrated the utility of the IMPACT model as a set of design principles for building trusted human-agent teams.
2021-01-28
Kariyappa, S., Qureshi, M. K..  2020.  Defending Against Model Stealing Attacks With Adaptive Misinformation. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). :767—775.

Deep Neural Networks (DNNs) are susceptible to model stealing attacks, which allow a data-limited adversary with no knowledge of the training dataset to clone the functionality of a target model, just by using black-box query access. Such attacks are typically carried out by querying the target model using inputs that are synthetically generated or sampled from a surrogate dataset to construct a labeled dataset. The adversary can use this labeled dataset to train a clone model, which achieves a classification accuracy comparable to that of the target model. We propose "Adaptive Misinformation" to defend against such model stealing attacks. We identify that all existing model stealing attacks invariably query the target model with Out-Of-Distribution (OOD) inputs. By selectively sending incorrect predictions for OOD queries, our defense substantially degrades the accuracy of the attacker's clone model (by up to 40%), while minimally impacting the accuracy (< 0.5%) for benign users. Compared to existing defenses, our defense has a significantly better security vs accuracy trade-off and incurs minimal computational overhead.
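
A minimal sketch of the selective-misinformation idea, assuming a max-softmax score as the OOD detector and a fixed threshold tau; both are illustrative stand-ins, not the paper's actual detector or misinformation generator:

```python
# Hedged sketch: answer in-distribution queries honestly, return a deliberately
# perturbed (incorrect) probability vector for apparent OOD queries.
import numpy as np

def defend(predict_proba, x, tau=0.6, rng=np.random.default_rng(0)):
    """Return either the true prediction or misinformation for query x."""
    p = predict_proba(x)              # softmax output of the protected model
    if p.max() >= tau:                # confident -> treat as in-distribution
        return p
    # OOD query: move probability mass away from the true top class
    wrong = np.full_like(p, 1.0 / (len(p) - 1))
    wrong[p.argmax()] = 0.0
    perturbed = wrong + rng.dirichlet(np.ones(len(p))) * 0.1
    return perturbed / perturbed.sum()
```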

2021-01-25
Chen, J., Lin, X., Shi, Z., Liu, Y..  2020.  Link Prediction Adversarial Attack Via Iterative Gradient Attack. IEEE Transactions on Computational Social Systems. 7:1081–1094.
Deep neural networks are increasingly applied to graph-related tasks, such as node classification and link prediction. However, the vulnerability of deep models can be revealed using carefully crafted adversarial examples generated by various adversarial attack methods. To explore this security problem, we define the link prediction adversarial attack problem and put forward a novel iterative gradient attack (IGA) strategy using the gradient information in the trained graph autoencoder (GAE) model. Not surprisingly, GAE can be fooled by an adversarial graph with only a few links perturbed on the clean one. The results of comprehensive experiments on different real-world graphs indicate that most deep models, and even state-of-the-art link prediction algorithms such as GAE, cannot escape the adversarial attack. The attack can also benefit users as an efficient privacy-protection tool against unwanted link prediction. On the other hand, the adversarial attack serves as a robustness evaluation metric for the defensibility of current link prediction algorithms.
2021-01-22
Mani, G., Pasumarti, V., Bhargava, B., Vora, F. T., MacDonald, J., King, J., Kobes, J..  2020.  DeCrypto Pro: Deep Learning Based Cryptomining Malware Detection Using Performance Counters. 2020 IEEE International Conference on Autonomic Computing and Self-Organizing Systems (ACSOS). :109—118.
Autonomy in cybersystems depends on their ability to be self-aware by understanding the intent of services and applications that are running on those systems. In the case of mission-critical cybersystems that are deployed in dynamic and unpredictable environments, the newly integrated unknown applications or services can either be benign and essential for the mission or they can be cyberattacks. In some cases, these cyberattacks are evasive Advanced Persistent Threats (APTs) where the attackers remain undetected for reconnaissance in order to ascertain system features for an attack, e.g., Trojan Laziok. In other cases, the attackers can use the system only for computing, e.g., cryptomining malware. APTs such as cryptomining malware neither disrupt normal system functionalities nor trigger any warning signs because they simply perform bitwise and cryptographic operations as any other benign compression or encoding application. Thus, it is difficult for defense mechanisms such as antivirus applications to detect these attacks. In this paper, we propose an Operating Context profiling system based on deep neural networks, specifically Long Short-Term Memory (LSTM) networks, using Windows Performance Counters data for detecting these evasive cryptomining applications. In addition, we propose Deep Cryptomining Profiler (DeCrypto Pro), a detection system with a novel model selection framework containing a utility function that can select a classification model for behavior profiling from both lightweight machine learning models (Random Forest and k-Nearest Neighbors) and a deep learning model (LSTM), depending on available computing resources. Given data from performance counters, we show that individual models perform with high accuracy and can be trained with limited training data. We also show that the DeCrypto Profiler framework reduces the use of computational resources and accurately detects cryptomining applications by selecting an appropriate model, given the constraints such as data sample size and system configuration.
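
A hypothetical sketch of the resource-aware model-selection idea described above: pick a lightweight classifier when compute or data are scarce and the LSTM profiler otherwise. The thresholds and the shape of the rule are illustrative assumptions, not the paper's utility function.

```python
# Choose among Random Forest, kNN, and LSTM based on available resources.
def select_model(free_mem_mb, cpu_load, n_samples):
    if free_mem_mb < 512 or cpu_load > 0.8:
        return "knn"            # cheapest to run under tight resources
    if n_samples < 5_000:
        return "random_forest"  # robust with limited training data
    return "lstm"               # full behavior-profiling model

print(select_model(free_mem_mb=256, cpu_load=0.3, n_samples=20_000))  # -> knn
```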
2021-01-11
Wang, J., Wang, A..  2020.  An Improved Collaborative Filtering Recommendation Algorithm Based on Differential Privacy. 2020 IEEE 11th International Conference on Software Engineering and Service Science (ICSESS). :310–315.
In this paper, a differential privacy protection method is applied to the matrix factorization approach used to solve the recommendation problem. For centralized recommendation scenarios, a collaborative filtering recommendation model based on matrix factorization is established, and a matrix factorization mechanism satisfying ε-differential privacy is proposed. Firstly, the latent feature matrices of users and items are constructed. Secondly, noise is added to the matrix by the method of target disturbance, which satisfies the differential privacy constraint, and the noisy matrix factorization model is obtained. The parameters of the model are obtained by the stochastic gradient descent algorithm. Finally, the differentially private matrix factorization model is used for score prediction. The effectiveness of the algorithm is evaluated on public datasets including Movielens and Netflix. The experimental results show that, compared with existing typical recommendation methods, the new matrix factorization method with privacy protection can recommend within a certain range of recommendation accuracy loss while protecting users' privacy information.
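
A simplified, hedged sketch of a differentially private matrix factorization pipeline: here Laplace noise (scale = sensitivity/ε) is added to the observed ratings before standard SGD. The paper's "target disturbance" perturbs the objective rather than the inputs, so treat this as an illustration of the overall flow, not the exact mechanism.

```python
import numpy as np

def dp_matrix_factorization(R, mask, k=8, epsilon=1.0, sensitivity=1.0,
                            lr=0.01, reg=0.05, epochs=50, seed=0):
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    # Input perturbation standing in for the paper's objective perturbation.
    R_noisy = R + rng.laplace(0.0, sensitivity / epsilon, size=R.shape)
    U = 0.1 * rng.standard_normal((n_users, k))
    V = 0.1 * rng.standard_normal((n_items, k))
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for u, i in zip(rows, cols):
            err = R_noisy[u, i] - U[u] @ V[i]
            U[u] += lr * (err * V[i] - reg * U[u])
            V[i] += lr * (err * U[u] - reg * V[i])
    return U, V   # predicted score matrix: U @ V.T
```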
2020-12-07
Xia, H., Xiao, F., Zhang, S., Hu, C., Cheng, X..  2019.  Trustworthiness Inference Framework in the Social Internet of Things: A Context-Aware Approach. IEEE INFOCOM 2019 - IEEE Conference on Computer Communications. :838–846.
The concept of social networking is integrated into the Internet of Things (IoT) to socialize smart objects by mimicking human behaviors, leading to a new paradigm, the Social Internet of Things (SIoT). A crucial problem that needs to be solved is how to establish reliable relationships autonomously among objects, i.e., building trust. This paper focuses on exploring an efficient context-aware trustworthiness inference framework to address this issue. Based on the sociological and psychological principles of trust generation between human beings, the proposed framework divides trust into two types: familiarity trust and similarity trust. The familiarity trust can be calculated from direct trust and recommendation trust, while the similarity trust can be calculated from external similarity trust and internal similarity trust. We subsequently present concrete methods for the calculation of the different trust elements. In particular, we design a kernel-based nonlinear multivariate grey prediction model to predict the direct trust of a specific object, which acts as the core module of the entire framework. In addition, considering the fuzziness and uncertainty in the concept of trust, we introduce the fuzzy logic method to synthesize these trust elements. The experimental results verify the validity of the core module and the resistance of this framework to attacks.
2020-12-02
Abeysekara, P., Dong, H., Qin, A. K..  2019.  Machine Learning-Driven Trust Prediction for MEC-Based IoT Services. 2019 IEEE International Conference on Web Services (ICWS). :188—192.

We propose a distributed machine-learning architecture to predict trustworthiness of sensor services in Mobile Edge Computing (MEC) based Internet of Things (IoT) services, which aligns well with the goals of MEC and the requirements of modern IoT systems. The proposed machine-learning architecture models the training of a distributed trust prediction model over a topology of MEC environments as a Network Lasso problem, which allows simultaneous clustering and optimization on large-scale networked graphs. We then attempt to solve it using the Alternating Direction Method of Multipliers (ADMM) in a way that makes it suitable for MEC-based IoT systems. We present analytical and simulation results to show the validity and efficiency of the proposed solution.

2020-11-23
Zhu, L., Dong, H., Shen, M., Gai, K..  2019.  An Incentive Mechanism Using Shapley Value for Blockchain-Based Medical Data Sharing. 2019 IEEE 5th Intl Conference on Big Data Security on Cloud (BigDataSecurity), IEEE Intl Conference on High Performance and Smart Computing, (HPSC) and IEEE Intl Conference on Intelligent Data and Security (IDS). :113–118.
With the development of big data and machine learning techniques, medical data sharing for the use of disease diagnosis has received considerable attention. Blockchain, as an emerging technology, has been widely used to resolve the efficiency and security issues in medical data sharing. However, the existing studies on blockchain-based medical data sharing have rarely considered a reasonable incentive mechanism. In this paper, we propose a cooperation model where medical data is shared via blockchain. We derive the topological relationships among the participants, consisting of data owners, miners and third parties, and gradually develop the computational process of Shapley value revenue distribution. Specifically, we explore the revenue distribution under different consensuses of blockchain. Finally, we demonstrate the incentive effect and rationality of the proposed solution by analyzing the revenue distribution.
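
A minimal exact Shapley-value sketch for splitting the revenue of a data-sharing coalition (data owners, miners, third parties). The characteristic function v() below is a made-up toy example; the paper derives coalition values from the blockchain consensus in use.

```python
from itertools import combinations
from math import factorial

def shapley(players, v):
    """Exact Shapley value of each player under characteristic function v."""
    n = len(players)
    phi = {p: 0.0 for p in players}
    for p in players:
        others = [q for q in players if q != p]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                w = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                phi[p] += w * (v(set(S) | {p}) - v(set(S)))
    return phi

# Toy coalition value: revenue only materializes when owner and miner cooperate.
value = lambda S: 100.0 if {"owner", "miner"} <= S else 0.0
print(shapley(["owner", "miner", "third_party"], value))  # owner/miner split 50/50
```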
2020-11-20
Sarochar, J., Acharya, I., Riggs, H., Sundararajan, A., Wei, L., Olowu, T., Sarwat, A. I..  2019.  Synthesizing Energy Consumption Data Using a Mixture Density Network Integrated with Long Short Term Memory. 2019 IEEE Green Technologies Conference(GreenTech). :1—4.
Smart cities comprise multiple critical infrastructures, two of which are the power grid and communication networks, backed by centralized data analytics and storage. To effectively model the interdependencies between these infrastructures and enable a greater understanding of how communities respond to and impact them, large amounts of varied, real-world data on residential and commercial consumer energy consumption, load patterns, and associated human behavioral impacts are required. The dissemination of such data to the research communities is, however, largely restricted because of security and privacy concerns. This paper creates an opportunity for the development and dissemination of synthetic energy consumption data which is inherently anonymous but holds similarities to the properties of real data. This paper explores a framework using a mixture density network (MDN) model integrated with a multi-layered Long Short-Term Memory (LSTM) network, which shows promise in this area of research. The model is trained using an initial sample recorded from residential smart meters in the state of Florida, and is used to generate fully synthetic energy consumption data. The synthesized data will be made publicly available for interested users.
Chin, J., Zufferey, T., Shyti, E., Hug, G..  2019.  Load Forecasting of Privacy-Aware Consumers. 2019 IEEE Milan PowerTech. :1—6.

The roll-out of smart meters (SMs) in the electric grid has enabled data-driven grid management and planning techniques. SM data can be used together with short-term load forecasts (STLFs) to overcome polling frequency constraints for better grid management. However, the use of SMs that report consumption data at high spatial and temporal resolutions entails consumer privacy risks, motivating work in protecting consumer privacy. The impact of privacy protection schemes on STLF accuracy is not well studied, especially for smaller aggregations of consumers, whose load profiles are subject to more volatility and are, thus, harder to predict. In this paper, we analyse the impact of two user demand shaping privacy protection schemes, model-distribution predictive control (MDPC) and load-levelling, on STLF accuracy. Support vector regression is used to predict the load profiles at different consumer aggregation levels. Results indicate that, while the MDPC algorithm marginally affects forecast accuracy for smaller consumer aggregations, this effect diminishes at higher aggregation levels. More importantly, the load-levelling scheme significantly improves STLF accuracy as it smooths out the grid-visible consumer load profile.
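
A hedged sketch of SVR-based short-term load forecasting on an aggregated smart-meter series, roughly in the spirit of the setup above. The lag depth, kernel, hyperparameters, and synthetic data are illustrative assumptions, not the paper's configuration.

```python
import numpy as np
from sklearn.svm import SVR

def make_lagged(series, n_lags=24):
    """Turn a 1-D load series into (lagged features, next-hour target) pairs."""
    X = np.column_stack([series[i:len(series) - n_lags + i] for i in range(n_lags)])
    return X, series[n_lags:]

rng = np.random.default_rng(0)
t = np.arange(24 * 60)                                  # 60 days of hourly loads
load = 5 + 2 * np.sin(2 * np.pi * t / 24) + 0.3 * rng.standard_normal(t.size)

X, y = make_lagged(load)
split = -24                                             # hold out the last day
model = SVR(kernel="rbf", C=10.0, epsilon=0.05).fit(X[:split], y[:split])
mae = np.mean(np.abs(model.predict(X[split:]) - y[split:]))
print(f"1-day-ahead MAE: {mae:.3f}")
```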

Sun, Y., Wang, J., Lu, Z..  2019.  Asynchronous Parallel Surrogate Optimization Algorithm Based on Ensemble Surrogating Model and Stochastic Response Surface Method. :74—84.
Surrogate model-based optimization remains an important solution to expensive black-box function optimization. The introduction of an ensemble model enables the algorithm to automatically choose a proper model integration mode and adapt to various parameter spaces when dealing with different problems. However, this also significantly increases the computational burden of the algorithm. On the other hand, utilizing parallel computing resources and improving the efficiency of black-box function optimization also require combination with a surrogate optimization algorithm in order to design and realize an efficient parallel parameter space sampling mechanism. This paper makes use of parallel computing technology to speed up the weight-updating computation for the ensemble model based on Dempster-Shafer theory, and combines it with the stochastic response surface method to develop a novel parallel sampling mechanism for asynchronous parameter optimization. Furthermore, it designs and implements a corresponding parallel computing framework and applies the developed algorithm to quantitative trading strategy tuning in financial markets. It is verified that the algorithm is both feasible and effective in actual application. The experiment demonstrates that, while preserving optimization performance, the parallel optimization algorithm achieves an excellent speedup.
2020-11-09
Bouzar-Benlabiod, L., Méziani, L., Rubin, S. H., Belaidi, K., Haddar, N. E..  2019.  Variational Encoder-Decoder Recurrent Neural Network (VED-RNN) for Anomaly Prediction in a Host Environment. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :75–82.
Intrusion detection systems (IDS) are important security tools. NIDS monitor network traffic, while HIDS monitor local, host-level activity. HIDS are often based on anomaly detection. Several studies deal with anomaly detection using system-call traces. In this paper, we propose an anomaly detection and prediction approach. System-call traces, invoked by the running programs, are analyzed in real time. For prediction, we use a sequence-to-sequence model based on a variational encoder-decoder (VED) and variants of Recurrent Neural Networks (RNN); these architectures have shown their performance on natural language processing tasks. To make the analogy, we exploit the semantics behind the invoking order of system calls, which are then seen as sentences. A preprocessing phase is added to optimize the representation of the prediction model's input data. A one-class classification is done to categorize the sequences into normal or abnormal. Tests are performed on the ADFA-LD dataset and show the advantage of prediction for the intrusion detection/prediction task.
Wheelus, C., Bou-Harb, E., Zhu, X..  2018.  Tackling Class Imbalance in Cyber Security Datasets. 2018 IEEE International Conference on Information Reuse and Integration (IRI). :229–232.
It is clear that cyber-attacks are a danger that must be addressed with great resolve, as they threaten the information infrastructure upon which we all depend. Many studies have been published expressing varying levels of success with machine learning approaches to combating cyber-attacks, but many modern studies still focus on training and evaluating with very outdated datasets containing old attacks that are no longer a threat, and that also lack data on new attacks. Recent datasets like UNSW-NB15 and SANTA have been produced to address this problem. Even so, these modern datasets suffer from class imbalance, which reduces the efficacy of predictive models trained using these datasets. Herein we evaluate several pre-processing methods for addressing the class imbalance problem, using several of the most popular machine learning algorithms and a variant of UNSW-NB15 based upon the attributes from the SANTA dataset.
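
A small sketch comparing two common resampling strategies for class imbalance (Random Undersampling and borderline-SMOTE, as evaluated in studies like the one above) using the imbalanced-learn package on a synthetic dataset; the dataset parameters are illustrative, not the UNSW-NB15/SANTA setup.

```python
from collections import Counter
from sklearn.datasets import make_classification
from imblearn.under_sampling import RandomUnderSampler
from imblearn.over_sampling import BorderlineSMOTE

# Synthetic 99:1 imbalanced binary dataset standing in for a network-security dataset.
X, y = make_classification(n_samples=10_000, weights=[0.99, 0.01], random_state=0)
print("original:", Counter(y))

for name, sampler in [("random undersampling", RandomUnderSampler(random_state=0)),
                      ("borderline-SMOTE2", BorderlineSMOTE(kind="borderline-2",
                                                            random_state=0))]:
    X_res, y_res = sampler.fit_resample(X, y)
    print(name, Counter(y_res))
```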
2020-11-04
Liang, Y., He, D., Chen, D..  2019.  Poisoning Attack on Load Forecasting. 2019 IEEE Innovative Smart Grid Technologies - Asia (ISGT Asia). :1230—1235.

Short-term load forecasting systems for power grids have demonstrated high accuracy and have been widely employed for commercial use. However, classic load forecasting systems, which are based on statistical methods, are vulnerable to training data poisoning. In this paper, we demonstrate a data poisoning strategy that effectively corrupts the forecasting model even in the presence of outlier detection. To the best of our knowledge, poisoning attacks on short-term load forecasting with outlier detection have not been studied in previous works. Our method applies to several forecasting models, including the most widely adopted and best-performing ones, such as multiple linear regression (MLR) and neural network (NN) models. Starting with the MLR model, we develop a novel closed-form solution to quickly estimate the new MLR model after a round of data poisoning without retraining. We then employ line search and simulated annealing to find the poisoning attack solution. Furthermore, we use the MLR attacking solution to generate a numerical solution for other models, such as NN. The effectiveness of our algorithm has been tested on the Global Energy Forecasting Competition (GEFCom2012) data set with the presence of outlier detection.
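
A hedged sketch of the "update without retraining" idea: when one poisoned sample (x_p, y_p) is appended to a least-squares forecasting model, the new coefficients follow from a rank-one (Sherman-Morrison) update of (X^T X)^-1 rather than a full refit. This illustrates the closed-form update concept and is not claimed to be the paper's exact derivation.

```python
import numpy as np

def add_poison_point(XtX_inv, Xty, x_p, y_p):
    """Closed-form MLR coefficient update after appending one poisoned sample."""
    x_p = x_p.reshape(-1, 1)
    Ax = XtX_inv @ x_p
    # Sherman-Morrison: (A + x x^T)^-1 = A^-1 - A^-1 x x^T A^-1 / (1 + x^T A^-1 x)
    XtX_inv_new = XtX_inv - (Ax @ Ax.T) / (1.0 + (x_p.T @ Ax).item())
    Xty_new = Xty + x_p.ravel() * y_p
    return XtX_inv_new, Xty_new, XtX_inv_new @ Xty_new

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = X @ np.array([1.0, -2.0, 0.5, 0.0, 3.0]) + 0.1 * rng.standard_normal(200)
XtX_inv, Xty = np.linalg.inv(X.T @ X), X.T @ y
theta = XtX_inv @ Xty
_, _, theta_poisoned = add_poison_point(XtX_inv, Xty, x_p=np.ones(5), y_p=50.0)
print(np.round(theta, 3), np.round(theta_poisoned, 3))
```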

Sultana, K. Z., Williams, B. J., Bosu, A..  2018.  A Comparison of Nano-Patterns vs. Software Metrics in Vulnerability Prediction. 2018 25th Asia-Pacific Software Engineering Conference (APSEC). :355—364.

Context: Software security is an imperative aspect of software quality. Early detection of vulnerable code during development can better ensure the security of the codebase and minimize testing efforts. Although traditional software metrics are used for early detection of vulnerabilities, they do not clearly address the granularity level of the issue to precisely pinpoint vulnerabilities. The goal of this study is to employ method-level traceable patterns (nano-patterns) in vulnerability prediction and empirically compare their performance with traditional software metrics. The concept of nano-patterns is similar to design patterns, but these constructs can be automatically recognized and extracted from source code. If nano-patterns can better predict vulnerable methods compared to software metrics, they can be used in developing vulnerability prediction models with better accuracy. Aims: This study explores the performance of method-level patterns in vulnerability prediction. We also compare them with method-level software metrics. Method: We studied vulnerabilities reported for two major releases of Apache Tomcat (6 and 7), Apache CXF, and two stand-alone Java web applications. We used three machine learning techniques to predict vulnerabilities using nano-patterns as features. We applied the same techniques using method-level software metrics as features and compared their performance with nano-patterns. Results: We found that nano-patterns show lower false negative rates for classifying vulnerable methods (for Tomcat 6, 21% vs 34.7%) and therefore, have higher recall in predicting vulnerable code than the software metrics used. On the other hand, software metrics show higher precision than nano-patterns (79.4% vs 76.6%). Conclusion: In summary, we suggest developers use nano-patterns as features for vulnerability prediction to augment existing approaches as these code constructs outperform standard metrics in terms of prediction recall.

2020-10-29
Lo, Wai Weng, Yang, Xu, Wang, Yapeng.  2019.  An Xception Convolutional Neural Network for Malware Classification with Transfer Learning. 2019 10th IFIP International Conference on New Technologies, Mobility and Security (NTMS). :1—5.

In this work, we applied a deep Convolutional Neural Network (CNN) with the Xception model to perform malware image classification. The Xception model is a recently developed special CNN architecture that is more powerful and has fewer overfitting problems than currently popular CNN models such as VGG16. However, only a few use cases of the Xception model can be found in the literature, and it has never been used to solve the malware classification problem. The performance of our approach was compared with other methods including KNN, SVM, VGG16, etc. The experiments on two datasets (Malimg and Microsoft Malware Dataset) demonstrated that the Xception model achieves higher training accuracy than all other approaches, including the champion approach, and higher validation accuracy than all other image-based malware classification approaches, including the VGG16 model (except the champion solution, as this information was not provided). Additionally, we proposed a novel ensemble model to combine the predictions from .bytes files and .asm files, showing that a lower logloss can be achieved. Although the champion on the Microsoft Malware Dataset achieved a slightly lower logloss, our approach does not require any feature engineering, making it more effective at adapting to any future evolution in malware and far less time-consuming than the champion's solution.
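
A hedged sketch of transfer learning with an Xception backbone for malware-image classification, along the lines described above. The input size, classification head, and the 25-class output (the Malimg family count) are assumptions for illustration, not the authors' exact architecture.

```python
import tensorflow as tf

# ImageNet-pretrained Xception as a frozen feature extractor.
base = tf.keras.applications.Xception(weights="imagenet", include_top=False,
                                      input_shape=(299, 299, 3), pooling="avg")
base.trainable = False

model = tf.keras.Sequential([
    base,
    tf.keras.layers.Dropout(0.3),
    tf.keras.layers.Dense(25, activation="softmax"),  # Malimg has 25 families
])
model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=10)  # datasets not shown here
```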

2020-10-14
Wang, Yufeng, Shi, Wanjiao, Jin, Qun, Ma, Jianhua.  2019.  An Accurate False Data Detection in Smart Grid Based on Residual Recurrent Neural Network and Adaptive threshold. 2019 IEEE International Conference on Energy Internet (ICEI). :499—504.
Smart grids are vulnerable to cyber-attacks, which can cause significant damage and huge economic losses. Generally, state estimation (SE) is used to observe the operation of the grid. State estimation of the grid is vulnerable to false data injection attack (FDIA), so diagnosing this type of malicious attack has a major impact on ensuring reliable operation of the power system. In this paper, we present an effective FDIA detection method based on residual recurrent neural network (R2N2) prediction model and adaptive judgment threshold. Specifically, considering the data contains both linear and nonlinear components, the R2N2 model divides the prediction process into two parts: the first part uses the linear model to fit the state data; the second part predicts the nonlinearity of the residuals of the linear prediction model. The adaptive judgment threshold is inferred through fitting the Weibull distribution with the sum of squared errors between the predicted values and observed values. The thorough simulation results demonstrate that our scheme performs better than other prediction based FDIA detection schemes.
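
A hedged sketch of the adaptive-threshold step described above: fit a Weibull distribution to the squared prediction errors of the forecaster and flag measurements whose error exceeds a high quantile. The 0.99 quantile and the synthetic error data are illustrative choices, not the paper's parameters.

```python
import numpy as np
from scipy.stats import weibull_min

rng = np.random.default_rng(0)
sq_errors = rng.weibull(1.5, size=1000) * 0.2           # stand-in for (y_hat - y)^2

shape, loc, scale = weibull_min.fit(sq_errors, floc=0)  # fix location at 0
threshold = weibull_min.ppf(0.99, shape, loc=loc, scale=scale)

new_error = 1.4
print("FDIA suspected" if new_error > threshold else "normal measurement")
```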
2020-09-21
Chow, Ka-Ho, Wei, Wenqi, Wu, Yanzhao, Liu, Ling.  2019.  Denoising and Verification Cross-Layer Ensemble Against Black-box Adversarial Attacks. 2019 IEEE International Conference on Big Data (Big Data). :1282–1291.
Deep neural networks (DNNs) have demonstrated impressive performance on many challenging machine learning tasks. However, DNNs are vulnerable to adversarial inputs generated by adding maliciously crafted perturbations to the benign inputs. As a growing number of attacks have been reported to generate adversarial inputs of varying sophistication, the defense-attack arms race has been accelerated. In this paper, we present MODEF, a cross-layer model diversity ensemble framework. MODEF intelligently combines unsupervised model denoising ensemble with supervised model verification ensemble by quantifying model diversity, aiming to boost the robustness of the target model against adversarial examples. Evaluated using eleven representative attacks on popular benchmark datasets, we show that MODEF achieves remarkable defense success rates, compared with existing defense methods, and provides a superior capability of repairing adversarial inputs and making correct predictions with high accuracy in the presence of black-box attacks.
2020-09-11
Shukla, Ankur, Katt, Basel, Nweke, Livinus Obiora.  2019.  Vulnerability Discovery Modelling With Vulnerability Severity. 2019 IEEE Conference on Information and Communication Technology. :1—6.
Web browsers are primary targets of attacks because of their extensive use and the fact that they interact with sensitive data. Vulnerabilities present in a web browser can pose serious risks to millions of users. Thus, it is pertinent to address these vulnerabilities to provide adequate protection for personally identifiable information. Research done in the past has shown that few vulnerability discovery models (VDMs) highlight the characterization of the vulnerability discovery process. In these models, severity, which is one of the most crucial properties, has not been considered. Vulnerabilities can be categorized into different levels based on their severity. The discovery process of each kind of vulnerability is different from the others. Hence, it is essential to incorporate the severity of the vulnerabilities during the modelling of the vulnerability discovery process. This paper proposes a model to quantitatively assess the vulnerabilities present in software with consideration for the severity of the vulnerabilities. It is possible to apply the proposed model to approximate the number of vulnerabilities along with the vulnerability discovery rate, future occurrence of vulnerabilities, risk analysis, etc. Vulnerability data obtained from one of the major web browsers (Google Chrome) is used to examine the goodness-of-fit and predictive capability of the proposed model. Experimental results justify the fact that the model proposed herein can estimate the required information better than the existing VDMs.
2020-09-04
Khan, Aasher, Rehman, Suriya, Khan, Muhammad U.S, Ali, Mazhar.  2019.  Synonym-based Attack to Confuse Machine Learning Classifiers Using Black-box Setting. 2019 4th International Conference on Emerging Trends in Engineering, Sciences and Technology (ICEEST). :1—7.
Twitter, being the most popular content sharing platform, is giving rise to automated accounts called "bots". A majority of the users on Twitter are bots. Various machine learning (ML) algorithms are designed to detect bots while avoiding the vulnerability constraints of ML-based models. This paper contributes by exploiting vulnerabilities of machine learning (ML) algorithms through a black-box attack. An adversarial text sequence misclassifies the results of deep learning (DL) classifiers for bot detection. The literature shows that ML models are vulnerable to attacks. The aim of this paper is to compromise the accuracy of ML-based bot detection algorithms by replacing original words in tweets with their synonyms. Our results show a 7.2% decrease in accuracy for bot tweets, thereby classifying bot tweets as legitimate tweets.
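
A hedged sketch of a synonym-substitution attack on a text classifier: each word is replaced by a WordNet synonym and the replacement is kept only if it lowers the classifier's "bot" score. classify() is a placeholder for the black-box model under attack; requires the NLTK WordNet corpus (nltk.download("wordnet")).

```python
from nltk.corpus import wordnet as wn

def synonyms(word):
    """All WordNet lemma names for a word, excluding the word itself."""
    return {l.name().replace("_", " ")
            for s in wn.synsets(word) for l in s.lemmas() if l.name() != word}

def attack(tweet, classify):
    """Greedy word-by-word synonym substitution to reduce the bot score."""
    words = tweet.split()
    for i, w in enumerate(words):
        for cand in synonyms(w):
            trial = words[:i] + [cand] + words[i + 1:]
            if classify(" ".join(trial)) < classify(" ".join(words)):
                words = trial
                break
    return " ".join(words)
```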
2020-08-28
Hasanin, Tawfiq, Khoshgoftaar, Taghi M., Leevy, Joffrey L..  2019.  A Comparison of Performance Metrics with Severely Imbalanced Network Security Big Data. 2019 IEEE 20th International Conference on Information Reuse and Integration for Data Science (IRI). :83—88.

Severe class imbalance between the majority and minority classes in large datasets can prejudice Machine Learning classifiers toward the majority class. Our work uniquely consolidates two case studies, each utilizing three learners implemented within an Apache Spark framework, six sampling methods, and five sampling distribution ratios to analyze the effect of severe class imbalance on big data analytics. We use three performance metrics to evaluate this study: Area Under the Receiver Operating Characteristic Curve, Area Under the Precision-Recall Curve, and Geometric Mean. In the first case study, models were trained on one dataset (POST) and tested on another (SlowlorisBig). In the second case study, the training and testing dataset roles were switched. Our comparison of performance metrics shows that Area Under the Precision-Recall Curve and Geometric Mean are sensitive to changes in the sampling distribution ratio, whereas Area Under the Receiver Operating Characteristic Curve is relatively unaffected. In addition, we demonstrate that when comparing sampling methods, borderline-SMOTE2 outperforms the other methods in the first case study, and Random Undersampling is the top performer in the second case study.
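
A short sketch of the three evaluation metrics used in the study, computed with scikit-learn on a toy prediction vector; thresholding at 0.5 for the Geometric Mean is an illustrative choice.

```python
import numpy as np
from sklearn.metrics import roc_auc_score, average_precision_score, confusion_matrix

y_true  = np.array([0, 0, 0, 0, 0, 0, 1, 1, 0, 1])
y_score = np.array([.1, .2, .15, .3, .05, .4, .8, .35, .6, .9])

auroc = roc_auc_score(y_true, y_score)             # Area Under the ROC Curve
auprc = average_precision_score(y_true, y_score)   # Area Under the Precision-Recall Curve
tn, fp, fn, tp = confusion_matrix(y_true, y_score >= 0.5).ravel()
gmean = np.sqrt((tp / (tp + fn)) * (tn / (tn + fp)))  # Geometric Mean of TPR and TNR
print(auroc, auprc, gmean)
```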

2020-08-17
Hu, Jianxing, Huo, Dongdong, Wang, Meilin, Wang, Yazhe, Zhang, Yan, Li, Yu.  2019.  A Probability Prediction Based Mutable Control-Flow Attestation Scheme on Embedded Platforms. 2019 18th IEEE International Conference On Trust, Security And Privacy In Computing And Communications/13th IEEE International Conference On Big Data Science And Engineering (TrustCom/BigDataSE). :530–537.
Control-flow attacks pose powerful threats to software integrity. Remote attestation of control flow is a crucial security service for ensuring software integrity on embedded platforms. Fine-grained remote control-flow attestation with an execution-profiling Control-Flow Graph (CFG) is applied to defend against control-flow attacks. It is a safe scheme, but it may hurt runtime efficiency. In fact, we find that only the vulnerable parts of a program need to be attested at the costly fine-grained level to ensure security, while the remaining normal parts just need a lightweight coarse-grained check to reduce the overhead. We propose the Mutable Granularity Control-Flow Attestation (MGC-FA) scheme, which is based on a probabilistic model, to distinguish between the vulnerable and normal parts of the program and combine fine-grained and coarse-grained control-flow attestation schemes. MGC-FA employs the execution-profiling CFG to apply the remote control-flow attestation scheme on embedded devices. MGC-FA is implemented on a Raspberry Pi with ARM TrustZone, and the experimental results show its effect on balancing the relationship between runtime efficiency and control-flow security.
2020-08-13
Augusto, Cristian, Morán, Jesús, De La Riva, Claudio, Tuya, Javier.  2019.  Test-Driven Anonymization for Artificial Intelligence. 2019 IEEE International Conference On Artificial Intelligence Testing (AITest). :103—110.
In recent years, data published and shared with third parties to develop artificial intelligence (AI) tools and services has significantly increased. When there are regulatory or internal requirements regarding privacy of data, anonymization techniques are used to maintain privacy by transforming the data. The side-effect is that the anonymization may lead to useless data to train and test the AI because it is highly dependent on the quality of the data. To overcome this problem, we propose a test-driven anonymization approach for artificial intelligence tools. The approach tests different anonymization efforts to achieve a trade-off in terms of privacy (non-functional quality) and functional suitability of the artificial intelligence technique (functional quality). The approach has been validated by means of two real-life datasets in the domains of healthcare and health insurance. Each of these datasets is anonymized with several privacy protections and then used to train classification AIs. The results show how we can anonymize the data to achieve an adequate functional suitability in the AI context while maintaining the privacy of the anonymized data as high as possible.
2020-08-03
Juuti, Mika, Szyller, Sebastian, Marchal, Samuel, Asokan, N..  2019.  PRADA: Protecting Against DNN Model Stealing Attacks. 2019 IEEE European Symposium on Security and Privacy (EuroS P). :512–527.
Machine learning (ML) applications are increasingly prevalent. Protecting the confidentiality of ML models becomes paramount for two reasons: (a) a model can be a business advantage to its owner, and (b) an adversary may use a stolen model to find transferable adversarial examples that can evade classification by the original model. Access to the model can be restricted to be only via well-defined prediction APIs. Nevertheless, prediction APIs still provide enough information to allow an adversary to mount model extraction attacks by sending repeated queries via the prediction API. In this paper, we describe new model extraction attacks using novel approaches for generating synthetic queries, and optimizing training hyperparameters. Our attacks outperform state-of-the-art model extraction in terms of transferability of both targeted and non-targeted adversarial examples (up to +29-44 percentage points, pp), and prediction accuracy (up to +46 pp) on two datasets. We provide take-aways on how to perform effective model extraction attacks. We then propose PRADA, the first step towards generic and effective detection of DNN model extraction attacks. It analyzes the distribution of consecutive API queries and raises an alarm when this distribution deviates from benign behavior. We show that PRADA can detect all prior model extraction attacks with no false positives.
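
A hedged sketch of the core detection idea described above: track, per client, the minimum L2 distance of each new query to that client's previous queries and raise an alarm when those distances stop looking normally distributed (here via a Shapiro-Wilk test). The significance level and the minimum history size are illustrative choices rather than PRADA's exact parameters.

```python
import numpy as np
from scipy.stats import shapiro

class QueryMonitor:
    def __init__(self, alpha=0.05, min_history=30):
        self.alpha, self.min_history = alpha, min_history
        self.queries, self.min_dists = [], []

    def observe(self, x):
        """Record one query; return True when extraction behavior is suspected."""
        x = np.asarray(x, dtype=float).ravel()
        if self.queries:
            self.min_dists.append(min(np.linalg.norm(x - q) for q in self.queries))
        self.queries.append(x)
        if len(self.min_dists) < self.min_history:
            return False                     # not enough evidence yet
        _, p_value = shapiro(self.min_dists)
        return p_value < self.alpha          # distances no longer look benign
```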