Biblio

Filters: Keyword is composability
2019-01-21
Ahmed, Chuadhry Mujeeb, Zhou, Jianying, Mathur, Aditya P..  2018.  Noise Matters: Using Sensor and Process Noise Fingerprint to Detect Stealthy Cyber Attacks and Authenticate Sensors in CPS. Proceedings of the 34th Annual Computer Security Applications Conference. :566–581.
A novel scheme is proposed to authenticate sensors and detect data integrity attacks in a Cyber Physical System (CPS). The proposed technique uses the hardware characteristics of a sensor and the physics of a process to create unique patterns (herein termed fingerprints) for each sensor. The sensor fingerprint is a function of sensor and process noise embedded in sensor measurements. Uniqueness in the noise appears due to manufacturing imperfections of a sensor and due to unique features of a physical process. To create a sensor's fingerprint, a system-model based approach is used. A noise-based fingerprint is created during the normal operation of the system. It is shown that under data injection attacks on sensors, noise pattern deviations from the fingerprinted pattern enable the proposed scheme to detect attacks. Experiments are performed on a dataset from a real-world water treatment (SWaT) facility. A class of stealthy attacks is designed against the proposed scheme and extensive security analysis is carried out. Results show that a range of sensors can be uniquely identified with an accuracy as high as 98%. Extensive sensor identification experiments are carried out on a set of sensors in the SWaT testbed. The proposed scheme is tested on a variety of attack scenarios from the reference literature, which are detected with high accuracy.
2019-02-14
Eclarin, Bobby A., Fajardo, Arnel C., Medina, Ruji P..  2018.  A Novel Feature Hashing With Efficient Collision Resolution for Bag-of-Words Representation of Text Data. Proceedings of the 2nd International Conference on Natural Language Processing and Information Retrieval. :12-16.
Text mining is widely used in many areas, transforming unstructured text data from sources such as patient records, social media networks, insurance data, and news, among others, into an invaluable source of information. The Bag of Words (BoW) representation is a means of extracting features from text data for use in modeling. In text classification, a word in a document is assigned a weight according to its frequency and its frequency across different documents; the words together with their weights form the BoW. One way to handle voluminous data is to use the feature hashing method, or hashing trick. However, collisions are inevitable and might change the result of the whole process of feature generation and selection. Using the vector data structure, lookup performance is improved while resolving collisions, and memory usage is also efficient.
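The hashing trick mentioned in the abstract above can be illustrated with a minimal sketch. This is the plain hashing trick only, not the authors' collision-resolution scheme; the function names `bucket`, `hashed_bow`, and `collisions`, the use of MD5 as the hash, and the whitespace tokenizer are all assumptions made for illustration.

```python
import hashlib


def bucket(token: str, n_buckets: int) -> int:
    """Map a token to a bucket index with a stable hash (MD5 is an assumption)."""
    digest = hashlib.md5(token.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % n_buckets


def hashed_bow(text: str, n_buckets: int = 16) -> list:
    """Build a fixed-length bag-of-words count vector via the hashing trick."""
    vec = [0] * n_buckets
    for tok in text.lower().split():
        vec[bucket(tok, n_buckets)] += 1
    return vec


def collisions(tokens, n_buckets: int) -> dict:
    """Report buckets shared by more than one distinct token."""
    seen = {}
    for tok in set(tokens):
        seen.setdefault(bucket(tok, n_buckets), []).append(tok)
    return {b: sorted(toks) for b, toks in seen.items() if len(toks) > 1}
```

With a small `n_buckets`, distinct words land in the same slot and their counts merge, which is the collision problem the abstract refers to.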
2020-11-09
Mobaraki, S., Amirkhani, A., Atani, R. E..  2018.  A Novel PUF based Logic Encryption Technique to Prevent SAT Attacks and Trojan Insertion. 2018 9th International Symposium on Telecommunications (IST). :507–513.
The manufacturing of integrated circuits (IC) outside of the design houses makes it possible for the adversary to easily perform a reverse engineering attack against intellectual property (IP)/IC. The aim of this attack can be IP piracy, overproduction, counterfeiting or inserting a hardware Trojan (HT) throughout the supply chain of the IC. Preventing hardware Trojan insertion is a significant issue in the context of hardware security (HS) and has not been considered in most of the previous logic encryption methods. To address this problem, this paper presents an anti-Trojan insertion algorithm. The idea is based on the fact that reducing the number of signals with low observability (LO) and low controllability (LC) can significantly hinder HT insertion. The security of logic encryption methods depends on the algorithm and the encryption key. However, the security of these methods has been compromised by SAT attacks over recent years. SAT attacks can recover the correct key from most logic encryption techniques. In this article, by using PUF-based encryption, the key applied in the encryption is randomized so that a SAT attack cannot be performed. Based on the output of the PUF, a unique encryption is made for each chip, which prevents counterfeiting and IP piracy.
2019-10-07
Monge, Marco Antonio Sotelo, Vidal, Jorge Maestre, Villalba, Luis Javier García.  2018.  A Novel Self-Organizing Network Solution Towards Crypto-ransomware Mitigation. Proceedings of the 13th International Conference on Availability, Reliability and Security. :48:1–48:10.
In the last decade, crypto-ransomware evolved from a family of malicious software with scarce repercussion in the research community, to a sophisticated and highly effective intrusion method positioned in the spotlight of the main organizations for cyberdefense. Its modus operandi is characterized by fetching the assets to be blocked, their encryption, and triggering an extortion process that leads the victim to pay for the key that allows their recovery. This paper reviews the evolution of crypto-ransomware focusing on the implication of the different advances in communication technologies that empowered its popularization. In addition, a novel defensive approach based on the Self-Organizing Network paradigm and the emergent communication technologies (e.g. Software-Defined Networking, Network Function Virtualization, Cloud Computing, etc.) is proposed. They enhance the orchestration of smart defensive deployments that adapt to the status of the monitoring environment and facilitate the adoption of previously defined risk management policies. In this way it is possible to efficiently coordinate the efforts of sensors and actuators distributed throughout the protected environment without supervision by human operators, resulting in greater protection with increased viability.
2019-02-18
Zhu, Mengeheng, Shi, Hong.  2018.  A Novel Support Vector Machine Algorithm for Missing Data. Proceedings of the 2nd International Conference on Innovation in Artificial Intelligence. :48–53.
The missing data problem often occurs in data analysis. The most common way to solve this problem is imputation. But imputation methods are only suitable for dealing with a low proportion of missing data, when assuming that missing data satisfies MCAR (Missing Completely at Random) or MAR (Missing at Random). In this paper, considering the reasons for missing data, we propose a novel support vector machine method using a new kernel function to solve the problem with a relatively large proportion of missing data. This method makes full use of observed data to reduce the error caused by filling a large number of missing values. We validate our method on 4 data sets from the UCI Repository of Machine Learning. The accuracy, F-score, Kappa statistics and recall are used to evaluate the performance. Experimental results show that our method achieves significant improvement in terms of classification results compared with common imputation methods, even when the proportion of missing data is high.
2019-02-08
Isaacson, D. M..  2018.  The ODNI-OUSD(I) Xpress Challenge: An Experimental Application of Artificial Intelligence Techniques to National Security Decision Support. 2018 IEEE 8th Annual Computing and Communication Workshop and Conference (CCWC). :104-109.
Current methods for producing and disseminating analytic products contribute to the latency of relaying actionable information and analysis to the U.S. Intelligence Community's (IC's) principal customers, U.S. policymakers and warfighters. To circumvent these methods, which can often serve as a bottleneck, we report on the results of a public prize challenge that explored the potential for artificial intelligence techniques to generate useful analytic products. The challenge tasked solvers to develop algorithms capable of searching and processing nearly 15,000 unstructured text files into a 1-2 page analytic product without human intervention; these analytic products were subsequently evaluated and scored using established IC methodologies and criteria. Experimental results from this challenge demonstrate the promise for the machine generation of analytic products to ensure that the IC warns and informs in a more timely fashion.
2019-05-01
Gautier, Adam M., Andel, Todd R., Benton, Ryan.  2018.  On-Device Detection via Anomalous Environmental Factors. Proceedings of the 8th Software Security, Protection, and Reverse Engineering Workshop. :5:1–5:8.
Embedded Systems (ES) underlie society's critical cyberinfrastructure and comprise the vast majority of consumer electronics, making them a prized target for dangerous malware and hardware Trojans. Malicious intrusion into these systems presents a threat to national security and economic stability as globalized supply chains and tight network integration make ES more susceptible to attack than ever. High-end ES like the Xilinx Zynq-7020 system on a chip are widely used in the field and provide a representative platform for investigating the methods of cybercriminals. This research suggests a novel anomaly detection framework that could be used to detect potential zero-day exploits, undiscovered rootkits, or even maliciously implanted hardware by leveraging the Zynq architecture and real-time device-level measurements of thermal side-channels. The results of an initial investigation showed that different processor workloads produce distinct thermal fingerprints that are detectable by out-of-band, digital logic-based thermal sensors.
2019-08-05
Hu, Xinyi, Zhao, Yaqun.  2018.  One to One Identification of Cryptosystem Using Fisher's Discriminant Analysis. Proceedings of the 6th ACM/ACIS International Conference on Applied Computing and Information Technology. :7–12.
Distinguishing analysis is an important part of cryptanalysis. A key problem in distinguishing analysis is identifying which cryptosystem produced a given ciphertext when only the ciphertext is known. In this paper, Fisher's discriminant analysis (FDA), which is based on statistical methods and machine learning, is used to identify 4 stream ciphers and 7 block ciphers one to one by extracting 9 different features. The results show that the accuracy rate of the FDA can reach 80% when distinguishing files encrypted by a stream cipher from files encrypted by a block cipher in ECB mode, and when distinguishing files encrypted by a block cipher in ECB mode from those encrypted in CBC mode. The average one-to-one identification accuracy rates of the stream ciphers RC4, Grain, and Sosemanuk are more than 55%. The maximum accuracy rate can reach 60% when identifying SMS4 among block ciphers in CBC mode one to one. The identification accuracy rate of the entropy-based features is clearly higher than that of the probability-based features.
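As a rough illustration of what an entropy-based ciphertext feature might look like, the sketch below computes the Shannon entropy of the byte distribution of a buffer. The abstract does not specify which 9 features the authors extract, so `byte_entropy` is a hypothetical stand-in rather than one of the paper's features.

```python
import math
from collections import Counter


def byte_entropy(data: bytes) -> float:
    """Shannon entropy of the byte-value distribution, in bits per byte (0 to 8)."""
    if not data:
        return 0.0
    n = len(data)
    counts = Counter(data)
    # Sum -p * log2(p) over the byte values that actually occur.
    return -sum((k / n) * math.log2(k / n) for k in counts.values())
```

A feature like this could be computed per file or per fixed-size block and then passed to a classifier; how the features are windowed and combined is left open here.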
2020-01-06
Huang, Zhiyi, Liu, Jinyan.  2018.  Optimal Differentially Private Algorithms for k-Means Clustering. Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. :395–408.
We consider privacy-preserving k-means clustering. For the objective of minimizing the Wasserstein distance between the output and the optimal solution, we show that there is a polynomial-time $(\epsilon, \delta)$-differentially private algorithm which, for any sufficiently large $\Phi^2$ well-separated datasets, outputs k centers that are within Wasserstein distance $O(\Phi^2)$ from the optimal. This result improves the previous bounds by removing the dependence on $\epsilon$, number of centers k, and dimension d. Further, we prove a matching lower bound that no $(\epsilon, \delta)$-differentially private algorithm can guarantee Wasserstein distance less than $\Omega(\Phi^2)$ and, thus, our positive result is optimal up to a constant factor. For minimizing the k-means objective when the dimension d is bounded, we propose a polynomial-time private local search algorithm that outputs an $\alpha n$-additive approximation when the size of the dataset is at least $\tilde{O}(k^{3/2} \cdot d \cdot \epsilon^{-1} \cdot \mathrm{poly}(\alpha^{-1}))$.
2019-02-14
Deng, Dong, Tao, Yufei, Li, Guoliang.  2018.  Overlap Set Similarity Joins with Theoretical Guarantees. Proceedings of the 2018 International Conference on Management of Data. :905-920.
This paper studies the set similarity join problem with overlap constraints which, given two collections of sets and a constant c, finds all the set pairs in the datasets that share at least c common elements. This is a fundamental operation in many fields, such as information retrieval, data mining, and machine learning. The time complexity of all existing methods is $O(n^2)$ where n is the total size of all the sets. In this paper, we present a size-aware algorithm with the time complexity of $O(n^{2-\frac{1}{c}} k^{\frac{1}{2c}}) = o(n^2) + O(k)$, where k is the number of results. The size-aware algorithm divides all the sets into small and large ones based on their sizes and processes them separately. We can use existing methods to process the large sets and focus on the small sets in this paper. We develop several optimization heuristics for the small sets to improve the practical performance significantly. As the size boundary between the small sets and the large sets is crucial to the efficiency, we propose an effective size boundary selection algorithm to judiciously choose an appropriate size boundary, which works very well in practice. Experimental results on real-world datasets show that our methods achieve high performance and outperform the state-of-the-art approaches by up to an order of magnitude.
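The problem statement above (find all pairs sharing at least c elements) can be made concrete with a naive baseline. This is the brute-force comparison of every pair, not the size-aware algorithm the paper proposes; the function name `overlap_join` and the choice to return index pairs are assumptions for illustration.

```python
def overlap_join(left, right, c):
    """Return (i, j) index pairs where left[i] and right[j] share >= c elements.

    Naive baseline: compares every pair of sets, so the cost grows with the
    product of the two collection sizes.
    """
    result = []
    for i, a in enumerate(left):
        for j, b in enumerate(right):
            if len(a & b) >= c:
                result.append((i, j))
    return result


# Example usage with two small collections and c = 2.
R = [{1, 2, 3}, {4, 5}]
S = [{2, 3, 4}, {4, 5, 6}, {7}]
pairs = overlap_join(R, S, 2)
```

Here `pairs` contains `(0, 0)` because `{1, 2, 3}` and `{2, 3, 4}` share two elements, and `(1, 1)` because `{4, 5}` and `{4, 5, 6}` share two.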
2019-02-21
Vaishnav, J., Uday, A. B., Poulose, T..  2018.  Pattern Formation in Swarm Robotic Systems. 2018 2nd International Conference on Trends in Electronics and Informatics (ICOEI). :1466–1469.
Swarm robotics, a combination of swarm intelligence and robotics, is inspired by natural swarms such as flocks of birds, swarms of bees, ants, and schools of fish. These group behaviours show great flexibility and robustness, which enable the robots to perform various tasks such as pattern formation, rescue and military operations, and space expeditions. This paper discusses an algorithm for forming patterns, which are letters of the English alphabet, by identical robots in a finite amount of time, and also analyses the outcome of the algorithm. In order to implement the algorithm, 9 identical circular robots of diameter 15 cm are used, each having a Node MCU module and a rotary encoder attached to one wheel of the robot. The robots are initially placed at the centres of an imaginary 3×3 grid on a white sheet of paper of dimensions 250 cm × 250 cm. All the robots are connected to the laptop's network via Wi-Fi, and data sent from the laptop is received by the Node MCU modules. This data includes the distance to be moved and the angle to be turned by each robot in order to form the letter. The rotary encoders enable the robot to move specific distances and turn specific angles, with high accuracy, by real-time feedback. The algorithm is written in Python and image processing is done using OpenCV. Certain approximations are used in order to implement collision avoidance. Finally, after calibration, the word given as input is formed letter by letter using these 9 identical robots.
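The abstract says each robot receives a distance to move and an angle to turn. One plausible way to derive such a command from a current pose and a target point is sketched below; the paper's actual formulas are not given, so the function name `move_command`, the coordinate convention (centimetres, heading measured counter-clockwise from the x-axis), and the normalisation of the turn to the range [-180, 180) are all assumptions.

```python
import math


def move_command(pos, heading_deg, target):
    """Return (distance, turn_deg) taking a robot from pos to target.

    pos and target are (x, y) tuples in the same unit; heading_deg is the
    robot's current heading in degrees, counter-clockwise from the x-axis.
    """
    dx = target[0] - pos[0]
    dy = target[1] - pos[1]
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dy, dx))
    # Wrap the required rotation into [-180, 180) so the robot takes the short way round.
    turn = (bearing - heading_deg + 180.0) % 360.0 - 180.0
    return distance, turn
```

For instance, a robot at the origin facing along the x-axis with target `(3, 4)` would need to travel 5 units after turning by roughly 53 degrees.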
2019-09-23
Ramijak, Dusan, Pal, Amitangshu, Kant, Krishna.  2018.  Pattern Mining Based Compression of IoT Data. Proceedings of the Workshop Program of the 19th International Conference on Distributed Computing and Networking. :12:1–12:6.
The increasing proliferation of Internet of Things (IoT) devices and systems results in large amounts of highly heterogeneous data to be collected. Although at least some of the collected sensor data is often consumed by the real-time decision making and control of the IoT system, that is not the only use of such data. Invariably, the collected data is stored, perhaps in some filtered or downselected fashion, so that it can be used for a variety of lower-frequency operations. It is expected that in a smart city environment with numerous IoT deployments, the volume of such data can become enormous. Therefore, mechanisms for lossy data compression that provide a trade-off between compression ratio and data usefulness for offline statistical analysis become necessary. In this paper, we discuss several simple pattern mining based compression strategies for multi-attribute IoT data streams. For each method, we evaluate the compressibility of the method vs. the level of similarity between original and compressed time series in the context of the home energy management system.
2019-02-18
Afsharinejad, Armita, Hurley, Neil.  2018.  Performance Analysis of a Privacy Constrained kNN Recommendation Using Data Sketches. Proceedings of the Eleventh ACM International Conference on Web Search and Data Mining. :10–18.
This paper evaluates two algorithms, BLIP and JLT, for creating differentially private data sketches of user profiles, in terms of their ability to protect a kNN collaborative filtering algorithm from an inference attack by third-parties. The transformed user profiles are employed in a user-based top-N collaborative filtering system. For the first time, a theoretical analysis of the BLIP is carried out, to derive expressions that relate its parameters to its performance. This allows the two techniques to be fairly compared. The impact of deploying these approaches on the utility of the system—its ability to make good recommendations, and on its privacy level—the ability of third-parties to make inferences about the underlying user preferences, is examined. An active inference attack is evaluated, that consists of the injection of a number of tailored sybil profiles into the system database. User profile data of targeted users is then inferred from the recommendations made to the sybils. Although the differentially private sketches are designed to allow the transformed user profiles to be published without compromising privacy, the attack we examine does not use such information and depends only on some pre-existing knowledge of some user preferences as well as the neighbourhood size of the kNN algorithm. Our analysis therefore assesses in practical terms a relatively weak privacy attack, which is extremely simple to apply in systems that allow low-cost generation of sybils. We find that, for a given differential privacy level, the BLIP injects less noise into the system, but for a given level of noise, the JLT offers a more compact representation.
2019-10-15
Vyakaranal, S., Kengond, S..  2018.  Performance Analysis of Symmetric Key Cryptographic Algorithms. 2018 International Conference on Communication and Signal Processing (ICCSP). :0411–0415.
Data security is an important aspect of today's internet and is gaining more importance day by day. With the increase in online data exchange, transactions and payments, secure payment and secure data transfer have become areas of concern. Cryptography secures data transmission over the internet through various methods and algorithms. Cryptography helps prevent unauthorized people from accessing the data by providing authentication, confidentiality, integrity and non-repudiation. Many cryptographic algorithms are available for transmitting data securely, but the algorithm used should be robust, efficient, cost effective, high performance and easily deployable. Choosing an algorithm that suits the customer's requirements is a task of utmost importance. The proposed work discusses different symmetric key cryptographic algorithms such as DES, 3DES, AES and Blowfish by considering encryption time, decryption time, entropy, memory usage, throughput, avalanche effect and energy consumption through practical implementation in Java. The practical implementation of the algorithms is highlighted in the proposed work, considering performance trade-offs in terms of the cost of various parameters rather than mere theoretical concepts. Battery consumption and the avalanche effect of the algorithms are discussed. The analysis reveals that AES performs very well in the overall performance analysis among the considered algorithms.
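The avalanche effect named among the evaluation criteria is commonly measured as the fraction of output bits that change when a single input bit is flipped. The sketch below, written in Python rather than the Java used in the paper, only computes that bit-difference fraction between two equal-length byte strings; producing the two ciphertexts with DES, 3DES, AES or Blowfish is left out, and the function name `bit_diff_fraction` is an assumption.

```python
def bit_diff_fraction(a: bytes, b: bytes) -> float:
    """Fraction of bit positions in which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    if not a:
        return 0.0
    # XOR each byte pair and count the set bits in the result.
    differing = sum(bin(x ^ y).count("1") for x, y in zip(a, b))
    return differing / (8 * len(a))
```

A value near 0.5 for ciphertexts of two plaintexts that differ in one bit is usually taken as a sign of a strong avalanche effect.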
2019-05-01
Georgiadis, Ioannis, Dossis, Michael, Kontogiannis, Sotirios.  2018.  Performance Evaluation on IoT Devices Secure Data Delivery Processes. Proceedings of the 22nd Pan-Hellenic Conference on Informatics. :306–311.
This paper presents existing cryptographic technologies used by the IoT industry. The authors review the security capabilities of existing IoT protocols such as LoRaWAN, IEEE 802.15.4, BLE and RF-based protocols. The authors also experiment with the cryptographic efficiency and energy consumption of existing cryptography algorithms implemented on embedded systems. The authors evaluate the performance of a 32-bit single ARM Cortex microprocessor, an Atmel ATmega32u4 8-bit micro-controller and Parallella Xilinx Zynq FPGA parallel co-processors. From the experimental results, the authors indicate the requirements of next-generation IoT security protocols and provide useful guidelines.
Arefi, Meisam Navaki, Alexander, Geoffrey, Crandall, Jedidiah R..  2018.  PIITracker: Automatic Tracking of Personally Identifiable Information in Windows. Proceedings of the 11th European Workshop on Systems Security. :3:1–3:6.
Personally Identifiable Information (PII) is information that can be used on its own or with other information to distinguish or trace an individual's identity. To investigate how an application handles PII, a reverse engineer has to put considerable effort into reverse engineering the application and discovering what it does with PII. To automate this process and save reverse engineers substantial time and effort, we propose PIITracker, a novel tool that can track PII automatically and capture whether any processes are sending PII over the network. This is made possible by 1) whole-system dynamic information flow tracking and 2) monitoring specific function and system calls. We analyzed 15 popular chat applications and browsers using PIITracker, and determined that 12 of these applications collect some form of PII.
2019-08-05
Akkermans, Sven, Crispo, Bruno, Joosen, Wouter, Hughes, Danny.  2018.  Polyglot CerberOS: Resource Security, Interoperability and Multi-Tenancy for IoT Services on a Multilingual Platform. Proceedings of the 15th EAI International Conference on Mobile and Ubiquitous Systems: Computing, Networking and Services. :59–68.
The Internet of Things (IoT) promises to tackle a range of environmental challenges and deliver large efficiency gains in industry by embedding computational intelligence, sensing and control in our physical environment. Multiple independent parties are increasingly seeking to leverage shared IoT infrastructure, using a similar model to the cloud, and thus require constrained IoT devices to become microservice-hosting platforms that can securely and concurrently execute their code and interoperate. This vision demands that heterogeneous services, peripherals and platforms are provided with an expanded set of security guarantees to prevent third-party services from hijacking the platform, resource-level access control and accounting, and strong isolation between running processes to prevent unauthorized access to third-party services and data. This paper introduces Polyglot CerberOS, a resource-secure operating system for multi-tenant IoT devices that is realised through a reconfigurable virtual machine which can simultaneously execute interoperable services, written in different languages. We evaluate Polyglot CerberOS on IETF Class-1 devices running both Java and C services. The results show that interoperability and strong security guarantees for multilingual services on multi-tenant commodity IoT devices are feasible, in terms of performance and memory overhead, and transparent for developers.
2019-11-25
Vasilopoulos, Dimitrios, Elkhiyaoui, Kaoutar, Molva, Refik, Önen, Melek.  2018.  POROS: Proof of Data Reliability for Outsourced Storage. Proceedings of the 6th International Workshop on Security in Cloud Computing. :27–37.
We introduce POROS, a new solution for proof of data reliability. In addition to the integrity of the data outsourced to a cloud storage system, proof of data reliability assures the customers that the cloud storage provider (CSP) has provisioned sufficient amounts of redundant information along with original data segments to be able to guarantee the maintenance of the data in the face of corruption. In spite of meeting a basic service requirement, the placement of the data repair capability at the CSP raises a challenging issue with respect to the design of a proof of data reliability scheme. Existing schemes like Proof of Data Possession (PDP) and Proof of Retrievability (PoR) fall short of providing proof of data reliability to customers, since those schemes are not designed to audit the redundancy mechanisms of the CSP. Thus, in addition to verifying the possession of the original data segments, a proof of data reliability scheme must also assure that sufficient redundancy information is kept at storage. Thanks to a combination of PDP with time-constrained operations, POROS guarantees that a rational CSP would not compute redundancy information on demand upon proof of data reliability requests but instead would store it at rest. As a result of bestowing the CSP with the repair function, POROS allows for the automatic maintenance of data by the storage provider without any interaction with the customers.
2019-01-16
Kwon, HyukSang, Raza, Shahid, Ko, JeongGil.  2018.  POSTER: On Compressing PKI Certificates for Resource Limited Internet of Things Devices. Proceedings of the 2018 on Asia Conference on Computer and Communications Security. :837–839.
Certificate-based Public Key Infrastructure (PKI) schemes are used to authenticate the identity of distinct nodes on the Internet. Using certificates for the Internet of Things (IoT) can allow many privacy-sensitive applications to be trusted over the larger Internet architecture. However, since IoT devices are typically resource limited, full-sized PKI certificates are not suitable for use in the IoT domain. This work outlines our approach to compressing standards-compliant X.509 certificates so that their sizes are reduced and they can be effectively used on IoT nodes. Our scheme combines the use of Concise Binary Object Representation (CBOR) with a scheme that compresses all data that can be implicitly inferred within the IoT sub-network. Our scheme shows a certificate compression rate of up to ~30%, which allows effective energy reduction when using X.509-based certificates on IoT platforms.
2019-02-13
Fawaz, A. M., Noureddine, M. A., Sanders, W. H..  2018.  POWERALERT: Integrity Checking Using Power Measurement and a Game-Theoretic Strategy. 2018 48th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN). :514–525.
We propose POWERALERT, an efficient external integrity checker for untrusted hosts. Current attestation systems suffer from shortcomings, including requiring a complete checksum of the code segment, being static, using timing information sourced from the untrusted machine, or using imprecise timing information such as network round-trip time. We address those shortcomings by (1) using power measurements from the host to ensure that the checking code is executed and (2) checking a subset of the kernel space over an extended period. We compare the power measurement against a learned power model of the execution of the machine and validate that the execution was not tampered with. Finally, POWERALERT randomizes the integrity checking program to prevent the attacker from adapting. We model the interaction between POWERALERT and an attacker as a time-continuous game. The Nash equilibrium strategy of the game shows that POWERALERT has two optimal strategy choices: (1) aggressive checking that forces the attacker into hiding, or (2) slow checking that minimizes cost. We implement a prototype of POWERALERT using Raspberry Pi and evaluate the performance of the integrity checking program generation.
2019-11-25
Hahn, Florian, Loza, Nicolas, Kerschbaum, Florian.  2018.  Practical and Secure Substring Search. Proceedings of the 2018 International Conference on Management of Data. :163–176.
In this paper we address the problem of outsourcing sensitive strings while still providing the functionality of substring searches. While security is one important aspect that requires careful system design, the practical application of the solution depends on feasible processing time and integration efforts into existing systems. That is, searchable symmetric encryption (SSE) allows queries on encrypted data but makes common indexing techniques used in database management systems for fast query processing impossible. As a result, the overhead for deploying such functional and secure encryption schemes into database systems while maintaining acceptable processing time requires carefully designed special purpose index structures. Such structures are not available on common database systems but require individual modifications depending on the deployed SSE scheme. Our technique transforms the problem of secure substring search into range queries that can be answered efficiently and in a privacy-preserving way on common database systems without further modifications using frequency-hiding order-preserving encryption. We evaluated our prototype implementation deployed in a real-world scenario, including the consideration of network latency, and demonstrate the practicability of our scheme with 98.3 ms search time for 10,000 indexed emails. Further, we provide a practical security evaluation of this transformation based on the bucketing attack, which is the best known published attack against this kind of property-preserving encryption.
2019-01-16
Sharif, Mahmood, Urakawa, Jumpei, Christin, Nicolas, Kubota, Ayumu, Yamada, Akira.  2018.  Predicting Impending Exposure to Malicious Content from User Behavior. Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security. :1487–1501.
Many computer-security defenses are reactive: they operate only when security incidents take place, or immediately thereafter. Recent efforts have attempted to predict security incidents before they occur, to enable defenders to proactively protect their devices and networks. These efforts have primarily focused on long-term predictions. We propose a system that enables proactive defenses at the level of a single browsing session. By observing user behavior, it can predict whether they will be exposed to malicious content on the web seconds before the moment of exposure, thus opening a window of opportunity for proactive defenses. We evaluate our system using three months' worth of HTTP traffic generated by 20,645 users of a large cellular provider in 2017 and show that it can be helpful, even when only very low false positive rates are acceptable, and despite the difficulty of making "on-the-fly" predictions. We also engage directly with the users through surveys asking them demographic and security-related questions, to evaluate the utility of self-reported data for predicting exposure to malicious content. We find that self-reported data can help forecast exposure risk over long periods of time. However, even in the long term, self-reported data is not as crucial as behavioral measurements to accurately predict exposure.
2019-02-18
Xu, Bowen, Shirani, Amirreza, Lo, David, Alipour, Mohammad Amin.  2018.  Prediction of Relatedness in Stack Overflow: Deep Learning vs. SVM: A Reproducibility Study. Proceedings of the 12th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement. :21:1–21:10.
Background: Xu et al. used a deep neural network (DNN) technique to classify the degree of relatedness between two knowledge units (question-answer threads) on Stack Overflow. More recently, extending Xu et al.'s work, Fu and Menzies proposed a simpler classification technique based on a fine-tuned support vector machine (SVM) that achieves similar performance but in a much shorter time. Thus, they suggested that researchers need to compare their sophisticated methods against simpler alternatives. Aim: The aim of this work is to replicate the previous studies and further investigate the validity of Fu and Menzies' claim by evaluating the DNN- and SVM-based approaches on a larger dataset. We also compare the effectiveness of these two approaches against SimBow, a lightweight SVM-based method that was previously used for general community question-answering. Method: We (1) collect a large dataset containing knowledge units from Stack Overflow, (2) show the value of the new dataset addressing shortcomings of the original one, (3) re-evaluate both the DNN- and SVM-based approaches on the new dataset, and (4) compare the performance of the two approaches against that of SimBow. Results: We find that: (1) there are several limitations in the original dataset used in the previous studies, (2) the effectiveness of both Xu et al.'s and Fu and Menzies' approaches (as measured using F1-score) drops sharply on the new dataset, (3) similar to the previous finding, the performance of the SVM-based approaches (Fu and Menzies' approach and SimBow) is slightly better than that of the DNN-based approach, (4) contrary to the previous findings, Fu and Menzies' approach runs much slower than the DNN-based approach on the larger dataset, as its runtime grows sharply with increasing dataset size, and (5) SimBow outperforms both Xu et al.'s and Fu and Menzies' approaches in terms of runtime. Conclusion: We conclude that, for this task, simpler approaches based on SVM perform adequately well. We also illustrate the challenges brought by the increased size of the dataset and show the benefit of a lightweight SVM-based approach for this task.
2019-12-09
Henzel, Robert, Herzwurm, Georg.  2018.  A preliminary approach towards the trust issue in cloud manufacturing using grounded theory: Defining the problem domain. 2018 4th International Conference on Universal Village (UV). :1–6.
In Cloud Manufacturing, trust is an important, under-investigated issue. This paper proceeds with the noncommittal phase of the grounded theory method approach by investigating the trust topic in several research streams, defining the problem domain. This novel approach fills a research gap and can be treated as a snapshot and blueprint of research. Findings were obtained through a structured literature review and can help future researchers pursue the integrative phase in Grounded Theory by building on the preliminary result of this paper.