Biblio
Nowadays, honeypots are a key tool to attract attackers and study their activity. They help us in the tasks of evaluating attacker's behaviour, discovering new types of attacks, and collecting information and statistics associated with them. However, the gathered data cannot be directly interpreted, but must be analyzed to obtain useful information. In this paper, we present a SSH honeypot-based system designed to simulate a vulnerable server. Thus, we propose an approach for the classification of metrics from the data collected by the honeypot along 19 months.
For sharing resources using ad hoc communication MANET are quite effective and scalable medium. MANET is a distributed, decentralized, dynamic network with no fixed infrastructure, which are self- organized and self-managed. Achieving high security level is a major challenge in case of MANET. Layered architecture is one of the ways for handling security challenges, which enables collection and analysis of data from different security dimensions. This work proposes a novel multi-layered outlier detection algorithm using hierarchical similarity metric with hierarchical categorized data. Network performance with and without the presence of outlier is evaluated for different quality-of-service parameters like percentage of APDR and AT for small (100 to 200 nodes), medium (200 to 1000 nodes) and large (1000 to 3000 nodes) scale networks. For a network with and without outliers minimum improvements observed are 9.1 % and 0.61 % for APDR and AT respectively while the maximum improvements of 22.1 % and 104.1 %.
Smartphones have become ubiquitous in our everyday lives, providing diverse functionalities via millions of applications (apps) that are readily available. To achieve these functionalities, apps need to access and utilize potentially sensitive data, stored in the user's device. This can pose a serious threat to users' security and privacy, when considering malicious or underskilled developers. While application marketplaces, like Google Play store and Apple App store, provide factors like ratings, user reviews, and number of downloads to distinguish benign from risky apps, studies have shown that these metrics are not adequately effective. The security and privacy health of an application should also be considered to generate a more reliable and transparent trustworthiness score. In order to automate the trustworthiness assessment of mobile applications, we introduce the Trust4App framework, which not only considers the publicly available factors mentioned above, but also takes into account the Security and Privacy (S&P) health of an application. Additionally, it considers the S&P posture of a user, and provides an holistic personalized trustworthiness score. While existing automatic trustworthiness frameworks only consider trustworthiness indicators (e.g. permission usage, privacy leaks) individually, Trust4App is, to the best of our knowledge, the first framework to combine these indicators. We also implement a proof-of-concept realization of our framework and demonstrate that Trust4App provides a more comprehensive, intuitive and actionable trustworthiness assessment compared to existing approaches.
A Cyber Physical Sensor System (CPSS) consists of a computing platform equipped with wireless access points, sensors, and actuators. In a Cyber Physical System, CPSS constantly collects data from a physical object that is under process and performs local real-time control activities based on the process algorithm. The collected data is then transmitted through the network layer to the enterprise command and control center or to the cloud computing services for further processing and analysis. This paper investigates the CPSS' most common cyber security threats and vulnerabilities and provides countermeasures. Furthermore, the paper addresses how the CPSS are attacked, what are the leading consequences of the attacks, and the possible remedies to prevent them. Detailed case studies are presented to help the readers understand the CPSS threats, vulnerabilities, and possible solutions.
The diagnosis of performance issues in cloud environments is a challenging problem, due to the different levels of virtualization, the diversity of applications and their interactions on the same physical host. Moreover, because of privacy, security, ease of deployment and execution overhead, an agent-less method, which limits its data collection to the physical host level, is often the only acceptable solution. In this paper, a precise host-based method, to recover wait state for the processes inside a given Virtual Machine (VM), is proposed. The virtual Process State Detection (vPSD) algorithm computes the state of processes through host kernel tracing. The state of a virtual Process (vProcess) is displayed in an interactive trace viewer (Trace Compass) for further inspection. Our proposed VM trace analysis algorithm has been open-sourced for further enhancements and for the benefit of other developers. Experimental evaluations were conducted using a mix of workload types (CPU, Disk, and Network), with different applications like Hadoop, MySQL, and Apache. vPSD, being based on host hypervisor tracing, brings a lower overhead (around 0.03%) as compared to other approaches.
Community Health Workers (CHWs) have been using Mobile Health Data Collection Systems (MDCSs) for supporting the delivery of primary healthcare and carrying out public health surveys, feeding national-level databases with families' personal data. Such systems are used for public surveillance and to manage sensitive data (i.e., health data), so addressing the privacy issues is crucial for successfully deploying MDCSs. In this paper we present a comprehensive privacy threat analysis for MDCSs, discuss the privacy challenges and provide recommendations that are specially useful to health managers and developers. We ground our analysis on a large-scale MDCS used for primary care (GeoHealth) and a well-known Privacy Impact Assessment (PIA) methodology. The threat analysis is based on a compilation of relevant privacy threats from the literature as well as brain-storming sessions with privacy and security experts. Among the main findings, we observe that existing MDCSs do not employ adequate controls for achieving transparency and interveinability. Thus, threatening fundamental privacy principles regarded as data quality, right to access and right to object. Furthermore, it is noticeable that although there has been significant research to deal with data security issues, the attention with privacy in its multiple dimensions is prominently lacking.
In this paper, a distributed architecture for the implementation of smart city has been proposed to facilitate various smart features like solid waste management, efficient urban mobility and public transport, smart parking, robust IT connectivity, safety and security of citizens and a roadmap for achieving it. How massive volume of IoT data can be analyzed and a layered architecture of IoT is explained. Why data integration is important for analyzing and processing of data collected by the different smart devices like sensors, actuators and RFIDs is discussed. The wireless sensor network can be used to sense the data from various locations but there has to be more to it than stuffing sensors everywhere for everything. Why only the sensor is not sufficient for data collection and how human beings can be used to collect data is explained. There is some communication protocols between the volunteers engaged in collecting data to restrict the sharing of data and ensure that the target area is covered with minimum numbers of volunteers. Every volunteer should cover some predefined area to collect data. Then the proposed architecture model is having one central server to store all data in a centralized server. The data processing and the processing of query being made by the user is taking place in centralized server.
Free text keystroke dynamics is a behavioral biometric that has the strong potential to offer unobtrusive and continuous user authentication. Unfortunately, due to the limited data availability, free text keystroke dynamics have not been tested adequately. Based on a novel large dataset of free text keystrokes from our ongoing data collection using behavior in natural settings, we present the first study to evaluate keystroke dynamics while respecting the temporal order of the data. Specifically, we evaluate the performance of different ways of forming a test sample using sessions, as well as a form of continuous authentication that is based on a sliding window on the keystroke time series. Instead of accumulating a new test sample of keystrokes, we update the previous sample with keystrokes that occur in the immediate past sliding window of n minutes. We evaluate sliding windows of 1 to 5, 10, and 30 minutes. Our best performer using a sliding window of 1 minute, achieves an FAR of 1% and an FRR of 11.5%. Lastly, we evaluate the sensitivity of the keystroke dynamics algorithm to short quick insider attacks that last only several minutes, by artificially injecting different portions of impostor keystrokes into the genuine test samples. For example, the evaluated algorithm is found to be able to detect insider attacks that last 2.5 minutes or longer, with a probability of 98.4%.
To assure cyber security of an enterprise, typically SIEM (Security Information and Event Management) system is in place to normalize security events from different preventive technologies and flag alerts. Analysts in the security operation center (SOC) investigate the alerts to decide if it is truly malicious or not. However, generally the number of alerts is overwhelming with majority of them being false positive and exceeding the SOC's capacity to handle all alerts. Because of this, potential malicious attacks and compromised hosts may be missed. Machine learning is a viable approach to reduce the false positive rate and improve the productivity of SOC analysts. In this paper, we develop a user-centric machine learning framework for the cyber security operation center in real enterprise environment. We discuss the typical data sources in SOC, their work flow, and how to leverage and process these data sets to build an effective machine learning system. The paper is targeted towards two groups of readers. The first group is data scientists or machine learning researchers who do not have cyber security domain knowledge but want to build machine learning systems for security operations center. The second group of audiences are those cyber security practitioners who have deep knowledge and expertise in cyber security, but do not have machine learning experiences and wish to build one by themselves. Throughout the paper, we use the system we built in the Symantec SOC production environment as an example to demonstrate the complete steps from data collection, label creation, feature engineering, machine learning algorithm selection, model performance evaluations, to risk score generation.
The increasing complexity and ubiquity in user connectivity, computing environments, information content, and software, mobile, and web applications transfers the responsibility of privacy management to the individuals. Hence, making it extremely difficult for users to maintain the intelligent and targeted level of privacy protection that they need and desire, while simultaneously maintaining their ability to optimally function. Thus, there is a critical need to develop intelligent, automated, and adaptable privacy management systems that can assist users in managing and protecting their sensitive data in the increasingly complex situations and environments that they find themselves in. This work is a first step in exploring the development of such a system, specifically how user personality traits and other characteristics can be used to help automate determination of user sharing preferences for a variety of user data and situations. The Big-Five personality traits of openness, conscientiousness, extroversion, agreeableness, and neuroticism are examined and used as inputs into several popular machine learning algorithms in order to assess their ability to elicit and predict user privacy preferences. Our results show that the Big-Five personality traits can be used to significantly improve the prediction of user privacy preferences in a number of contexts and situations, and so using machine learning approaches to automate the setting of user privacy preferences has the potential to greatly reduce the burden on users while simultaneously improving the accuracy of their privacy preferences and security.
In a world where traditional notions of privacy are increasingly challenged by the myriad companies that collect and analyze our data, it is important that decision-making entities are held accountable for unfair treatments arising from irresponsible data usage. Unfortunately, a lack of appropriate methodologies and tools means that even identifying unfair or discriminatory effects can be a challenge in practice. We introduce the unwarranted associations (UA) framework, a principled methodology for the discovery of unfair, discriminatory, or offensive user treatment in data-driven applications. The UA framework unifies and rationalizes a number of prior attempts at formalizing algorithmic fairness. It uniquely combines multiple investigative primitives and fairness metrics with broad applicability, granular exploration of unfair treatment in user subgroups, and incorporation of natural notions of utility that may account for observed disparities. We instantiate the UA framework in FairTest, the first comprehensive tool that helps developers check data-driven applications for unfair user treatment. It enables scalable and statistically rigorous investigation of associations between application outcomes (such as prices or premiums) and sensitive user attributes (such as race or gender). Furthermore, FairTest provides debugging capabilities that let programmers rule out potential confounders for observed unfair effects. We report on use of FairTest to investigate and in some cases address disparate impact, offensive labeling, and uneven rates of algorithmic error in four data-driven applications. As examples, our results reveal subtle biases against older populations in the distribution of error in a predictive health application and offensive racial labeling in an image tagger.
Reliable detection of intrusion is the basis of safety in cognitive radio networks (CRNs). So far, few scholars applied intrusion detection systems (IDSs) to combat intrusion against CRNs. In order to improve the performance of intrusion detection in CRNs, a distributed intrusion detection scheme has been proposed. In this paper, a method base on Dempster-Shafer's (D-S) evidence theory to detect intrusion in CRNs is put forward, in which the detection data and credibility of different local IDS Agent is combined by D-S in the cooperative detection center, so that different local detection decisions are taken into consideration in the final decision. The effectiveness of the proposed scheme is verified by simulation, and the results reflect a noticeable performance improvement between the proposed scheme and the traditional method.
The focus of this paper is to propose an integration between Internet of Things (IoT) and Video Surveillance, with the aim to satisfy the requirements of the future needs of Video Surveillance, and to accomplish a better use. IoT is a new technology in the sector of telecommunications. It is a network that contains physical objects, items, and devices, which are embedded with sensors and software, thus enabling the objects, and allowing for their data exchange. Video Surveillance systems collect and exchange the data which has been recorded by sensors and cameras and send it through the network. This paper proposes an innovative topology paradigm which could offer a better use of IoT technology in Video Surveillance systems. Furthermore, the contribution of these technologies provided by Internet of Things features in dealing with the basic types of Video Surveillance technology with the aim to improve their use and to have a better transmission of video data through the network. Additionally, there is a comparison between our proposed topology and relevant proposed topologies focusing on the security issue.
Most of the social media platforms generate a massive amount of raw data that is slow-paced. On the other hand, Internet Relay Chat (IRC) protocol, which has been extensively used by hacker community to discuss and share their knowledge, facilitates fast-paced and real-time text communications. Previous studies of malicious IRC behavior analysis were mostly either offline or batch processing. This results in a long response time for data collection, pre-processing, and threat detection. However, since the threats can use the latest vulnerabilities to exploit systems (e.g. zero-day attack) and which can spread fast using IRC channels. Current IRC channel monitoring techniques cannot provide the required fast detection and alerting. In this paper, we present an alternative approach to overcome this limitation by providing real-time and autonomic threat detection in IRC channels. We demonstrate the capabilities of our approach using as an example the shadow brokers' leak exploit (the exploit leveraged by WannaCry ransomware attack) that was captured and detected by our framework.
Given a model with multiple input parameters, and multiple possible sources for collecting data for those parameters, a data collection strategy is a way of deciding from which sources to sample data, in order to reduce the variance on the output of the model. Cain and Van Moorsel have previously formulated the problem of optimal data collection strategy, when each arameter can be associated with a prior normal distribution, and when sampling is associated with a cost. In this paper, we present ADaCS, a new tool built as an extension of PRISM, which automatically analyses all possible data collection strategies for a model, and selects the optimal one. We illustrate ADaCS on attack trees, which are a structured approach to analyse the impact and the likelihood of success of attacks and defenses on computer and socio-technical systems. Furthermore, we introduce a new strategy exploration heuristic that significantly improves on a brute force approach.
Energy use of buildings represents roughly 40% of the overall energy consumption. Most of the national agendas contain goals related to reducing the energy consumption and carbon footprint. Timely and accurate fault detection and diagnosis (FDD) in building management systems (BMS) have the potential to reduce energy consumption cost by approximately 15-30%. Most of the FDD methods are data-based, meaning that their performance is tightly linked to the quality and availability of relevant data. Based on our experience, faults and relevant events data is very sparse and inadequate, mostly because of the lack of will and incentive for those that would need to keep track of faults. In this paper we introduce the idea of using crowdsourcing to support FDD data collection processes, and illustrate our idea through a mobile application that has been implemented for this purpose. Furthermore, we propose a strategy of how to successfully deploy this building occupants' crowdsourcing application.
A major component of modern vehicles is the infotainment system, which interfaces with its drivers and passengers. Other mobile devices, such as handheld phones and laptops, can relay information to the embedded infotainment system through Bluetooth and vehicle WiFi. The ability to extract information from these systems would help forensic analysts determine the general contents that is stored in an infotainment system. Based off the data that is extracted, this would help determine what stored information is relevant to law enforcement agencies and what information is non-essential when it comes to solving criminal activities relating to the vehicle itself. This would overall solidify the Intelligent Transport System and Vehicular Ad Hoc Network infrastructure in combating crime through the use of vehicle forensics. Additionally, determining the content of these systems will allow forensic analysts to know if they can determine anything about the end-user directly and/or indirectly.
Honey pots and honey nets are popular tools in the area of network security and network forensics. The deployment and usage of these tools are influenced by a number of technical and legal issues, which need to be carefully considered together. In this paper, we outline privacy issues of honey pots and honey nets with respect to technical aspects. The paper discusses the legal framework of privacy, legal ground to data processing, and data collection. The analysis of legal issues is based on EU law and is supported by discussions on privacy and related issues. This paper is one of the first papers which discuss in detail privacy issues of honey pots and honey nets in accordance with EU law.
The Industrial Internet promises to radically change and improve many industry's daily business activities, from simple data collection and processing to context-driven, intelligent and pro-active support of workers' everyday tasks and life. The present paper first provides insight into a typical industrial internet application architecture, then it highlights one fundamental arising contradiction: “Who owns the data is often not capable of analyzing it”. This statement is explained by imaging a visionary data supply chain that would realize some of the Industrial Internet promises. To concretely implement such a system, recent standards published by The Open Group are presented, where we highlight the characteristics that make them suitable for Industrial Internet applications. Finally, we discuss comparable solutions and concludes with new business use cases.
A database is a vast collection of data which helps us to collect, retrieve, organize and manage the data in an efficient and effective manner. Databases are critical assets. They store client details, financial information, personal files, company secrets and other data necessary for business. Today people are depending more on the corporate data for decision making, management of customer service and supply chain management etc. Any loss, corrupted data or unavailability of data may seriously affect its performance. The database security should provide protected access to the contents of a database and should preserve the integrity, availability, consistency, and quality of the data This paper describes the architecture based on placing the Elliptical curve cryptography module inside database management software (DBMS), just above the database cache. Using this method only selected part of the database can be encrypted instead of the whole database. This architecture allows us to achieve very strong data security using ECC and increase performance using cache.
South Africa's lead-users predilections to tinker and innovate mobile banking services is driven by various constructs. Advanced technologies have made mobile banking services easy to use, attractive and beneficial. While this is welcome news to many, there are concerns that when lead-users tinker with these services, information security risks are exacerbated. The aim of this article is to present an insightful understanding of the demand-side predilections of South Africa's lead-users in such contexts. We assimilate the theories of Usage Control, (UCON), the Theory of Technology Acceptance Model (TAM), and the Theory of Perceived Risk (TPP) to explain predilections over technology. We demonstrate that constructs derived from these theories can explain the general demand-side predilection to tinker with mobile banking services. A quantitative approach was used to test this. From a sample of South African banking lead-users operating in Gauteng province of South Africa, data was collected and analysed with the help of a software package. We found unexpectedly that, lead-users predilections to tinker with mobile banking services was inhibited by perceived risk. Moreover, male lead-users were more domineering in the tinkering process than female lead-users. The implication for this is discussed and explained in the main body of work.