Biblio
The purpose of the General Data Protection Regulation (GDPR) is to provide improved privacy protection. If an app controls personal data from users, it needs to be compliant with GDPR. However, GDPR lists general rules rather than exact step-by-step guidelines about how to develop an app that fulfills the requirements. Therefore, there may exist GDPR compliance violations in existing apps, which would pose severe privacy threats to app users. In this paper, we take mobile health applications (mHealth apps) as a peephole to examine the status quo of GDPR compliance in Android apps. We first propose an automated system, named HPDROID, to bridge the semantic gap between the general rules of GDPR and the app implementations by identifying the data practices declared in the app privacy policy and the data relevant behaviors in the app code. Then, based on HPDROID, we detect three kinds of GDPR compliance violations, including the incompleteness of privacy policy, the inconsistency of data collections, and the insecurity of data transmission. We perform an empirical evaluation of 796 mHealth apps. The results reveal that 189 (23.7%) of them do not provide complete privacy policies. Moreover, 59 apps collect sensitive data through different measures, but 46 (77.9%) of them contain at least one inconsistent collection behavior. Even worse, among the 59 apps, only 8 apps try to ensure the transmission security of collected data. However, all of them contain at least one encryption or SSL misuse. Our work exposes severe privacy issues to raise awareness of privacy protection for app users and developers.
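The inconsistency check described in this abstract can be pictured as a set comparison between what the policy declares and what the code does. The following is only a minimal sketch under that assumption, not HPDROID's implementation; the function name `find_inconsistencies` and the data-type labels are hypothetical.

```python
def find_inconsistencies(declared, observed):
    """Return data types the code collects but the policy does not declare."""
    return sorted(set(observed) - set(declared))


# Hypothetical inputs: labels extracted from a privacy policy and from app code.
declared_in_policy = {"email", "heart_rate"}
observed_in_code = {"email", "heart_rate", "location"}

# Any label present in the code but missing from the policy is flagged.
undeclared = find_inconsistencies(declared_in_policy, observed_in_code)
print(undeclared)
```

An app would count as inconsistent in this simplified view whenever the returned list is non-empty.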
Smart industrial control systems (e.g., smart grid, oil and gas systems, transportation systems) are connected to the internet, and have the capability to collect and transmit data; as such, they are part of the IoT. The data collected can be used to improve services; however, there are serious privacy risks. This concern is usually addressed by means of privacy policies, but it is often difficult to understand the scope and consequences of such policies. Better tools to visualise and analyse data collection policies are needed. Graph-based modelling tools have been used to analyse complex systems in other domains. In this paper, we apply this technique to IoT data-collection policy analysis and visualisation. We describe graphical representations of category-based data collection policies and show that a graph-based policy language is a powerful tool not only to specify and visualise the policy, but also to analyse policy properties. We illustrate the approach with a simple example in the context of a chemical plant with a truck monitoring system. We also consider policy administration: we propose a classification of queries to help administrators analyse policies, and we show how the queries can be answered using our technique.
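One way to picture a query over a graph-based policy is as a reachability question. The sketch below assumes a simple directed graph from services to data categories to data items; the node names and the `reachable` helper are hypothetical and do not reproduce the paper's policy language.

```python
from collections import deque


def reachable(graph, start):
    """Return every node reachable from `start` by following directed edges."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


# Hypothetical policy graph: service -> data category -> data item.
policy_graph = {
    "truck_monitoring": ["logistics_data"],
    "logistics_data": ["gps_position", "arrival_time"],
    "plant_safety": ["sensor_data"],
    "sensor_data": ["tank_pressure"],
}

# Administrator query: what may the truck monitoring service collect?
collectable = reachable(policy_graph, "truck_monitoring")
print(sorted(collectable))
```

Under this assumed structure, a breadth-first traversal answers "what can this service collect?" without enumerating individual rules.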
Nowadays, Internet services have dramatically changed the way people interact with each other, and many of our daily activities are supported by those services. Statistical indicators show that more than half of the world's population uses the Internet, generating about 2.5 quintillion bytes of data on a daily basis. While such a huge amount of data is useful in a number of fields, such as medical and transportation systems, it also poses unprecedented threats to users' privacy. This is aggravated by the excessive data collection and user profiling activities of service providers. Yet, regulations require service providers to inform users about their data collection and processing practices. The de facto way of informing users about these practices is through privacy policies. Unfortunately, privacy policies suffer from poor readability and other complexities that make them unusable for the intended purpose. To address this issue, we introduce PrivacyGuide, a privacy policy summarization tool inspired by the European Union (EU) General Data Protection Regulation (GDPR) and based on machine learning and natural language processing techniques. Our results show that PrivacyGuide is able to classify privacy policy content into eleven privacy aspects with a weighted average accuracy of 74% and further shed light on the associated risk level with an accuracy of 90%.
Many government organizations in Libya have started transferring traditional government services to e-government. These e-services will benefit a wide range of the public. However, the deployment of e-government brings many new security issues. Attackers can take advantage of vulnerabilities in these e-services to conduct cyber attacks that result in data loss, service interruptions, privacy loss, financial loss, and other significant losses. The number of vulnerabilities in e-services has increased due to the complexity of e-service systems, a lack of secure programming practices, misconfiguration of systems, web application vulnerabilities, and failure to stay up to date with security patches. Unfortunately, few studies have assessed the current security level of Libyan government websites. Therefore, this study aims to assess the current security of 16 Libyan government websites using a penetration testing framework. In this assessment, no exploits were committed or attempted on the websites. The penetration testing (pen test) framework has four main phases (Reconnaissance, Scanning, Enumeration, and Vulnerability Assessment), followed by an SSL encryption evaluation. The aim of a security assessment is to discover vulnerabilities that could be exploited by attackers. We also conducted a Content Analysis phase for all websites. In this phase, we searched the government websites for information on the implementation of security and privacy policies. The aim is to determine whether the websites reflect awareness of currently accepted standards for security and privacy. From our security assessment of the 16 Libyan government websites, we compared the websites based on the number of vulnerabilities found and the level of security policies. We found only 9 websites with high and medium vulnerabilities. Many of these vulnerabilities are due to outdated software and systems, misconfiguration of systems, and failure to apply the latest security patches.
These vulnerabilities could be used by attackers to compromise and damage the systems. We also found that 5 websites did not implement any SSL encryption for data transactions. Lastly, only 2 websites have published security and privacy policies. This seems to indicate that these websites were not concerned with current standards in security and privacy. Finally, we classify the 16 websites into 4 safety categories: highly unsafe, unsafe, somewhat unsafe, and safe. We found only 1 website with a highly unsafe ranking. Based on our findings, we conclude that the security level of the Libyan government websites is adequate but can be further improved. However, immediate action needs to be taken to mitigate possible cyber attacks by fixing the vulnerabilities and implementing SSL encryption. The websites also need to publish their security and privacy policies so that users can trust them.
Requirements analysts can model regulated data practices to identify and reason about risks of noncompliance. If terminology is inconsistent or ambiguous, however, these models and their conclusions will be unreliable. To study this problem, we investigated an approach to automatically construct an information type ontology by identifying information type hyponymy in privacy policies using Tregex patterns. Tregex is a utility to match regular expressions against constituency parse trees, which are hierarchical expressions of natural language clauses, including noun and verb phrases. We discovered the Tregex patterns by applying content analysis to 30 privacy policies from six domains (shopping, telecommunication, social networks, employment, health, and news). From this dataset, three semantic and four lexical categories of hyponymy emerged based on category completeness and word order. Among these, we identified and empirically evaluated 72 Tregex patterns to automate the extraction of hyponyms from privacy policies. The patterns match information type hyponyms with an average precision of 0.72 and recall of 0.74.
This paper argues that standard privacy policy principles are unsuitable for wearable devices, and introduces a proposal to test the role of digital literacy on privacy concerns and behaviors, in an effort to devise modified privacy policies that are appropriate for wearable devices.
Android privacy control is an important but difficult problem to solve. Previously, there was much research effort either focusing on extending the Android permission model with better policies or modifying the Android framework for fine-grained access control. In this work, we take an integral approach by designing and implementing SweetDroid, a calling-context-sensitive privacy policy enforcement framework. SweetDroid combines automated policy generation with automated policy enforcement. The automatically generated policies in SweetDroid are based on the calling contexts of privacy sensitive APIs; hence, SweetDroid is able to tell whether a particular API (e.g., getLastKnownLocation) under a certain execution path is leaking private information. The policy enforcement in SweetDroid is also fine-grained - it is at the individual API level, not at the permission level. We implement and evaluate the system based on thousands of Android apps, including those from a third-party market and malicious apps from VirusTotal. Our experiment results show that SweetDroid can successfully distinguish and enforce different privacy policies based on calling contexts, and the current design is both developer hassle-free and user transparent. SweetDroid is also efficient because it only introduces small storage and computational overhead.
Many organizations process personal information in the course of normal operations. Improper disclosure of this information can be damaging, so organizations must obey privacy laws and regulations that impose restrictions on its release or risk penalties. Since electronic management of personal information must be held in strict compliance with the law, software systems designed for such purposes must have some guarantee of compliance. To support this, we develop a general methodology for designing and implementing verifiable information systems. This paper develops the design of the History Aware Programming Language into a framework for creating systems that can be mechanically checked against privacy specifications. We apply this framework to create and verify a prototypical Electronic Medical Record System (EMRS) expressed as a set of actor components and first-order linear temporal logic specifications in assume-guarantee form. We then show that the implementation of the EMRS provably enforces a formalized Health Insurance Portability and Accountability Act (HIPAA) policy using a combination of model checking and static analysis techniques.
Big data is now considered one of the major technologies for managing huge volumes of data, but privacy receives little consideration in big data platforms. Indeed, developers do not focus on implementing security best practices in their programs to protect personal and sensitive data, and organizations can face financial loss because of noncompliance with applicable regulations. In this paper, we propose a solution that integrates privacy policies written in XACML (eXtensible Access Control Markup Language) into an access control solution for NoSQL databases. Our solution can be used for NoSQL data stores that do not include many access control features. It aims to ensure fine-grained access control with purpose as the main parameter. We focus on access control at the document level and apply this approach to MongoDB, which is the most used NoSQL data store.
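Purpose-based access control at the document level can be illustrated with a small filter. This is only a sketch under stated assumptions: a plain Python dictionary stands in for an XACML policy decision, in-memory dictionaries stand in for MongoDB documents, and the `allowed_purposes` field, the roles, and the function name are all hypothetical.

```python
def authorized_documents(documents, role, purpose, policy):
    """Return the documents a role may read for a given purpose.

    Access is denied outright if the role is not permitted to act for the
    purpose; otherwise only documents tagged with that purpose are returned.
    """
    if purpose not in policy.get(role, set()):
        return []
    return [doc for doc in documents if purpose in doc.get("allowed_purposes", ())]


# Hypothetical policy: which purposes each role may invoke.
policy = {"nurse": {"treatment"}, "analyst": {"research"}}

# Hypothetical documents, each tagged with the purposes its owner consented to.
records = [
    {"_id": 1, "patient": "A", "allowed_purposes": ["treatment", "research"]},
    {"_id": 2, "patient": "B", "allowed_purposes": ["treatment"]},
]

visible = authorized_documents(records, "analyst", "research", policy)
print([doc["_id"] for doc in visible])
```

The two-step check (role may use the purpose, document permits the purpose) is what makes the decision document-level rather than collection-level in this simplified view.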
Security and privacy of big data become challenging as data grows and becomes accessible to more and more clients. Large-scale data storage is becoming a necessity for healthcare, business segments, government departments, scientific endeavors, and individuals. Our research focuses on privacy and security and on how to ensure that big data is secured. Managing security policy is a challenge that our framework will handle for big data. Privacy policy needs to be integrated, flexible, context-aware, and customizable. We will build a framework that receives data from customers, analyzes the received data, extracts the privacy policy, and then identifies the sensitive data. In this paper, we present the privacy policy techniques that will be used in our framework.
A fundamental problem in the specification of regulatory privacy policies such as the Health Insurance Portability and Accountability Act (HIPAA) in a computer system is to state the policies precisely, consistent with their high-level intuition. In this paper, we propose UML sequence diagrams as a practical means to graphically express privacy policies. A graphical representation allows decision-makers such as application domain experts and security architects to easily verify and confirm the expected behavior. Once the diagrams are intuitively confirmed, our algorithmic approach formalizes the semantics of sequence diagrams in terms of linear temporal logic (LTL) templates. In all the templates, different semantic aspects are expressed as separate, yet simple LTL formulas that can be composed to define the complex semantics of sequence diagrams. The formalization enables us to leverage the analytical powers of automated decision procedures for LTL formulas to determine whether a collection of sequence diagrams is consistent, independent, etc., and also to verify whether a system design conforms to the privacy policies. We evaluate our approach by modeling and analyzing a substantial subset of HIPAA rules using sequence diagrams.
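To illustrate how separate, simple LTL formulas can be composed, consider a single message m with a send event s_m and a receive event r_m. The formulas below are an illustrative assumption in standard LTL notation, not the templates defined in the paper.

```latex
% Assumed atomic propositions: s_m (message m is sent), r_m (message m is received).
\begin{align*}
  \varphi_{\text{order}}    &= (\neg r_m) \;\mathcal{U}\; s_m
      && \text{(no receive before the send)} \\
  \varphi_{\text{delivery}} &= \mathbf{G}\,\bigl(s_m \rightarrow \mathbf{F}\, r_m\bigr)
      && \text{(every send is eventually received)} \\
  \varphi_m                 &= \varphi_{\text{order}} \wedge \varphi_{\text{delivery}}
      && \text{(composed semantics for message } m\text{)}
\end{align*}
```

Each conjunct captures one semantic aspect, and a diagram with several messages would be described by the conjunction of such per-message formulas together with ordering constraints between them.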
Requirements analysts can model regulated data practices to identify and reason about risks of noncompliance. If terminology is inconsistent or ambiguous, however, these models and their conclusions will be unreliable. To study this problem, we investigated an approach to automatically construct an information type ontology by identifying information type hyponymy in privacy policies using Tregex patterns. Tregex is a utility to match regular expressions against constituency parse trees, which are hierarchical expressions of natural language clauses, including noun and verb phrases. We discovered the Tregex patterns by applying content analysis to 15 privacy policies from three domains (shopping, telecommunication and social networks) to identify all instances of information type hyponymy. From this dataset, three semantic and four syntactic categories of hyponymy emerged based on category completeness and word-order. Among these, we identified and empirically evaluated 26 Tregex patterns to automate the extraction of hyponyms from privacy policies. The patterns identify information type hypernym-hyponym pairs with an average precision of 0.83 and recall of 0.52 across our dataset of 15 policies.
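The idea of pattern-based hyponym extraction can be sketched with an ordinary regular expression. Note the substitution: the abstract's approach matches Tregex patterns against constituency parse trees, whereas this sketch matches a single "X such as A, B, and C" pattern against raw text and takes at most two words before "such as" as the hypernym. The pattern, the function name, and the example sentence are hypothetical.

```python
import re

# Hypothetical surface pattern: "<hypernym> such as <item>, <item>, and <item>".
# The hypernym is approximated as the one or two words preceding "such as".
HYPONYMY = re.compile(
    r"\b(?P<hypernym>\w+(?: \w+)?) such as (?P<hyponyms>[^.;]+)",
    re.IGNORECASE,
)

# List separators: commas (optionally followed by "and"/"or") or a bare "and"/"or".
SEPARATORS = re.compile(r",\s*(?:(?:and|or)\s+)?|\s+(?:and|or)\s+")


def extract_pairs(sentence):
    """Return (hypernym, hyponym) pairs found by the surface pattern."""
    match = HYPONYMY.search(sentence)
    if match is None:
        return []
    hypernym = match.group("hypernym").lower()
    hyponyms = [h.strip().lower() for h in SEPARATORS.split(match.group("hyponyms"))]
    return [(hypernym, h) for h in hyponyms if h]


pairs = extract_pairs(
    "We collect contact information such as email address, phone number, and postal address."
)
print(pairs)
```

A surface pattern like this misses hypernyms longer than two words and cannot see phrase boundaries, which is the kind of limitation that matching against parse trees is meant to address.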
When customers purchase a product or sign up for service from a company, they often are required to agree to a Privacy Policy or Terms of Service agreement. Many of these policies are lengthy, and a typical customer agrees to them without reading them carefully if at all. To address this problem, we have developed a prototype automatic text summarization system which is specifically designed for privacy policies. Our system generates a summary of a policy statement by identifying important sentences from the statement, categorizing these sentences by which of 5 "statement categories" the sentence addresses, and displaying to a user a list of the sentences which match each category. Our system incorporates keywords identified by a human domain expert and rules that were obtained by machine learning, and they are combined in an ensemble architecture. We have tested our system on a sample corpus of privacy statements, and preliminary results are promising.
Unmanned Aerial Systems (UAS) have recently raised considerable privacy concerns. A practical method to protect privacy is needed for adopting UAS in civilian airspace. This paper examines privacy policies, filtering strategies, and existing techniques, and then proposes a novel method based on an encrypted video stream and cloud-based privacy servers. In this scheme, all video surveillance images are initially encrypted and then delivered to a privacy server. The privacy server decrypts the video using the key it shares with the camera and filters the image according to the privacy policy specified for the surveyed region. The sanitized video is delivered to the surveillance operator or to anyone on the Internet who is authorized. In a larger system composed of multiple cameras and multiple privacy servers, the keys can be distributed using the Kerberos protocol. With this method, the privacy policy can be changed on demand in real time, and there is no need for a costly on-board processing unit. By utilizing the cloud-based servers, advanced image processing algorithms and new filtering algorithms can be applied immediately without upgrading the camera software. This method is cost-efficient and promotes video sharing among multiple subscribers, and can thus spur wide adoption.