Visible to the public Biblio

Filters: Author is Hofmann, Peter  [Clear All Filters]
2019-11-11
Tesfay, Welderufael B., Hofmann, Peter, Nakamura, Toru, Kiyomoto, Shinsaku, Serna, Jetzabel.  2018.  I Read but Don'T Agree: Privacy Policy Benchmarking Using Machine Learning and the EU GDPR. Companion Proceedings of the The Web Conference 2018. :163–166.
With the continuing growth of the Internet landscape, users share large amount of personal, sometimes, privacy sensitive data. When doing so, often, users have little or no clear knowledge about what service providers do with the trails of personal data they leave on the Internet. While regulations impose rather strict requirements that service providers should abide by, the defacto approach seems to be communicating data processing practices through privacy policies. However, privacy policies are long and complex for users to read and understand, thus failing their mere objective of informing users about the promised data processing behaviors of service providers. To address this pertinent issue, we propose a machine learning based approach to summarize the rather long privacy policy into short and condensed notes following a risk-based approach and using the European Union (EU) General Data Protection Regulation (GDPR) aspects as assessment criteria. The results are promising and indicate that our tool can summarize lengthy privacy policies in a short period of time, thus supporting users to take informed decisions regarding their information disclosure behaviors.
2019-02-14
Tesfay, Welderufael B., Hofmann, Peter, Nakamura, Toru, Kiyomoto, Shinsaku, Serna, Jetzabel.  2018.  PrivacyGuide: Towards an Implementation of the EU GDPR on Internet Privacy Policy Evaluation. Proceedings of the Fourth ACM International Workshop on Security and Privacy Analytics. :15-21.

Nowadays Internet services have dramatically changed the way people interact with each other and many of our daily activities are supported by those services. Statistical indicators show that more than half of the world's population uses the Internet generating about 2.5 quintillion bytes of data on daily basis. While such a huge amount of data is useful in a number of fields, such as in medical and transportation systems, it also poses unprecedented threats for user's privacy. This is aggravated by the excessive data collection and user profiling activities of service providers. Yet, regulation require service providers to inform users about their data collection and processing practices. The de facto way of informing users about these practices is through the use of privacy policies. Unfortunately, privacy policies suffer from bad readability and other complexities which make them unusable for the intended purpose. To address this issue, we introduce PrivacyGuide, a privacy policy summarization tool inspired by the European Union (EU) General Data Protection Regulation (GDPR) and based on machine learning and natural language processing techniques. Our results show that PrivacyGuide is able to classify privacy policy content into eleven privacy aspects with a weighted average accuracy of 74% and further shed light on the associated risk level with an accuracy of 90%. This article is summarized in: the morning paper an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer