Biblio
Nowadays Internet services have dramatically changed the way people interact with each other and many of our daily activities are supported by those services. Statistical indicators show that more than half of the world's population uses the Internet generating about 2.5 quintillion bytes of data on daily basis. While such a huge amount of data is useful in a number of fields, such as in medical and transportation systems, it also poses unprecedented threats for user's privacy. This is aggravated by the excessive data collection and user profiling activities of service providers. Yet, regulation require service providers to inform users about their data collection and processing practices. The de facto way of informing users about these practices is through the use of privacy policies. Unfortunately, privacy policies suffer from bad readability and other complexities which make them unusable for the intended purpose. To address this issue, we introduce PrivacyGuide, a privacy policy summarization tool inspired by the European Union (EU) General Data Protection Regulation (GDPR) and based on machine learning and natural language processing techniques. Our results show that PrivacyGuide is able to classify privacy policy content into eleven privacy aspects with a weighted average accuracy of 74% and further shed light on the associated risk level with an accuracy of 90%. This article is summarized in: the morning paper an interesting/influential/important paper from the world of CS every weekday morning, as selected by Adrian Colyer