PRETSA: Event Log Sanitization for Privacy-aware Process Discovery

Title: PRETSA: Event Log Sanitization for Privacy-aware Process Discovery
Publication Type: Conference Paper
Year of Publication: 2019
Authors: Fahrenkrog-Petersen, Stephan A.; van der Aa, Han; Weidlich, Matthias
Conference Name: 2019 International Conference on Process Mining (ICPM)
Date Published: June
Publisher: IEEE
ISBN Number: 978-1-7281-0919-0
Keywords: Analytical models, Business, business data processing, compositionality, data deletion, data mining, data privacy, Data Sanitization, Differential privacy, event data, event log sanitization, Human Behavior, human factors, Information systems, performance-annotated process model, Personnel, PRETSA, privacy, privacy guarantees, privacy-aware process discovery, Privacy-aware Process Mining, privacy-disclosure attacks, process discovery, process execution, process model discovery, pseudonymized employee information, pubcrawl, resilience, Resiliency, Scalability, security of data, sensitive information
Abstract

Event logs that originate from information systems enable comprehensive analysis of business processes, e.g., by process model discovery. However, logs potentially contain sensitive information about individual employees involved in process execution that are only partially hidden by an obfuscation of the event data. In this paper, we therefore address the risk of privacy-disclosure attacks on event logs with pseudonymized employee information. To this end, we introduce PRETSA, a novel algorithm for event log sanitization that provides privacy guarantees in terms of k-anonymity and t-closeness. It thereby avoids disclosure of employee identities, their membership in the event log, and their characterization based on sensitive attributes, such as performance information. Through step-wise transformations of a prefix-tree representation of an event log, we maintain its high utility for discovery of a performance-annotated process model. Experiments with real-world data demonstrate that sanitization with PRETSA yields event logs of higher utility compared to methods that exploit frequency-based filtering, while providing the same privacy guarantees.
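As a rough illustration of the k-anonymity idea on a prefix-tree view of an event log, the sketch below counts how many traces share each activity prefix and reports the prefixes followed by fewer than k traces. It is not the PRETSA algorithm: it only detects violations and does not attempt the step-wise transformations or the t-closeness check that the abstract mentions. The function names, the toy log, and the activity labels are all assumptions made for this example.

```python
from collections import Counter


def prefix_counts(log):
    """Count how many traces share each activity prefix.

    The resulting mapping is a flattened prefix tree: each key is the path
    from the root to a node, each value the number of traces through it.
    """
    counts = Counter()
    for trace in log:
        for i in range(1, len(trace) + 1):
            counts[tuple(trace[:i])] += 1
    return counts


def violating_prefixes(log, k):
    """Return the prefixes that fewer than k traces follow (hypothetical helper)."""
    counts = prefix_counts(log)
    return sorted(p for p, n in counts.items() if n < k)


# Toy event log (invented data): one list of activities per case.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "escalate", "approve"],
]

violations = violating_prefixes(toy_log, k=2)
for prefix in violations:
    print(" -> ".join(prefix))
```

With k = 2, the rare "reject" and "escalate" branches are flagged, since each is followed by a single case and could therefore single out the employee involved.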

URL: https://ieeexplore.ieee.org/document/8786060
DOI: 10.1109/ICPM.2019.00012
Citation Key: fahrenkrog-petersen_pretsa_2019