PRETSA: Event Log Sanitization for Privacy-aware Process Discovery
Title | PRETSA: Event Log Sanitization for Privacy-aware Process Discovery |
Publication Type | Conference Paper |
Year of Publication | 2019 |
Authors | Fahrenkrog-Petersen, Stephan A.; van der Aa, Han; Weidlich, Matthias |
Conference Name | 2019 International Conference on Process Mining (ICPM) |
Date Published | June |
Publisher | IEEE |
ISBN Number | 978-1-7281-0919-0 |
Keywords | Analytical models, Business, business data processing, compositionality, data deletion, data mining, data privacy, Data Sanitization, Differential privacy, event data, event log sanitization, Human Behavior, human factors, Information systems, performance-annotated process model, Personnel, PRETSA, privacy, privacy guarantees, privacy-aware process discovery, Privacy-aware Process Mining, privacy-disclosure attacks, process discovery, process execution, process model discovery, pseudonymized employee information, pubcrawl, resilience, Resiliency, Scalability, security of data, sensitive information |
Abstract | Event logs that originate from information systems enable comprehensive analysis of business processes, e.g., by process model discovery. However, logs potentially contain sensitive information about individual employees involved in process execution that is only partially hidden by an obfuscation of the event data. In this paper, we therefore address the risk of privacy-disclosure attacks on event logs with pseudonymized employee information. To this end, we introduce PRETSA, a novel algorithm for event log sanitization that provides privacy guarantees in terms of k-anonymity and t-closeness. It thereby avoids disclosure of employee identities, their membership in the event log, and their characterization based on sensitive attributes, such as performance information. Through step-wise transformations of a prefix-tree representation of an event log, we maintain its high utility for discovery of a performance-annotated process model. Experiments with real-world data demonstrate that sanitization with PRETSA yields event logs of higher utility compared to methods that exploit frequency-based filtering, while providing the same privacy guarantees. |
URL | https://ieeexplore.ieee.org/document/8786060 |
DOI | 10.1109/ICPM.2019.00012 |
Citation Key | fahrenkrog-petersen_pretsa_2019 |
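
The abstract refers to a prefix-tree representation of an event log and to k-anonymity. As a rough illustration of how those two ideas fit together, the sketch below counts how many traces share each activity prefix and reports the prefixes followed by fewer than k traces. It is not the PRETSA algorithm: it performs none of the step-wise transformations and no t-closeness check, and the function names, the toy log, and the flat dictionary used in place of an explicit tree are all assumptions made for this example.

```python
from collections import defaultdict


def build_prefix_counts(traces):
    """Count how many traces pass through each activity prefix.

    Each key is a tuple of activities, i.e. one node of a prefix tree
    stored flat in a dictionary (an illustrative simplification).
    """
    counts = defaultdict(int)
    for trace in traces:
        for end in range(1, len(trace) + 1):
            counts[tuple(trace[:end])] += 1
    return dict(counts)


def violating_prefixes(traces, k):
    """Return the prefixes shared by fewer than k traces, sorted."""
    counts = build_prefix_counts(traces)
    return sorted(prefix for prefix, n in counts.items() if n < k)


# Hypothetical toy log: one list of activity labels per case.
toy_log = [
    ["a", "b", "c"],
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["a", "e"],
]

prefix_counts = build_prefix_counts(toy_log)
rare = violating_prefixes(toy_log, k=2)
```

With k = 2, the prefixes ending in "d" and "e" are each followed by a single case, so they are the ones a sanitization step would have to handle. A frequency-based filter would simply drop those cases, which is the baseline the abstract compares against.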