Analyzing Metadata in PDF Files Published by Police Agencies in Japan

Title: Analyzing Metadata in PDF Files Published by Police Agencies in Japan
Publication Type: Conference Paper
Year of Publication: 2022
Authors: Hasegawa, Taichi; Saito, Taiichi; Sasaki, Ryoichi
Conference Name: 2022 IEEE 22nd International Conference on Software Quality, Reliability, and Security Companion (QRS-C)
Keywords: component, composability, compositionality, Computer architecture, Data Sanitization, hidden data, law enforcement, metadata, Open Data, Organizations, PDF files, pubcrawl, resilience, Resiliency, sanitization, security, software quality, software reliability, targeted attacks
Abstract: In recent years, a new type of cyber attack called the targeted attack has been observed. Unlike conventional large-scale attacks, which do not focus on specific targets, targeted attacks aim at specific organizations or individuals. Organizations have published many Word and PDF files on their websites. These files may provide a starting point for targeted attacks if they include hidden data unintentionally generated in the authoring process. Adhatarao and Lauradoux analyzed hidden data in PDF files published by security agencies in many countries and showed that many PDF files potentially leak information such as author names and details of the information system and computer architecture. In this study, we analyze hidden data in PDF files published on the websites of police agencies in Japan and compare the results with those of Adhatarao and Lauradoux. We gathered 110,989 PDF files. 56% of the gathered PDF files contain personal names, organization names, usernames, or numbers that appear to be IDs within the organizations. 96% of the PDF files contain software names.
Notes: ISSN: 2693-9371
DOI: 10.1109/QRS-C57518.2022.00029
Citation Key: hasegawa_analyzing_2022