Visible to the public Biblio

Found 5182 results

Filters: First Letter Of Last Name is S  [Clear All Filters]
2020-04-06
Zhou, Yejun, Qiu, Lede, Yu, Hang, Sun, Chunhui.  2018.  Study on Security Technology of Internet of Things Based on Network Coding. 2018 IEEE Third International Conference on Data Science in Cyberspace (DSC). :353–357.
Along with the continuous progress of the information technology, Internet of Things is the inevitable way for realizing the fusion of communication and traditional network technology. Network coding, an important breakthrough in the field of communication, has many applied advantages in information network. This article analyses the eavesdropping problem of Internet of Things and presents an information secure network coding scheme against the eavesdropping adversaries. We show that, if the number of links the adversaries can eavesdrop on is less than the max-flow of a network, the proposed coding scheme not only `achieves the prefect information secure condition but also the max-flow of the network.
2020-04-03
Zhou, Hai, Rezaei, Amin, Shen, Yuanqi.  2019.  Resolving the Trilemma in Logic Encryption. 2019 IEEE/ACM International Conference on Computer-Aided Design (ICCAD). :1—8.

Logic encryption, a method to lock a circuit from unauthorized use unless the correct key is provided, is the most important technique in hardware IP protection. However, with the discovery of the SAT attack, all traditional logic encryption algorithms are broken. New algorithms after the SAT attack are all vulnerable to structural analysis unless a provable obfuscation is applied to the locked circuit. But there is no provable logic obfuscation available, in spite of some vague resorting to logic resynthesis. In this paper, we formulate and discuss a trilemma in logic encryption among locking robustness, structural security, and encryption efficiency, showing that pre-SAT approaches achieve only structural security and encryption efficiency, and post-SAT approaches achieve only locking robustness and encryption efficiency. There is also a dilemma between query complexity and error number in locking. We first develop a theory and solution to the dilemma in locking between query complexity and error number. Then, we provide a provable obfuscation solution to the dilemma between structural security and locking robustness. We finally present and discuss some results towards the resolution of the trilemma in logic encryption.

Šišejković, Dominik, Merchant, Farhad, Leupers, Rainer, Ascheid, Gerd, Kiefer, Volker.  2019.  A Critical Evaluation of the Paradigm Shift in the Design of Logic Encryption Algorithms. 2019 International Symposium on VLSI Design, Automation and Test (VLSI-DAT). :1—4.
The globalization of the integrated circuit supply chain has given rise to major security concerns ranging from intellectual property piracy to hardware Trojans. Logic encryption is a promising solution to tackle these threats. Recently, a Boolean satisfiability attack capable of unlocking existing logic encryption techniques was introduced. This attack initiated a paradigm shift in the design of logic encryption algorithms. However, recent approaches have been strongly focusing on low-cost countermeasures that unfortunately lead to low functional and structural corruption. In this paper, we show that a simple approach can offer provable security and more than 99% corruption if a higher area overhead is accepted. Our results strongly suggest that future proposals should consider higher overheads or more realistic circuit sizes for the evaluation of modern logic encryption algorithms.
Kuznetsov, Alexandr, Kiian, Anastasiia, Gorbenko, Yurii, Smirnov, Oleksii, Cherep, Oleksandr, Bexhter, Liliia.  2019.  Code-based Pseudorandom Generator for the Post-Quantum Period. 2019 IEEE International Conference on Advanced Trends in Information Theory (ATIT). :204—209.
This paper focuses on research of a provably secure code-based pseudorandom sequence generators whose cryptanalysis problem equals to syndrome decoding (belonging to the NP-complex class). It was found that generated sequences of such well-known Fischer-Stern code-based generator don’t have a maximum period, the actual period is much lower than expected. In our work, we have created a new generator scheme. It retains all advantages of the Fisher-Stern algorithm and provides pseudorandom sequences which are formed with maximum period. Also comparative analysis of proposed generator and popular generators was conducted.
Song, Liwei, Shokri, Reza, Mittal, Prateek.  2019.  Membership Inference Attacks Against Adversarially Robust Deep Learning Models. 2019 IEEE Security and Privacy Workshops (SPW). :50—56.
In recent years, the research community has increasingly focused on understanding the security and privacy challenges posed by deep learning models. However, the security domain and the privacy domain have typically been considered separately. It is thus unclear whether the defense methods in one domain will have any unexpected impact on the other domain. In this paper, we take a step towards enhancing our understanding of deep learning models when the two domains are combined together. We do this by measuring the success of membership inference attacks against two state-of-the-art adversarial defense methods that mitigate evasion attacks: adversarial training and provable defense. On the one hand, membership inference attacks aim to infer an individual's participation in the target model's training dataset and are known to be correlated with target model's overfitting. On the other hand, adversarial defense methods aim to enhance the robustness of target models by ensuring that model predictions are unchanged for a small area around each sample in the training dataset. Intuitively, adversarial defenses may rely more on the training dataset and be more vulnerable to membership inference attacks. By performing empirical membership inference attacks on both adversarially robust models and corresponding undefended models, we find that the adversarial training method is indeed more susceptible to membership inference attacks, and the privacy leakage is directly correlated with model robustness. We also find that the provable defense approach does not lead to enhanced success of membership inference attacks. However, this is achieved by significantly sacrificing the accuracy of the model on benign data points, indicating that privacy, security, and prediction accuracy are not jointly achieved in these two approaches.
Cheang, Kevin, Rasmussen, Cameron, Seshia, Sanjit, Subramanyan, Pramod.  2019.  A Formal Approach to Secure Speculation. 2019 IEEE 32nd Computer Security Foundations Symposium (CSF). :288—28815.
Transient execution attacks like Spectre, Meltdown and Foreshadow have shown that combinations of microarchitectural side-channels can be synergistically exploited to create side-channel leaks that are greater than the sum of their parts. While both hardware and software mitigations have been proposed against these attacks, provable security has remained elusive. This paper introduces a formal methodology for enabling secure speculative execution on modern processors. We propose a new class of information flow security properties called trace property-dependent observational determinism (TPOD). We use this class to formulate a secure speculation property. Our formulation precisely characterises all transient execution vulnerabilities. We demonstrate its applicability by verifying secure speculation for several illustrative programs.
Hirose, Shoichi, Shikata, Junji.  2019.  Provable Security of the Ma-Tsudik Forward-Secure Sequential Aggregate MAC Scheme. 2019 Seventh International Symposium on Computing and Networking Workshops (CANDARW). :327—332.
Considering application to communication among wireless sensors, Ma and Tsudik introduced the notion of forward-secure sequential aggregate (FssAgg) authentication in 2007. They also proposed an FssAgg MAC scheme composed of a MAC function and cryptographic hash functions at the same time. The security of their proposed scheme has not been analyzed yet and remains open. It is shown in this paper that a slight variant of the Ma-Tsudik FssAgg MAC scheme is secure under reasonable and standard assumptions on security of the underlying primitives. An efficient instantiation of the underlying MAC function using a cryptographic hash function is also discussed.
Zhao, Hui, Li, Zhihui, Wei, Hansheng, Shi, Jianqi, Huang, Yanhong.  2019.  SeqFuzzer: An Industrial Protocol Fuzzing Framework from a Deep Learning Perspective. 2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST). :59—67.

Industrial networks are the cornerstone of modern industrial control systems. Performing security checks of industrial communication processes helps detect unknown risks and vulnerabilities. Fuzz testing is a widely used method for performing security checks that takes advantage of automation. However, there is a big challenge to carry out security checks on industrial network due to the increasing variety and complexity of industrial communication protocols. In this case, existing approaches usually take a long time to model the protocol for generating test cases, which is labor-intensive and time-consuming. This becomes even worse when the target protocol is stateful. To help in addressing this problem, we employed a deep learning model to learn the structures of protocol frames and deal with the temporal features of stateful protocols. We propose a fuzzing framework named SeqFuzzer which automatically learns the protocol frame structures from communication traffic and generates fake but plausible messages as test cases. For proving the usability of our approach, we applied SeqFuzzer to widely-used Ethernet for Control Automation Technology (EtherCAT) devices and successfully detected several security vulnerabilities.

Aires Urquiza, Abraão, AlTurki, Musab A., Kanovich, Max, Ban Kirigin, Tajana, Nigam, Vivek, Scedrov, Andre, Talcott, Carolyn.  2019.  Resource-Bounded Intruders in Denial of Service Attacks. 2019 IEEE 32nd Computer Security Foundations Symposium (CSF). :382—38214.

Denial of Service (DoS) attacks have been a serious security concern, as no service is, in principle, protected against them. Although a Dolev-Yao intruder with unlimited resources can trivially render any service unavailable, DoS attacks do not necessarily have to be carried out by such (extremely) powerful intruders. It is useful in practice and more challenging for formal protocol verification to determine whether a service is vulnerable even to resource-bounded intruders that cannot generate or intercept arbitrary large volumes of traffic. This paper proposes a novel, more refined intruder model where the intruder can only consume at most some specified amount of resources in any given time window. Additionally, we propose protocol theories that may contain timeouts and specify service resource usage during protocol execution. In contrast to the existing resource-conscious protocol verification models, our model allows finer and more subtle analysis of DoS problems. We illustrate the power of our approach by representing a number of classes of DoS attacks, such as, Slow, Asymmetric and Amplification DoS attacks, exhausting different types of resources of the target, such as, number of workers, processing power, memory, and network bandwidth. We show that the proposed DoS problem is undecidable in general and is PSPACE-complete for the class of resource-bounded, balanced systems. Finally, we implemented our formal verification model in the rewriting logic tool Maude and analyzed a number of DoS attacks in Maude using Rewriting Modulo SMT in an automated fashion.

Sadique, Farhan, Bakhshaliyev, Khalid, Springer, Jeff, Sengupta, Shamik.  2019.  A System Architecture of Cybersecurity Information Exchange with Privacy (CYBEX-P). 2019 IEEE 9th Annual Computing and Communication Workshop and Conference (CCWC). :0493—0498.
Rapid evolution of cyber threats and recent trends in the increasing number of cyber-attacks call for adopting robust and agile cybersecurity techniques. Cybersecurity information sharing is expected to play an effective role in detecting and defending against new attacks. However, reservations and or-ganizational policies centering the privacy of shared data have become major setbacks in large-scale collaboration in cyber defense. The situation is worsened by the fact that the benefits of cyber-information exchange are not realized unless many actors participate. In this paper, we argue that privacy preservation of shared threat data will motivate entities to share threat data. Accordingly, we propose a framework called CYBersecurity information EXchange with Privacy (CYBEX-P) to achieve this. CYBEX-P is a structured information sharing platform with integrating privacy-preserving mechanisms. We propose a complete system architecture for CYBEX-P that guarantees maximum security and privacy of data. CYBEX-P outlines the details of a cybersecurity information sharing platform. The adoption of blind processing, privacy preservation, and trusted computing paradigms make CYBEX-P a versatile and secure information exchange platform.
Bello-Ogunu, Emmanuel, Shehab, Mohamed, Miazi, Nazmus Sakib.  2019.  Privacy Is The Best Policy: A Framework for BLE Beacon Privacy Management. 2019 IEEE 43rd Annual Computer Software and Applications Conference (COMPSAC). 1:823—832.
Bluetooth Low Energy (BLE) beacons are an emerging type of technology in the Internet-of-Things (IoT) realm, which use BLE signals to broadcast a unique identifier that is detected by a compatible device to determine the location of nearby users. Beacons can be used to provide a tailored user experience with each encounter, yet can also constitute an invasion of privacy, due to their covertness and ability to track user behavior. Therefore, we hypothesize that user-driven privacy policy configuration is key to enabling effective and trustworthy privacy management during beacon encounters. We developed a framework for beacon privacy management that provides a policy configuration platform. Through an empirical analysis with 90 users, we evaluated this framework through a proof-of-concept app called Beacon Privacy Manager (BPM), which focused on the user experience of such a tool. Using BPM, we provided users with the ability to create privacy policies for beacons, testing different configuration schemes to refine the framework and then offer recommendations for future research.
Sattar, Naw Safrin, Arifuzzaman, Shaikh, Zibran, Minhaz F., Sakib, Md Mohiuddin.  2019.  An Ensemble Approach for Suspicious Traffic Detection from High Recall Network Alerts. {2019 IEEE International Conference on Big Data (Big Data. :4299—4308}}@inproceedings{wu_ensemble_2019.
Web services from large-scale systems are prevalent all over the world. However, these systems are naturally vulnerable and incline to be intruded by adversaries for illegal benefits. To detect anomalous events, previous works focus on inspecting raw system logs by identifying the outliers in workflows or relying on machine learning methods. Though those works successfully identify the anomalies, their models use large training set and process whole system logs. To reduce the quantity of logs that need to be processed, high recall suspicious network alert systems can be applied to preprocess system logs. Only the logs that trigger alerts are retrieved for further usage. Due to the universally usage of network traffic alerts among Security Operations Center, anomalies detection problems could be transformed to classify truly suspicious network traffic alerts from false alerts.In this work, we propose an ensemble model to distinguish truly suspicious alerts from false alerts. Our model consists of two sub-models with different feature extraction strategies to ensure the diversity and generalization. We use decision tree based boosters and deep neural networks to build ensemble models for classification. Finally, we evaluate our approach on suspicious network alerts dataset provided by 2019 IEEE BigData Cup: Suspicious Network Event Recognition. Under the metric of AUC scores, our model achieves 0.9068 on the whole testing set.
Saridou, Betty, Shiaeles, Stavros, Papadopoulos, Basil.  2019.  DDoS Attack Mitigation through Root-DNS Server: A Case Study. 2019 IEEE World Congress on Services (SERVICES). 2642-939X:60—65.

Load balancing and IP anycast are traffic routing algorithms used to speed up delivery of the Domain Name System. In case of a DDoS attack or an overload condition, the value of these protocols is critical, as they can provide intrinsic DDoS mitigation with the failover alternatives. In this paper, we present a methodology for predicting the next DNS response in the light of a potential redirection to less busy servers, in order to mitigate the size of the attack. Our experiments were conducted using data from the Nov. 2015 attack of the Root DNS servers and Logistic Regression, k-Nearest Neighbors, Support Vector Machines and Random Forest as our primary classifiers. The models were able to successfully predict up to 83% of responses for Root Letters that operated on a small number of sites and consequently suffered the most during the attacks. On the other hand, regarding DNS requests coming from more distributed Root servers, the models demonstrated lower accuracy. Our analysis showed a correlation between the True Positive Rate metric and the number of sites, as well as a clear need for intelligent management of traffic in load balancing practices.

Gerking, Christopher, Schubert, David.  2019.  Component-Based Refinement and Verification of Information-Flow Security Policies for Cyber-Physical Microservice Architectures. 2019 IEEE International Conference on Software Architecture (ICSA). :61—70.

Since cyber-physical systems are inherently vulnerable to information leaks, software architects need to reason about security policies to define desired and undesired information flow through a system. The microservice architectural style requires the architects to refine a macro-level security policy into micro-level policies for individual microservices. However, when policies are refined in an ill-formed way, information leaks can emerge on composition of microservices. Related approaches to prevent such leaks do not take into account characteristics of cyber-physical systems like real-time behavior or message passing communication. In this paper, we enable the refinement and verification of information-flow security policies for cyber-physical microservice architectures. We provide architects with a set of well-formedness rules for refining a macro-level policy in a way that enforces its security restrictions. Based on the resulting micro-level policies, we present a verification technique to check if the real-time message passing of microservices is secure. In combination, our contributions prevent information leaks from emerging on composition. We evaluate the accuracy of our approach using an extension of the CoCoME case study.

Belim, S.V., S.Yu, Belim.  2019.  Implementation of Discretionary Security Policy in the Distributed Systems Based on the Secret Sharing Scheme. 2019 International Multi-Conference on Industrial Engineering and Modern Technologies (FarEastCon). :1—4.

In this article the combination of secret sharing schemes and the requirement of discretionary security policy is considered. Secret sharing schemes of Shamir and Blakley are investigated. Conditions for parameters of schemes the providing forbidden information channels are received. Ways for concealment of the forbidden channels are suggested. Three modifications of the Shamir's scheme and two modifications of the Blakley's scheme are suggested. Transition from polynoms to exponential functions for formation the parts of a secret is carried out. The problem of masking the presence of the forbidden information channels is solved. Several approaches with the complete and partial concealment are suggested.

Singi, Kapil, Kaulgud, Vikrant, Bose, R.P. Jagadeesh Chandra, Podder, Sanjay.  2019.  CAG: Compliance Adherence and Governance in Software Delivery Using Blockchain. 2019 IEEE/ACM 2nd International Workshop on Emerging Trends in Software Engineering for Blockchain (WETSEB). :32—39.

The software development life cycle (SDLC) starts with business and functional specifications signed with a client. In addition to this, the specifications also capture policy / procedure / contractual / regulatory / legislation / standard compliances with respect to a given client industry. The SDLC must adhere to service level agreements (SLAs) while being compliant to development activities, processes, tools, frameworks, and reuse of open-source software components. In today's world, global software development happens across geographically distributed (autonomous) teams consuming extraordinary amounts of open source components drawn from a variety of disparate sources. Although this is helping organizations deal with technical and economic challenges, it is also increasing unintended risks, e.g., use of a non-complaint license software might lead to copyright issues and litigations, use of a library with vulnerabilities pose security risks etc. Mitigation of such risks and remedial measures is a challenge due to lack of visibility and transparency of activities across these distributed teams as they mostly operate in silos. We believe a unified model that non-invasively monitors and analyzes the activities of distributed teams will help a long way in building software that adhere to various compliances. In this paper, we propose a decentralized CAG - Compliance Adherence and Governance framework using blockchain technologies. Our framework (i) enables the capturing of required data points based on compliance specifications, (ii) analyzes the events for non-conformant behavior through smart contracts, (iii) provides real-time alerts, and (iv) records and maintains an immutable audit trail of various activities.

Kantarcioglu, Murat, Shaon, Fahad.  2019.  Securing Big Data in the Age of AI. 2019 First IEEE International Conference on Trust, Privacy and Security in Intelligent Systems and Applications (TPS-ISA). :218—220.

Increasingly organizations are collecting ever larger amounts of data to build complex data analytics, machine learning and AI models. Furthermore, the data needed for building such models may be unstructured (e.g., text, image, and video). Hence such data may be stored in different data management systems ranging from relational databases to newer NoSQL databases tailored for storing unstructured data. Furthermore, data scientists are increasingly using programming languages such as Python, R etc. to process data using many existing libraries. In some cases, the developed code will be automatically executed by the NoSQL system on the stored data. These developments indicate the need for a data security and privacy solution that can uniformly protect data stored in many different data management systems and enforce security policies even if sensitive data is processed using a data scientist submitted complex program. In this paper, we introduce our vision for building such a solution for protecting big data. Specifically, our proposed system system allows organizations to 1) enforce policies that control access to sensitive data, 2) keep necessary audit logs automatically for data governance and regulatory compliance, 3) sanitize and redact sensitive data on-the-fly based on the data sensitivity and AI model needs, 4) detect potentially unauthorized or anomalous access to sensitive data, 5) automatically create attribute-based access control policies based on data sensitivity and data type.

2020-03-31
Madiha Tabassum, Tomasz Kosiundefinedski, Alisa Frik, Nathan Malkin, Primal Wijesekera, Serge Egelman, Heather Lipford.  2019.  Investigating Users’ Preferences and Expectations for Always-Listening Voice Assistants. Proc. ACM Interact. Mob. Wearable Ubiquitous Technol.. 3(4):23.

Many consumers now rely on different forms of voice assistants, both stand-alone devices and those built into smartphones. Currently, these systems react to specific wake-words, such as "Alexa," "Siri," or "Ok Google." However, with advancements in natural language processing, the next generation of voice assistants could instead always listen to the acoustic environment and proactively provide services and recommendations based on conversations without being explicitly invoked. We refer to such devices as "always listening voice assistants" and explore expectations around their potential use. In this paper, we report on a 178-participant survey investigating the potential services people anticipate from such a device and how they feel about sharing their data for these purposes. Our findings reveal that participants can anticipate a wide range of services pertaining to a conversation; however, most of the services are very similar to those that existing voice assistants currently provide with explicit commands. Participants are more likely to consent to share a conversation when they do not find it sensitive, they are comfortable with the service and find it beneficial, and when they already own a stand-alone voice assistant. Based on our findings we discuss the privacy challenges in designing an always-listening voice assistant.

Nathan Malkin, Primal Wijesekera, Serge Egelman, David Wagner.  2018.  Use Case: Passively Listening Personal Assistants. Symposium on Applications of Contextual Integrity. :26-27.
2020-03-30
Heigl, Michael, Schramm, Martin, Fiala, Dalibor.  2019.  A Lightweight Quantum-Safe Security Concept for Wireless Sensor Network Communication. 2019 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops). :906–911.

The ubiquitous internetworking of devices in all areas of life is boosted by various trends for instance the Internet of Things. Promising technologies that can be used for such future environments come from Wireless Sensor Networks. It ensures connectivity between distributed, tiny and simple sensor nodes as well as sensor nodes and base stations in order to monitor physical or environmental conditions such as vibrations, temperature or motion. Security plays an increasingly important role in the coming decades in which attacking strategies are becoming more and more sophisticated. Contemporary cryptographic mechanisms face a great threat from quantum computers in the near future and together with Intrusion Detection Systems are hardly applicable on sensors due to strict resource constraints. Thus, in this work a future-proof lightweight and resource-aware security concept for sensor networks with a processing stage permeated filtering mechanism is proposed. A special focus in the concepts evaluation lies on the novel Magic Number filter to mitigate a special kind of Denial-of-Service attack performed on CC1350 LaunchPad ARM Cortex-M3 microcontroller boards.

Hu, Zhengbing, Vasiliu, Yevhen, Smirnov, Oleksii, Sydorenko, Viktoriia, Polishchuk, Yuliia.  2019.  Abstract Model of Eavesdropper and Overview on Attacks in Quantum Cryptography Systems. 2019 10th IEEE International Conference on Intelligent Data Acquisition and Advanced Computing Systems: Technology and Applications (IDAACS). 1:399–405.
In today's world, it's almost impossible to find a sphere of human life in which information technologies would not be used. On the one hand, it simplifies human life - virtually everyone carries a mini-computer in his pocket and it allows to perform many operations, that took a lot of time, in minutes. In addition, IT has simplified and promptly developed areas such as medicine, banking, document circulation, military, and many other infrastructures of the state. Nevertheless, even today, privacy remains a major problem in many information transactions. One of the most important directions for ensuring the information confidentiality in open communication networks has been and remains its protection by cryptographic methods. Although it is known that traditional cryptography methods give reasons to doubt in their reliability, quantum cryptography has proven itself as a more reliable information security technology. As far is it quite new direction there is no sufficiently complete classification of attacks on quantum cryptography methods, in view of this new extended classification of attacks on quantum protocols and quantum cryptosystems is proposed in this work. Classification takes into account the newest attacks (which use devices loopholes) on quantum key distribution equipment. These attacks have been named \textbackslashtextless; \textbackslashtextless; quantum hacking\textbackslashtextgreater\textbackslashtextgreater. Such classification may be useful for choosing commercially available quantum key distribution system. Also abstract model of eavesdropper in quantum systems was created and it allows to determine a set of various nature measures that need to be further implemented to provide reliable security with the help of specific quantum systems.
Ximenes, Agostinho Marques, Sukaridhoto, Sritrusta, Sudarsono, Amang, Ulil Albaab, Mochammad Rifki, Basri, Hasan, Hidayat Yani, Muhammad Aksa, Chang Choon, Chew, Islam, Ezharul.  2019.  Implementation QR Code Biometric Authentication for Online Payment. 2019 International Electronics Symposium (IES). :676–682.
Based on the Indonesian of Statistics the level of society people in 2019 is grow up. Based on data, the bank conducted a community to simple transaction payment in the market. Bank just used a debit card or credit card for the transaction, but the banks need more investment for infrastructure and very expensive. Based on that cause the bank needs another solution for low-cost infrastructure. Obtained from solutions that, the bank implementation QR Code Biometric authentication Payment Online is one solution that fulfills. This application used for payment in online merchant. The transaction permits in this study lie in the biometric encryption, or decryption transaction permission and QR Code Scan to improve communication security and transaction data. The test results of implementation Biometric Cloud Authentication Platform show that AES 256 agents can be implemented for face biometric encryption and decryption. Code Scan QR to carry out transaction permits with Face verification transaction permits gets the accuracy rate of 95% for 10 sample people and transaction process gets time speed of 53.21 seconds per transaction with a transaction sample of 100 times.
Bharati, Aparna, Moreira, Daniel, Brogan, Joel, Hale, Patricia, Bowyer, Kevin, Flynn, Patrick, Rocha, Anderson, Scheirer, Walter.  2019.  Beyond Pixels: Image Provenance Analysis Leveraging Metadata. 2019 IEEE Winter Conference on Applications of Computer Vision (WACV). :1692–1702.
Creative works, whether paintings or memes, follow unique journeys that result in their final form. Understanding these journeys, a process known as "provenance analysis," provides rich insights into the use, motivation, and authenticity underlying any given work. The application of this type of study to the expanse of unregulated content on the Internet is what we consider in this paper. Provenance analysis provides a snapshot of the chronology and validity of content as it is uploaded, re-uploaded, and modified over time. Although still in its infancy, automated provenance analysis for online multimedia is already being applied to different types of content. Most current works seek to build provenance graphs based on the shared content between images or videos. This can be a computationally expensive task, especially when considering the vast influx of content that the Internet sees every day. Utilizing non-content-based information, such as timestamps, geotags, and camera IDs can help provide important insights into the path a particular image or video has traveled during its time on the Internet without large computational overhead. This paper tests the scope and applicability of metadata-based inferences for provenance graph construction in two different scenarios: digital image forensics and cultural analytics.
Scherzinger, Stefanie, Seifert, Christin, Wiese, Lena.  2019.  The Best of Both Worlds: Challenges in Linking Provenance and Explainability in Distributed Machine Learning. 2019 IEEE 39th International Conference on Distributed Computing Systems (ICDCS). :1620–1629.
Machine learning experts prefer to think of their input as a single, homogeneous, and consistent data set. However, when analyzing large volumes of data, the entire data set may not be manageable on a single server, but must be stored on a distributed file system instead. Moreover, with the pressing demand to deliver explainable models, the experts may no longer focus on the machine learning algorithms in isolation, but must take into account the distributed nature of the data stored, as well as the impact of any data pre-processing steps upstream in their data analysis pipeline. In this paper, we make the point that even basic transformations during data preparation can impact the model learned, and that this is exacerbated in a distributed setting. We then sketch our vision of end-to-end explainability of the model learned, taking the pre-processing into account. In particular, we point out the potentials of linking the contributions of research on data provenance with the efforts on explainability in machine learning. In doing so, we highlight pitfalls we may experience in a distributed system on the way to generating more holistic explanations for our machine learning models.