Biblio

Filters: Keyword is functional dependencies
2019-08-26
Cook, W., Driscoll, A., Tenbergen, B.  2018.  AirborneCPS: A Simulator for Functional Dependencies in Cyber Physical Systems: A Traffic Collision Avoidance System Implementation. 2018 4th International Workshop on Requirements Engineering for Self-Adaptive, Collaborative, and Cyber Physical Systems (RESACS). :32–35.

The term "Cyber Physical System" (CPS) has been used in the recent years to describe a system type, which makes use of powerful communication networks to functionally combine systems that were previously thought of as independent. The common theme of CPSs is that through communication, CPSs can make decisions together and achieve common goals. Yet, in contrast to traditional system types such as embedded systems, the functional dependence between CPSs can change dynamically at runtime. Hence, their functional dependence may cause unforeseen runtime behavior, e.g., when a CPS becomes unavailable, but others depend on its correct operation. During development of any individual CPS, this runtime behavior must hence be predicted, and the system must be developed with the appropriate level of robustness. Since at present, research is mainly concerned with the impact of functional dependence in CPS on development, the impact on runtime behavior is mere conjecture. In this paper, we present AirborneCPS, a simulation tool for functionally dependent CPSs which simulates runtime behavior and aids in the identification of undesired functional interaction.

Livshits, Ester, Kimelfeld, Benny, Roy, Sudeepa.  2018.  Computing Optimal Repairs for Functional Dependencies. Proceedings of the 37th ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems. :225–237.

We investigate the complexity of computing an optimal repair of an inconsistent database, in the case where the integrity constraints are Functional Dependencies (FDs). We focus on two types of repairs: an optimal subset repair (optimal S-repair), obtained by a minimum number of tuple deletions, and an optimal update repair (optimal U-repair), obtained by a minimum number of value (cell) updates. For computing an optimal S-repair, we present a polynomial-time algorithm that succeeds on certain sets of FDs and fails on others. We prove the following about the algorithm. When it succeeds, it can also incorporate weighted tuples and duplicate tuples. When it fails, the problem is NP-hard and, in fact, APX-complete (hence, it cannot be approximated to within better than some constant factor). Thus, we establish a dichotomy in the complexity of computing an optimal S-repair. We present general analysis techniques for the complexity of computing an optimal U-repair, some based on the dichotomy for S-repairs. We also draw a connection to a past dichotomy in the complexity of finding a "most probable database" that satisfies a set of FDs with a single attribute on the left-hand side; the case of general FDs was left open, and we show how our dichotomy provides the missing generalization, thereby settling the open problem.
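
To make the notion of a subset repair concrete, the following Python sketch handles the special case of a single FD lhs → rhs, where a minimum-deletion repair is easy: within each group of tuples that agree on the left-hand side, keep only the tuples carrying the most frequent right-hand-side value. This is an illustration of what an optimal S-repair is, not the paper's dichotomy algorithm (which addresses arbitrary FD sets); the function and attribute names are ours.

```python
from collections import Counter, defaultdict

def optimal_s_repair_single_fd(tuples, lhs, rhs):
    """Minimum-deletion subset repair for ONE functional dependency lhs -> rhs.

    tuples: list of dicts (the relation); lhs, rhs: attribute names.
    For a single FD, an optimal S-repair keeps, within each group of
    tuples agreeing on lhs, only the tuples carrying the most frequent
    rhs value; everything else is deleted. Arbitrary FD sets are
    exactly where the paper's dichotomy (P vs. APX-complete) applies.
    """
    groups = defaultdict(list)
    for t in tuples:
        groups[t[lhs]].append(t)

    repaired = []
    for group in groups.values():
        # Keep the majority rhs value in this group; ties broken arbitrarily.
        majority_value, _ = Counter(t[rhs] for t in group).most_common(1)[0]
        repaired.extend(t for t in group if t[rhs] == majority_value)
    return repaired

# Example: FD zip -> city; the lone "Newark" tuple violates it and is deleted.
rel = [
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "New York"},
    {"zip": "10001", "city": "Newark"},
    {"zip": "94105", "city": "San Francisco"},
]
print(optimal_s_repair_single_fd(rel, "zip", "city"))
```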

2017-08-18
DiScala, Michael, Abadi, Daniel J.  2016.  Automatic Generation of Normalized Relational Schemas from Nested Key-Value Data. Proceedings of the 2016 International Conference on Management of Data. :295–310.

Self-describing key-value data formats such as JSON are becoming increasingly popular as application developers choose to avoid the rigidity imposed by the relational model. Database systems designed for these self-describing formats, such as MongoDB, encourage users to use denormalized, heavily nested data models so that relationships across records and other schema information need not be predefined or standardized. Such data models contribute to long-term development complexity, as their lack of explicit entity and relationship tracking burdens new developers unfamiliar with the dataset. Furthermore, the large amount of data repetition present in such data layouts can introduce update anomalies and poor scan performance, which reduce both the quality and performance of analytics over the data. In this paper we present an algorithm that automatically transforms the denormalized, nested data commonly found in NoSQL systems into traditional relational data that can be stored in a standard RDBMS. This process includes a schema generation algorithm that discovers relationships across the attributes of the denormalized datasets in order to organize those attributes into relational tables. It further includes a matching algorithm that discovers sets of attributes that represent overlapping entities and merges those sets together. These algorithms reduce data repetition, allow the use of data analysis tools targeted at relational data, accelerate scan-intensive algorithms over the data, and help users gain a semantic understanding of complex, nested datasets.
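
The following Python sketch illustrates the core idea behind FD-driven schema generation; it is not the authors' algorithm (which works with soft, probabilistic dependencies and includes an entity-matching phase). It detects exact single-attribute FDs A → B over flat records and factors the dependent attributes out into a separate table, removing the repetition that causes update anomalies. All record and function names are illustrative.

```python
from collections import defaultdict
from itertools import permutations

def discover_fds(records):
    """Naive discovery of single-attribute FDs A -> B over flat records.

    Returns pairs (A, B) such that every observed A value co-occurs with
    exactly one B value. Note that a key attribute trivially determines
    every other attribute; the paper's soft-FD approach also tolerates
    noise, which this exact check does not.
    """
    attrs = records[0].keys()
    fds = []
    for a, b in permutations(attrs, 2):
        mapping = {}
        if all(mapping.setdefault(r[a], r[b]) == r[b] for r in records):
            fds.append((a, b))
    return fds

def factor_out(records, key, dependents):
    """Split records into a fact table (dependents dropped, key retained)
    and a dimension table keyed by `key`, normalizing away repetition."""
    dimension = {r[key]: {d: r[d] for d in dependents} for r in records}
    fact = [{k: v for k, v in r.items() if k not in dependents} for r in records]
    return fact, dimension

# Denormalized records as they might arrive from a JSON store:
orders = [
    {"order_id": 1, "cust_id": "c1", "cust_name": "Ada"},
    {"order_id": 2, "cust_id": "c1", "cust_name": "Ada"},
    {"order_id": 3, "cust_id": "c2", "cust_name": "Alan"},
]
print(discover_fds(orders))          # includes ('cust_id', 'cust_name')
print(factor_out(orders, "cust_id", ["cust_name"]))
```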
