Adding Support for Theory in Open Science Big Data

Title: Adding Support for Theory in Open Science Big Data
Publication Type: Conference Paper
Year of Publication: 2017
Authors: Miller, J. A., Peng, H., Cotterell, M. E.
Conference Name: 2017 IEEE World Congress on Services (SERVICES)
Keywords: Analytical models, Big Data, Big Data analytics, Biological system modeling, citable reusable data, composability, compositionality, Data analysis, data curation, data fitting, Data models, data privacy, data provenance, data sources, data storage, data transfer, Economic indicators, frameworks, functional data analysis, Human Behavior, human factors, Mathematical model, Metrics, natural sciences computing, open science big data, predictive analytics, Principal differential analysis, Provenance, pubcrawl, Resiliency, storage management, theory, theory formation
Abstract

Open Science Big Data is emerging as an important area of research and software development. Although there are several high-quality frameworks for Big Data, additional capabilities are needed for Open Science Big Data. These include data provenance, citable reusable data, data sources providing links to research literature, relationships to other data and theories, transparent analysis/reproducibility, data privacy, new optimizations/advanced algorithms, data curation, and data storage and transfer. An important part of science is explanation of results, ideally leading to theory formation. In this paper, we examine means for supporting the use of theory in big data analytics as well as using big data to assist in theory formation. One approach is to fit data in a way that is compatible with some theory, existing or new. Functional Data Analysis allows precise fitting of data as well as penalties for lack of smoothness or even departure from theoretical expectations. This paper discusses principal differential analysis and related techniques for fitting data where, for example, a time-based process is governed by an ordinary differential equation. Automation in theory formation is also considered. Case studies in the fields of computational economics and finance are considered.
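
To make the idea of penalizing "departure from theoretical expectations" concrete, the following is a minimal sketch and not the authors' implementation. It assumes a hypothetical first-order theory x'(t) + beta * x(t) = 0 with a constant coefficient, a uniform time grid, and a fit represented by grid values rather than the basis expansions usually used in Functional Data Analysis. The function names `ode_penalized_fit` and `estimate_beta`, and the parameters `beta` and `lam`, are illustrative choices that do not come from the paper.

```python
import numpy as np

def _operators(t):
    """Finite-difference operators on a uniform grid t (an assumption)."""
    n = len(t)
    h = t[1] - t[0]
    shift = np.eye(n, k=1)[:-1]   # row i picks x[i+1]
    ident = np.eye(n)[:-1]        # row i picks x[i]
    D = (shift - ident) / h       # forward difference, approximates x'
    A = (shift + ident) / 2       # midpoint average, approximates x
    return D, A, h

def ode_penalized_fit(t, y, beta, lam):
    """Minimize ||y - x||^2 + lam * h * ||D x + beta * A x||^2 over x.

    The second term penalizes departure from the assumed theory
    x' + beta * x = 0; lam = 0 reproduces the data exactly.
    """
    D, A, h = _operators(t)
    L = D + beta * A                                  # discretized operator
    lhs = np.eye(len(t)) + lam * h * (L.T @ L)        # normal equations
    return np.linalg.solve(lhs, y)

def estimate_beta(t, x):
    """Least-squares constant beta that best satisfies D x + beta * A x = 0.

    This mirrors, in a much simplified form, the step in principal
    differential analysis where the operator is estimated from a fit.
    """
    D, A, _ = _operators(t)
    dx, ax = D @ x, A @ x
    return -(ax @ dx) / (ax @ ax)

# Illustrative use on synthetic data (made up for this sketch).
rng = np.random.default_rng(0)
t = np.linspace(0.0, 5.0, 101)
y = np.exp(-t) + 0.1 * rng.normal(size=t.size)        # noisy exponential decay
x_fit = ode_penalized_fit(t, y, beta=1.0, lam=1e3)    # theory-guided fit
beta_hat = estimate_beta(t, x_fit)                    # coefficient from the fit
```

A larger `lam` pulls the fit toward exact solutions of the assumed equation, while a smaller `lam` lets the data dominate. In practice the two steps would typically be iterated, and the coefficient could be a function of time rather than a constant.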

URL: http://ieeexplore.ieee.org/document/8036724/
DOI: 10.1109/SERVICES.2017.20
Citation Key: miller_adding_2017