Provenance Support for Biomedical Big Data Analytics

Title: Provenance Support for Biomedical Big Data Analytics
Publication Type: Conference Paper
Year of Publication: 2016
Authors: McClatchey, Richard; Branson, Andrew; Shamdasani, Jetendr
Conference Name: Proceedings of the 20th International Database Engineering & Applications Symposium
Publisher: ACM
Conference Location: New York, NY, USA
ISBN Number: 978-1-4503-4118-9
Keywords: Big Data, composability, Description-driven systems, Human Behavior, medical analytics, Metrics, Provenance, provenance data, pubcrawl, Resiliency, Traceability
Abstract

One essential requirement for supporting analytics for Big Medical Data systems is the provision of a suitable level of traceability to data or processes ('Items') in large volumes of data. Systems should be designed from the outset to support usage of such Items across the spectrum of medical use and over time in order to promote traceability, to simplify maintenance and to assist analytics. The philosophy proposed in this paper is to design medical data systems using a 'description-driven' approach in which meta-data and the description of medical items are saved alongside the data, simplifying item re-use over time and thereby enabling the traceability of these items over time and their use in analytics. Details are given of a big data system in neuroimaging to demonstrate aspects of provenance data capture, collaborative analysis and longitudinal information traceability. Evidence is presented that the description-driven approach leads to simplicity of design and ease of maintenance following the adoption of a unified approach to Item management.

URL: http://doi.acm.org/10.1145/2938503.2938540
DOI: 10.1145/2938503.2938540
Citation Key: mcclatchey_provenance_2016