Biblio

Filters: Keyword is natural sciences computing
2020-12-11
Correia, A., Fonseca, B., Paredes, H., Schneider, D., Jameel, S..  2019.  Development of a Crowd-Powered System Architecture for Knowledge Discovery in Scientific Domains. 2019 IEEE International Conference on Systems, Man and Cybernetics (SMC). :1372–1377.
A substantial amount of work is often overlooked due to the exponential rate of growth in global scientific output across all disciplines. Current approaches for addressing this issue are usually limited in scope and often restrict the possibility of obtaining multidisciplinary views in practice. To tackle this problem, researchers can now leverage an ecosystem of citizens, volunteers and crowd workers to perform complex tasks that are difficult for either humans or machines to solve alone. Motivated by the idea that human crowds and computer algorithms have complementary strengths, we present an approach where the machine learns from crowd behavior in an iterative way. This approach is embodied in the architecture of SciCrowd, a crowd-powered human-machine hybrid system designed to improve the analysis and processing of large amounts of publication records. To validate the proposal's feasibility, a prototype was developed and an initial evaluation was conducted to measure its robustness and reliability. We conclude this paper with a set of implications for design.
2020-03-30
Kim, Sejin, Oh, Jisun, Kim, Yoonhee.  2019.  Data Provenance for Experiment Management of Scientific Applications on GPU. 2019 20th Asia-Pacific Network Operations and Management Symposium (APNOMS). :1–4.
Graphics Processing Units (GPUs) are increasingly used for multi-purpose applications to exploit highly parallel computation. Because GPU nodes lack memory virtualization methods that efficiently handle the diverse memory usage patterns of these applications, successful execution depends on exclusive and limited use of physical memory in GPU environments. Therefore, it is important to predict changes in the GPU memory usage pattern during the runtime execution of an application. Data provenance, extracted from application characteristics, GPU runtime environments, input, and execution patterns obtained through runtime monitoring, is defined to support application management: setting the runtime configuration, predicting an experimental result, and utilizing resources alongside co-located applications. In this paper, we define the data provenance of an application on GPUs and manage the data by profiling the execution of CUDA scientific applications. Data provenance management helps to predict the execution patterns of other similar experiments and to plan efficient resource configuration.
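As a rough illustration of the idea in this abstract, the sketch below stores one provenance record per profiled run and reuses past records to guess the memory footprint of a similar experiment. The field names, the sample values, and the nearest-neighbour rule are all assumptions made for illustration; the abstract does not specify the paper's schema or prediction method.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ProvenanceRecord:
    """One profiled run. All field names here are hypothetical."""
    app_name: str          # application characteristic
    gpu_model: str         # GPU runtime environment
    input_size: int        # input description, e.g. number of elements
    peak_memory_mb: float  # execution pattern observed by runtime monitoring
    runtime_s: float


def predict_peak_memory(history: List[ProvenanceRecord],
                        app_name: str,
                        input_size: int) -> Optional[float]:
    """Guess peak memory from the most similar past run of the same application.

    Nearest neighbour on input size is an assumed stand-in for whatever
    prediction method the paper actually uses.
    """
    candidates = [r for r in history if r.app_name == app_name]
    if not candidates:
        return None  # no comparable experiment has been recorded
    nearest = min(candidates, key=lambda r: abs(r.input_size - input_size))
    return nearest.peak_memory_mb


# Made-up sample data, not measurements from the paper.
history = [
    ProvenanceRecord("nbody", "gpu-a", 1_000, 512.0, 3.2),
    ProvenanceRecord("nbody", "gpu-a", 100_000, 4096.0, 41.0),
    ProvenanceRecord("fft", "gpu-a", 50_000, 1024.0, 7.5),
]
estimate = predict_peak_memory(history, "nbody", 80_000)
```

A scheduler could compare such an estimate with the free physical memory on a GPU before co-locating a new application with running ones.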
2019-03-22
Guntupally, K., Devarakonda, R., Kehoe, K..  2018.  Spring Boot Based REST API to Improve Data Quality Report Generation for Big Scientific Data: ARM Data Center Example. 2018 IEEE International Conference on Big Data (Big Data). :5328–5329.

Web application technologies are growing rapidly with continuous innovation and improvements. This paper focuses on the popular Spring Boot [1] Java-based framework for building web and enterprise applications and how it provides the flexibility for service-oriented architecture (SOA). One challenge with any Spring-based application is the complexity of its configuration. Spring Boot makes it easy to create and deploy stand-alone, production-grade Spring applications with very little Spring configuration. For example, with the Spring Model-View-Controller (MVC) framework [2], we need to configure a dispatcher servlet, web jars, a view resolver, and component scan, among other things. To solve this, Spring Boot provides several Auto Configuration options to set up the application with any needed dependencies. Another challenge is to identify the framework dependencies and associated library versions required to develop a web application. Spring Boot offers simpler dependency management by bundling a comprehensive but flexible framework and its associated libraries in a single dependency, which provides all the Spring-related technology needed for starter projects as compared to CRUD web applications. The framework provides a range of additional features that are common across many projects, such as an embedded server, security, metrics, health checks, and externalized configuration. Web applications are generally packaged as a war file and deployed to a web server, but a Spring Boot application can be packaged as either a war or a jar file, which allows the application to run without the need to install and/or configure an application server. In this paper, we discuss how the Atmospheric Radiation Measurement (ARM) Data Center (ADC) at Oak Ridge National Laboratory is using Spring Boot to create an SOA-based REST [4] service API that bridges the gap between frontend user interfaces and the backend database.
Using this REST service API, ARM scientists are now able to submit reports via a user form or a command line interface, which captures the same data quality or other important information about ARM data.

2017-12-12
Miller, J. A., Peng, H., Cotterell, M. E..  2017.  Adding Support for Theory in Open Science Big Data. 2017 IEEE World Congress on Services (SERVICES). :71–75.

Open Science Big Data is emerging as an important area of research and software development. Although there are several high quality frameworks for Big Data, additional capabilities are needed for Open Science Big Data. These include data provenance, citable reusable data, data sources providing links to research literature, relationships to other data and theories, transparent analysis/reproducibility, data privacy, new optimizations/advanced algorithms, data curation, data storage and transfer. An important part of science is explanation of results, ideally leading to theory formation. In this paper, we examine means for supporting the use of theory in big data analytics as well as using big data to assist in theory formation. One approach is to fit data in a way that is compatible with some theory, existing or new. Functional Data Analysis allows precise fitting of data as well as penalties for lack of smoothness or even departure from theoretical expectations. This paper discusses principal differential analysis and related techniques for fitting data where, for example, a time-based process is governed by an ordinary differential equation. Automation in theory formation is also considered. Case studies in the fields of computational economics and finance are considered.
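The idea of fitting data to a process governed by an ordinary differential equation can be sketched in a few lines. The example below assumes the simplest possible case, a first-order equation x'(t) = -beta * x(t), and estimates beta by least squares from finite-difference derivatives. This is only a toy illustration in the spirit of principal differential analysis: the function name and the synthetic data are invented here, and functional data analysis would normally estimate derivatives from a smooth basis expansion (for example splines) rather than raw finite differences.

```python
import math


def estimate_decay_rate(times, values):
    """Least-squares estimate of beta in x'(t) = -beta * x(t).

    Minimizes sum_i (dx_i + beta * x_i)^2 over interior points, where dx_i
    is a central finite-difference approximation of the derivative.
    The closed-form minimizer is beta = -sum(dx_i * x_i) / sum(x_i^2).
    """
    num = 0.0
    den = 0.0
    for i in range(1, len(values) - 1):
        dx = (values[i + 1] - values[i - 1]) / (times[i + 1] - times[i - 1])
        num += dx * values[i]
        den += values[i] ** 2
    return -num / den


# Synthetic, noise-free data from x(t) = exp(-0.5 * t); the true beta is 0.5.
times = [0.1 * i for i in range(50)]
values = [math.exp(-0.5 * t) for t in times]
beta_hat = estimate_decay_rate(times, values)
```

On this noise-free series the estimate lands very close to 0.5. With noisy data the finite differences amplify the noise, which is one reason the smoothing penalties mentioned in the abstract matter.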