Using a Computational Study of Hydrodynamics in the Wax Lake Delta to Examine Data Sharing Principles

  • Qian Zhang, University of Illinois at Urbana-Champaign
  • Heidi Imker, University of Illinois at Urbana-Champaign
  • Chunyan Li, Louisiana State University
  • Bertram Ludäscher, University of Illinois at Urbana-Champaign
  • Megan Senseney, University of Illinois at Urbana-Champaign

Abstract

In this paper we describe a complex dataset used to study the circulation and wind-driven flows in the Wax Lake Delta, Louisiana, USA under winter storm conditions. The complete package is a large dataset (approximately 74 GB) that includes the numerical model, software and scripts for data analysis and visualization, and detailed documentation. The raw data come from multiple external sources, including government agencies, community repositories, and deployed field instruments and surveys. Each raw dataset goes through data QA/QC, analysis, visualization, and interpretation. Integrating multiple datasets yields new data products, which are then used with the numerical model. The numerical model itself undergoes verification, testing, calibration, and optimization. Through computationally complex algorithms, the model generates a structured output dataset, which, after further analysis, is presented as scientific figures and tables that support interpretations and conclusions contributing to the science of coastal physical oceanography.

Performing this study required a tremendous amount of effort. While the work resulted in traditional dissemination via a thesis, journal articles, and conference proceedings, more can be gained from the data. The data can be reused to study reproducibility or as a preliminary investigation to explore a new topic. With thorough documentation and well-organized data, both the input and output datasets should be ready for sharing in a domain or institutional repository. Furthermore, the data organization and documentation also serve as a guideline for future research data management and the development of workflow protocols. Here we describe the dataset created by this study, how sharing the dataset publicly could enable validation of the current study and extension by new studies, and the challenges that arise prior to sharing the dataset.

Author Biography

Qian Zhang, University of Illinois at Urbana-Champaign
School of Information Sciences
Published
04-Jul-2017
Section
Articles