Main Conference Sessions
End-to-End e-Science: Integrating Workflow, Query, Visualization, and Provenance at an Ocean Observatory
- Bill Howe, Oregon Health and Science University
- Peter Lawson, National Marine Fisheries Service
- Renee Bellinger, Oregon State University
- Erik Anderson, University of Utah
- Emanuele Santos, University of Utah
- Juliana Freire, University of Utah
- Carlos Scheidegger, University of Utah
- Antonio Baptista, Oregon Health and Science University
- Claudio Silva, University of Utah
Computing has been an enormous accelerator to science and has led to an information explosion in many different fields. To analyze and understand scientific data, complex computational processes must be assembled, often requiring the combination of loosely-coupled resources, specialized libraries, and Grid and Web services. The heterogeneity of data sources, analysis techniques, data products, and user communities makes it difficult to design a system that is flexible enough to accommodate broad requirements but specialized enough to be of daily use to scientists, policy makers, students, and the general public. Databases, workflow systems, and visualization tools each offer useful features but are individually incomplete solutions.
Databases provide algebraic optimization and physical data independence, but offer poor support for complex data types (meshes, multidimensional arrays) and are change-intolerant. Workflow systems are flexible, but even skilled programmers have trouble operating them effectively. Visualization tools overwhelm novice users with an enormous parameter space and are typically optimized for "throwing" datasets through the graphics pipeline rather than general data manipulation.
In this paper, we argue that typical data analysis tasks at an ocean observatory require techniques from all three tools, sometimes domain-specialized. To support these tasks, we describe a platform developed as part of a collaborative cyberinfrastructure at the NSF Science and Technology Center for Coastal Margin Observation and Prediction (CMOP). The platform integrates a provenance-aware workflow system, 3D visualization capabilities, and a remote query engine for large-scale ocean circulation models, in addition to access routines for local files and Web services. We conclude that data management solutions for e-Science require this kind of holistic approach and recommend a broader, application-oriented research agenda for the e-Science community.
Date and Time
Thursday, December 11, 10 a.m. to 10:30 a.m.