Dynamic Data Citation Service-Subset Tool for Operational Data Management

Chris Schubert, Georg Seyerl, Katharina Sack

Publication: Scientific journal, Journal article, Research, peer-review

Abstract

In earth observation and the climate sciences, data and the associated data services grow daily and
over a large spatial extent, driven by the high coverage rate of satellite sensors, by model
calculations, and by continuous meteorological in situ observations. To reuse such data, especially
data fragments, and their data services in a collaborative and reproducible manner while citing the
original source, data analysts, e.g., researchers or impact modelers, need a way to identify the exact
version, precise time information, parameters, and names of the dataset used. Done manually, citing a
data fragment as a subset of an entire dataset would be rather complex and imprecise.
Data in climate research are in most cases multidimensional, structured grid data that can
change partially over time. Citing such evolving content requires "dynamic data citation". The
approach applied here associates queries with persistent identifiers. These queries contain the
subsetting parameters, e.g., the spatial coordinates of the desired study area or the time frame with
a start and end date. The parameters are automatically included in the metadata of the newly
generated subset and thus record the data history, i.e., the data provenance,
which has to be established in data repository ecosystems. The Research Data Alliance Data Citation
Working Group (RDA Data Citation WG) summarized the scientific status quo and the state of the art
of existing citation and data management concepts and developed a scalable methodology for the
dynamic citation of evolving data. The Data Centre at the Climate Change Centre Austria (CCCA) has
implemented these recommendations and has offered an operational service for dynamic data citation
of climate scenario data since 2017. Aware that this topic depends heavily on bibliographic citation
research, which is still under discussion, the CCCA service on Dynamic Data Citation focused on
issues specific to the climate domain, such as data characteristics, formats, software environment,
and usage behavior. Beyond sharing the experience gained, the current effort addresses the
scalability of the implementation, e.g., towards a potential Open Data Cube solution.
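
The idea of associating a subsetting query with a persistent identifier can be illustrated with a minimal sketch. It is not the CCCA implementation: the class name `QueryStore`, the `example-pid` prefix, the field names, and all sample values are assumptions made for illustration. The sketch normalizes the subsetting parameters, hashes them so that identical queries map to the same identifier, and stores the query with a timestamp so that the subset can be described later.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class QueryStore:
    """In-memory stand-in for a query store (hypothetical, for illustration only)."""

    prefix: str = "example-pid"  # placeholder, not a real handle or DOI prefix
    records: dict = field(default_factory=dict)

    def register(self, dataset_id, version, bbox, time_range, parameter, executed_at):
        """Store a subsetting query and return its identifier."""
        # Normalize the query so that equivalent requests yield the same key.
        query = {
            "dataset_id": dataset_id,
            "version": version,
            "bbox": [round(float(v), 6) for v in bbox],
            "time_range": list(time_range),
            "parameter": parameter,
        }
        canonical = json.dumps(query, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
        pid = f"{self.prefix}/{digest[:16]}"
        # Re-use the existing record if the same query was registered before.
        if pid not in self.records:
            self.records[pid] = {"query": query, "executed_at": executed_at}
        return pid

    def resolve(self, pid):
        """Return the stored query and its timestamp for an identifier."""
        return self.records[pid]


store = QueryStore()
# Sample values are arbitrary and only show the shape of a request.
pid = store.register(
    dataset_id="scenario-dataset",
    version="v1",
    bbox=(9.5, 46.3, 17.2, 49.1),
    time_range=("2021-01-01", "2050-12-31"),
    parameter="tas",
    executed_at="2019-01-01T00:00:00Z",
)
print(pid, store.resolve(pid)["query"]["time_range"])
```

Deriving the identifier from the normalized query is one possible design choice; an operational service would typically obtain identifiers from a dedicated resolver and keep the query store in a persistent database.
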
Original language: German (Austria)
Pages (from-to): 1-12
Journal: Data
Volume: 4
Issue number: 3
Publication status: Published - 2019
Externally published: Yes