CAESAR: A Collaborative Environment for Scientific Analysis with Reproducibility
We present CAESAR, a framework for the end-to-end provenance management of scientific experiments.
This collaborative framework lets scientists capture, manage, query, and visualize the complete path
of a scientific experiment, covering both computational and non-computational steps, in an interoperable way.
We focus on provenance management for experiments in the life sciences, particularly those
involving imaging datasets.
CAESAR is built on top of OMERO, which provides scientific data management for imaging datasets.
CAESAR extends it to include provenance management for scientific experiments.
The following diagram shows an overview of CAESAR and the modules that provide end-to-end provenance
management.
- Provenance Capture
- Provenance Represent
  - The REPRODUCE-ME Data Model
- Provenance Storage
- Provenance Query
- Provenance Compare
  - The Jupyter nbextension of ProvBook
  - The complete package of ProvBook with provenance capture, representation and difference
- Provenance Visualize
The provenance of scientific experiments is captured using a feature-rich metadata editor. The metadata extracted from the images is linked to the experimental data provided by scientists. This module captures the experiment, its steps, plans, activities, the instruments used and their settings, the agents involved, the materials used, and so on.
Resources
The provenance of scientific experiments is represented with semantic web technologies, using the REPRODUCE-ME ontology. The ontology links the computational and non-computational steps and data of scientific experiments.
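To make the idea of linking computational and non-computational steps concrete, here is a minimal sketch that describes one lab step, one notebook run, and the image that connects them as RDF triples. It uses W3C PROV-O terms as a stand-in; the actual REPRODUCE-ME class and property names are not shown here, and the `http://example.org/experiment/` namespace and all resource names are assumptions made up for illustration.

```python
# PROV-O is the W3C provenance vocabulary; EX is a hypothetical namespace.
PROV = "http://www.w3.org/ns/prov#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/experiment/"


def iri(value):
    """Wrap an IRI string in N-Triples angle brackets."""
    return f"<{value}>"


def build_triples():
    """Link a lab step and a notebook run through the image they share."""
    acquisition = EX + "image-acquisition"  # non-computational step (hypothetical)
    analysis = EX + "notebook-run-1"        # computational step (hypothetical)
    image = EX + "image-42"                 # produced by one step, used by the other
    scientist = EX + "scientist-a"
    return [
        (acquisition, RDF_TYPE, PROV + "Activity"),
        (analysis, RDF_TYPE, PROV + "Activity"),
        (image, RDF_TYPE, PROV + "Entity"),
        (scientist, RDF_TYPE, PROV + "Agent"),
        (image, PROV + "wasGeneratedBy", acquisition),
        (analysis, PROV + "used", image),
        (acquisition, PROV + "wasAssociatedWith", scientist),
    ]


def to_ntriples(triples):
    """Serialize IRI-only triples as N-Triples, one statement per line."""
    return "\n".join(f"{iri(s)} {iri(p)} {iri(o)} ." for s, p, o in triples)


if __name__ == "__main__":
    print(to_ntriples(build_triples()))
```

Because the image is both generated by the acquisition step and used by the notebook run, a query can follow the path from a computed result back to the bench work that produced its input.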
Resources
The provenance of scientific experiments is stored in a PostgreSQL database. We provide ontology-based data access to the databases. CAESAR federates multiple databases and maps the underlying databases to the ontology, which provides a semantic way to query the data through the rdf4j SPARQL endpoint.
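The mapping from relational tables to ontology terms can be illustrated with a small sketch. It is not CAESAR's mapping: the `experiment` table, its columns, the predicate IRIs, and the `http://example.org/` namespace are all hypothetical, and an in-memory SQLite database stands in for PostgreSQL so the example runs without a server. Ontology-based data access systems typically rewrite queries at run time rather than materializing triples as done here; the sketch only shows what a column-to-predicate mapping expresses.

```python
import sqlite3

EX = "http://example.org/"  # hypothetical namespace

# Hypothetical mapping: column name -> predicate IRI.
COLUMN_TO_PREDICATE = {
    "name": EX + "name",
    "instrument": EX + "usedInstrument",
}


def demo_connection():
    """Create an in-memory stand-in for the relational store."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT, instrument TEXT)"
    )
    conn.executemany(
        "INSERT INTO experiment (id, name, instrument) VALUES (?, ?, ?)",
        [(1, "FRET imaging", "confocal microscope"), (2, "Pilot run", None)],
    )
    return conn


def rows_as_triples(conn):
    """Expose each row of the `experiment` table as (subject, predicate, value)."""
    conn.row_factory = sqlite3.Row
    triples = []
    for row in conn.execute("SELECT id, name, instrument FROM experiment ORDER BY id"):
        subject = f"{EX}experiment/{row['id']}"
        for column, predicate in COLUMN_TO_PREDICATE.items():
            if row[column] is not None:  # NULL columns produce no triple
                triples.append((subject, predicate, row[column]))
    return triples


if __name__ == "__main__":
    for triple in rows_as_triples(demo_connection()):
        print(triple)
```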
Resources
The provenance of scientific experiments is queried using SQL or SPARQL.
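As a rough illustration of the SPARQL route, the sketch below builds a SPARQL 1.1 Protocol request with the Python standard library and flattens a SPARQL JSON result. The endpoint URL (host, port, and the repository name `caesar`) is an assumption and must be adapted to the actual deployment; the query uses PROV-O's `prov:used` as a stand-in, since the terms exposed by a given CAESAR installation are not specified here.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; adjust host and repository name to your deployment.
ENDPOINT = "http://localhost:8080/rdf4j-server/repositories/caesar"

QUERY = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?activity ?entity WHERE { ?activity prov:used ?entity . } LIMIT 10
"""


def build_request(endpoint, query):
    """Build a SPARQL 1.1 Protocol POST request (query sent as a form field)."""
    body = urllib.parse.urlencode({"query": query}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/sparql-results+json",
        },
        method="POST",
    )


def parse_bindings(payload):
    """Flatten a SPARQL JSON result document into a list of plain dicts."""
    document = json.loads(payload)
    return [
        {name: cell["value"] for name, cell in binding.items()}
        for binding in document["results"]["bindings"]
    ]


def run_query(endpoint=ENDPOINT, query=QUERY):
    """Send the query; this needs a reachable SPARQL endpoint."""
    with urllib.request.urlopen(build_request(endpoint, query)) as response:
        return parse_bindings(response.read().decode("utf-8"))
```

Calling `run_query()` against a running endpoint returns one dictionary per result row, keyed by the variable names in the `SELECT` clause.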
Resources
The provenance difference of scientific experiments is provided by ProvBook and the version history of experiments is provided by CAESAR.
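The notion of a provenance difference between two executions can be sketched with a plain text diff. This is not ProvBook's implementation: Python's `difflib` is swapped in for illustration, and the function name, labels, and sample outputs are made up.

```python
import difflib


def execution_diff(previous, current, label="cell"):
    """Return a unified diff between two recorded outputs of the same cell."""
    return list(
        difflib.unified_diff(
            previous.splitlines(),
            current.splitlines(),
            fromfile=f"{label} (earlier run)",
            tofile=f"{label} (later run)",
            lineterm="",
        )
    )


if __name__ == "__main__":
    earlier = "mean intensity: 0.42\nn = 10"  # made-up output of a first run
    later = "mean intensity: 0.45\nn = 10"    # made-up output of a second run
    print("\n".join(execution_diff(earlier, later)))
```

An empty result means the two runs produced identical output, which is the simplest check that a step was reproduced.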
Resources
The provenance of scientific experiments is visualized using the Project Dashboard and ProvTrack modules. The Project Dashboard provides an overall view of the experiments conducted for a research project. ProvTrack shows the complete provenance path of a scientific experiment.
The following diagram shows an overview of ProvTrack representing the complete path of a scientific experiment.
Resources
Source Code
Materials and Results for Evaluation
For more information:
- The Story of an Experiment: A Provenance-based Semantic Approach towards Research Reproducibility, Sheeba Samuel, Kathrin Groeneveld, Frank Taubert, Daniel Walther, Tom Kache, Teresa Langenstück, Birgitta König-Ries, H. Martin Bücker and Christoph Biskup, The 11th International Conference on Semantic Web Applications and Tools for Health Care and Life Sciences (SWAT4HCLS 2018), 3-6 December 2018, Antwerp, Belgium (Preprint, Slides, Link)