Research
Our research enhances the reproducibility, explainability, and interoperability of scientific experiments and machine learning models across interdisciplinary domains — including biomedicine, biodiversity, and data science — using provenance, linked data, and knowledge graphs.
Active Research
Computational Reproducibility of Jupyter Notebooks
2021 – present Ongoing
Jupyter notebooks bundle executable code with documentation and output, making them a popular mechanism
for sharing computational workflows. We assess computational reproducibility at scale for notebooks
associated with biomedical publications — mining PubMed Central full texts, locating notebooks on
GitHub, and re-executing them in environments as close to the original as possible. Our study covers
over 27,000 notebooks from 2,660 repositories across two independent runs, identifying key factors
that influence reproducibility and trends in notebook quality over time. Recent work extends this
to automated containerization to close the reproducibility gap.
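As a minimal sketch of the re-execution step, assuming nbformat and nbclient are available and using a hypothetical notebook path, the following runs a notebook top to bottom and reports whether it completes or raises:

```python
# Minimal sketch: re-execute a notebook and report whether it runs to
# completion. This illustrates the re-execution step only; the actual
# study also reconstructs dependencies and compares cell outputs.
import nbformat
from nbclient import NotebookClient
from nbclient.exceptions import CellExecutionError

def try_reexecute(path: str, timeout: int = 600) -> str:
    """Return 'success' or 'exception' for a single notebook."""
    nb = nbformat.read(path, as_version=4)  # load the .ipynb file
    client = NotebookClient(nb, timeout=timeout, kernel_name="python3")
    try:
        client.execute()       # run all cells in order
        return "success"
    except CellExecutionError:
        return "exception"     # a cell raised during execution

print(try_reexecute("analysis.ipynb"))  # hypothetical notebook path
```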
ReproduceMeON — Ontology Network for Reproducibility
2021 – present Ongoing
ReproduceMeON is an ontology network for the reproducibility of scientific studies, bringing together
foundational and core ontologies that capture different aspects of scientific experiment provenance.
The development follows a semi-automated approach that uses ontology matching techniques to select and
build core ontologies for each sub-domain — including scientific experiments, machine learning,
computational workflows, and microscopy — and to link them to existing domain ontologies.
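A toy illustration of the lexical end of such matching, with placeholder ontology file names (real pipelines combine lexical, structural, and semantic signals):

```python
# Toy sketch of a label-based ontology matching step: collect rdfs:label
# strings from two ontologies and report classes whose labels coincide.
# The ontology files named here are placeholders.
from rdflib import Graph, RDFS
from rdflib.namespace import OWL, RDF

def class_labels(path: str) -> dict[str, str]:
    g = Graph().parse(path)  # rdflib guesses the serialization
    return {
        str(label).lower(): str(cls)
        for cls in g.subjects(RDF.type, OWL.Class)
        for label in g.objects(cls, RDFS.label)
    }

a = class_labels("experiment-core.owl")   # placeholder file names
b = class_labels("microscopy-domain.owl")
for label in a.keys() & b.keys():         # shared labels = candidate matches
    print(f"{label}: {a[label]} <-> {b[label]}")
```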
Reproducibility of AI
2021 – present Ongoing
Machine learning is an increasingly important scientific tool, but ML experiments face a reproducibility
crisis similar to that in other disciplines. This project investigates factors beyond source code and dataset
availability that affect ML reproducibility, proposes ways to apply FAIR data practices to ML
workflows, and develops methodologies for end-to-end reproducibility of ML pipelines — including
custom loss functions for robust neural networks and systematic benchmarks for deep learning
reproducibility in biodiversity research.
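One concrete factor of this kind is nondeterminism in the training run itself. A minimal sketch, assuming a PyTorch pipeline rather than the project's own code, of pinning the usual sources of randomness:

```python
# Sketch of one reproducibility factor beyond code and data availability:
# pinning all sources of nondeterminism before training. PyTorch is used
# for illustration only.
import os, random
import numpy as np
import torch

def make_deterministic(seed: int = 42) -> None:
    random.seed(seed)                         # Python's RNG
    np.random.seed(seed)                      # NumPy's RNG
    torch.manual_seed(seed)                   # CPU and CUDA RNGs
    torch.use_deterministic_algorithms(True)  # error out on nondeterministic ops
    torch.backends.cudnn.benchmark = False    # disable autotuned kernel selection
    os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"  # required for deterministic cuBLAS

make_deterministic()
```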
Explainability of AI
2021 – present Ongoing
Deep learning models are widely used in scientific domains, but their internal mechanisms remain opaque,
hindering validation and improvement. This project develops interpretability methods that leverage
domain knowledge — particularly ontologies and knowledge graphs — to produce human-understandable
explanations extracted directly from neural networks. The current focus is plant disease classification
for sustainable agriculture, with extensions to multimodal large language models.
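As an illustrative sketch only, with invented vocabulary and graph content, the basic idea of attaching knowledge-graph context to a prediction can be shown by collecting the statements linked to the predicted class:

```python
# Toy sketch of enriching a classifier's prediction with knowledge-graph
# context: the predicted label is looked up in a small RDF graph and its
# linked statements are returned as a human-readable explanation. The
# namespace, properties, and facts here are invented for illustration.
from rdflib import Graph, Namespace, Literal, RDFS

EX = Namespace("http://example.org/plant-disease#")  # hypothetical namespace
g = Graph()
g.add((EX.LeafRust, RDFS.label, Literal("leaf rust")))
g.add((EX.LeafRust, EX.hasSymptom, Literal("orange pustules on leaves")))
g.add((EX.LeafRust, EX.affects, Literal("wheat")))

def explain(predicted_class):
    """Collect every property/value pair attached to the predicted class."""
    return [(p.n3(g.namespace_manager), str(o))
            for p, o in g.predicate_objects(predicted_class)]

for prop, value in explain(EX.LeafRust):  # e.g. the CNN's top-1 prediction
    print(prop, "->", value)
```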
Past Projects
MLProvLab
2021 – 2023 Completed
A JupyterLab extension that automatically tracks, manages, compares, and visualizes the provenance
of machine learning notebooks — identifying relationships between data and models in ML scripts,
tracking metadata including datasets and modules used, and enabling comparison of different
experimental runs.
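For illustration, one such piece of metadata, the modules a notebook imports, can be recovered statically from its code cells. This sketch is not the extension's implementation:

```python
# Sketch of one kind of metadata MLProvLab-style tooling can record: the
# modules a notebook imports, extracted statically from its code cells.
import ast
import nbformat

def imported_modules(path: str) -> set[str]:
    nb = nbformat.read(path, as_version=4)
    modules: set[str] = set()
    for cell in nb.cells:
        if cell.cell_type != "code":
            continue
        try:
            tree = ast.parse(cell.source)
        except SyntaxError:        # skip cells with magics or broken syntax
            continue
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                modules.update(a.name.split(".")[0] for a in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module:
                modules.add(node.module.split(".")[0])
    return modules

print(imported_modules("train_model.ipynb"))  # hypothetical notebook
```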
ReproduceMeGit
2020 – 2022 Completed
A visualization tool for analyzing the reproducibility of Jupyter notebooks in GitHub repositories.
Users can directly assess the reproducibility of any repository containing Jupyter notebooks,
viewing counts of successful, exception-throwing, and result-differing notebooks — with
RDF provenance export via ProvBook integration.
ProvBook
2018 – 2019 Completed
A Jupyter Notebook extension that captures and visualizes provenance over time using the
REPRODUCE-ME ontology (extended from PROV-O and P-Plan). Enables sharing notebooks with their
RDF provenance, comparing results across executions, and SPARQL querying of experiment histories.
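A sketch of the kind of query this enables, written against plain PROV-O terms and a placeholder export file rather than ProvBook's exact output:

```python
# Sketch of the kind of SPARQL query ProvBook's RDF export enables:
# list recorded executions (modeled here with plain PROV-O terms) and
# their end times. The file name and exact modeling are illustrative.
from rdflib import Graph

g = Graph().parse("notebook-provenance.ttl")  # placeholder export file

query = """
PREFIX prov: <http://www.w3.org/ns/prov#>
SELECT ?activity ?ended
WHERE {
  ?activity a prov:Activity ;
            prov:endedAtTime ?ended .
}
ORDER BY DESC(?ended)
"""
for activity, ended in g.query(query):
    print(activity, ended)
```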
CAESAR — Collaborative Environment for Scientific Analysis with Reproducibility
2016 – 2019 Completed
An end-to-end provenance management framework for scientific experiments. CAESAR allows scientists
to capture, manage, query, and visualize the complete path of an experiment — covering both
computational and non-computational steps — in an interoperable way.
REPRODUCE-ME Ontology
2016 – 2019 Completed
A generic data model and ontology for representing scientific experiments with full provenance.
The model captures eight experiment components (Data, Agent, Activity, Plan, Step, Setting,
Instrument, Material) and extends PROV-O and P-Plan to enable end-to-end reproducibility
from experiment design through to results.
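For a flavor of the modeling, here is a toy instance that uses only the PROV-O and P-Plan terms REPRODUCE-ME extends; its own class names are deliberately not guessed, and the URIs are invented:

```python
# Toy instance data using the PROV-O and P-Plan vocabularies that
# REPRODUCE-ME builds on, sketching how an experiment plan, a step, and
# a run with its inputs and outputs might be described.
from rdflib import Graph

turtle = """
@prefix prov:   <http://www.w3.org/ns/prov#> .
@prefix p-plan: <http://purl.org/net/p-plan#> .
@prefix ex:     <http://example.org/experiment#> .

ex:imaging-protocol a p-plan:Plan .
ex:acquire-image    a p-plan:Step ; p-plan:isStepOfPlan ex:imaging-protocol .
ex:run-42           a prov:Activity ;
                    prov:used ex:sample-a7 ;
                    prov:wasAssociatedWith ex:scientist-1 .
ex:raw-image        a prov:Entity ; prov:wasGeneratedBy ex:run-42 .
"""
g = Graph().parse(data=turtle, format="turtle")
print(len(g), "triples describing one experiment run")
```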
Reproducibility Survey
2016 – 2019 Completed
An exploratory study surveying researchers across disciplines to understand scientific experiments
and research practices relating to reproducibility. Findings identified a reproducibility crisis
and a strong need for sharing data, code, methods, and negative results — with insufficient
metadata and incomplete methods as the primary barriers.
Ontology and Corpus Development for Biodiversity
2021 – 2022 Completed
A core ontology (BiodivOnto) for biodiversity linking foundational and domain-specific ontologies,
paired with two gold-standard corpora (BiodivNERE) for Named Entity Recognition and Relation
Extraction generated from biodiversity dataset metadata and publication abstracts.
Acknowledgements: This research is supported in part by the Deutsche Forschungsgemeinschaft (DFG) in Project Z2 of CRC/TRR 166 ReceptorLight, the Carl Zeiss Foundation (K3 project), the Freistaat Thüringen, the Michael Stifel Centre Jena (MSCJ), and the Friedrich Schiller University Jena (IMPULSE funding: IP 2020-10).