Distributed and Self-organizing Systems

Master's Thesis

The role of Docker in Computational Reproducibility of Jupyter Notebooks from Scholarly Publications PubMed Central

Completion

2025/02

Research Area

Web Engineering

Students

Hemanta Lo

student

Advisers

samuel

gaedke

Description

Trustworthy science requires reproducibility, which remains a challenge in computational research. The main cause is differences in software, operating systems, and dependencies, which lead to inconsistent results, especially with Jupyter notebooks. Biomedical researchers often use these notebooks to document and share experiments, but the notebooks depend on specific software setups that are difficult for other researchers to recreate, which makes the findings hard to verify. This study proposes using Docker to create consistent environments that capture the original research setting and replicate it exactly. The project introduces a reproducibility pipeline that uses Docker containers to standardize computational environments. Although biomedical publications serve as the test case, the methodology is designed to apply across diverse research domains. Running the notebooks inside these Docker environments lets us attempt to replicate the original results in a controlled setting. For each step, we document the techniques used as well as any errors or differences from the original experiments, down to the smallest detail in the notebooks. Testing on a limited set of five biomedical repositories showed promising results. Docker played a pivotal role in addressing reproducibility challenges by providing a controlled, isolated execution environment, and detailed logging and analysis allowed errors to be identified and resolved precisely.
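To make the idea of a Docker-based reproducibility pipeline more concrete, the following is a minimal Python sketch of what one step could look like: generating a Dockerfile for a repository and assembling the commands that build the image and execute a notebook inside it. It is not the thesis's actual implementation. The function names, the `python:<version>-slim` base image, the presence of a `requirements.txt`, and the `/work` and `/out` paths are all assumptions made for illustration.

```python
from __future__ import annotations

from pathlib import PurePosixPath


def render_dockerfile(python_version: str, requirements_file: str = "requirements.txt") -> str:
    """Return Dockerfile text that pins the interpreter and installs dependencies.

    Assumes (hypothetically) that the repository ships a pip requirements file.
    """
    lines = [
        f"FROM python:{python_version}-slim",
        "WORKDIR /work",
        "COPY . /work",
        # nbconvert and ipykernel are needed to execute notebooks headlessly.
        "RUN pip install --no-cache-dir nbconvert ipykernel",
        f"RUN pip install --no-cache-dir -r {requirements_file}",
    ]
    return "\n".join(lines) + "\n"


def build_command(image_tag: str, context_dir: str) -> list[str]:
    """Command that builds the image from the repository checkout."""
    return ["docker", "build", "-t", image_tag, context_dir]


def execute_command(image_tag: str, notebook: str, host_out_dir: str,
                    timeout_s: int = 600) -> list[str]:
    """Command that runs one notebook in the container and keeps the executed copy.

    The executed notebook is written to /out, which is bind-mounted from the host
    so that outputs and error tracebacks survive after the container is removed.
    """
    out_name = PurePosixPath(notebook).stem + ".executed.ipynb"
    return [
        "docker", "run", "--rm",
        "-v", f"{host_out_dir}:/out",
        image_tag,
        "jupyter", "nbconvert", "--to", "notebook", "--execute",
        f"--ExecutePreprocessor.timeout={timeout_s}",
        "--output-dir", "/out",
        "--output", out_name,
        notebook,
    ]


if __name__ == "__main__":
    # Hypothetical repository and notebook names, for illustration only.
    print(render_dockerfile("3.8"))
    print(" ".join(build_command("repo1:py38", "./repo1")))
    print(" ".join(execute_command("repo1:py38", "analysis/fig1.ipynb", "/tmp/repo1-out")))
```

The commands are returned as argument lists rather than executed directly, so a driver script could pass them to `subprocess.run` and log the exit code and stderr for each notebook, which is one way to obtain the kind of detailed per-notebook error record the description mentions.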

