Workshop

Version Everything: From Chaos to Order in Reproducible Python Projects

Thursday, May 28

14:40 - 16:40
Room: Tigelle
Language: English
Audience level: Beginner

Elevator pitch

Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. Sound familiar? In this workshop, you’ll learn the practices and tools to leave these problems behind!

Abstract

Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. The client’s environment throws mysterious errors. Sound familiar?

This hands-on workshop teaches you to build reproducible workflows using a practical approach that addresses real challenges teams face when sharing code, collaborating on research, or deploying data pipelines.

You’ll learn to:

  • Lock dependencies and manage isolated Python environments
  • Version control your code and your data
  • Externalize parameters using configuration files
  • Containerize your application for consistent deployment
  • Apply collaboration practices that scale with your team
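
As a small taste of externalizing parameters, here is a minimal sketch rather than the workshop's actual material. The file name `params.json`, the keys `test_size` and `random_seed`, and the helper `load_params` are all hypothetical, chosen only for illustration. The idea is that analysis code reads its settings from a file kept under version control instead of hard-coding them.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical defaults; a real project would choose its own keys.
DEFAULTS = {"test_size": 0.2, "random_seed": 42}


def load_params(path):
    """Read parameters from a JSON file, using DEFAULTS for any missing key."""
    with open(path, encoding="utf-8") as fh:
        overrides = json.load(fh)
    # Values from the file take precedence over the defaults.
    return {**DEFAULTS, **overrides}


# Demo: write a small config to a temporary directory and load it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "params.json"
    config_path.write_text(json.dumps({"test_size": 0.3}), encoding="utf-8")
    params = load_params(config_path)

print(params)
```

Because the parameters live in a file, a change to `test_size` shows up in the Git history next to the code that uses it.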

The workshop is ideal for data scientists, researchers, and Python developers with intermediate experience who are tired of “works on my machine” syndrome. You’ll gain hands-on experience with modern tools and practices that make Python workflows reproducible, maintainable, and easy to share, all while applying them to simple data science tasks.

Starting with a messy but working data analysis project, we’ll systematically add reproducibility layers through guided coding exercises.

Modules:

  • Modern Dependency Management (20 min): Creating lock files, managing Python versions
  • Code & Configuration Versioning (30 min): Git for source code, configuration files for parameters
  • Data Pipeline Versioning (30 min): DVC setup, pipeline definitions, experiment tracking
  • Hidden Reproducibility Challenges (10 min): Randomness and human error
  • Production Deployment (30 min): Containerization, artifact registries, deployment reproducibility
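
To illustrate the randomness challenge named in the modules above, here is a minimal sketch under stated assumptions. The function `sample_rows` and the seed value are hypothetical and not taken from the workshop. It shows one common way to make a random step repeatable: pass an explicit seed to a local generator instead of relying on global state.

```python
import random


def sample_rows(rows, k, seed):
    """Draw k rows without replacement using a generator seeded explicitly."""
    # A local Random instance avoids hidden dependence on global random state.
    rng = random.Random(seed)
    return rng.sample(rows, k)


rows = list(range(100))
first = sample_rows(rows, 5, seed=42)
second = sample_rows(rows, 5, seed=42)

# The same seed and the same input give the same sample on every call.
print(first == second)
```

Storing the seed in a configuration file alongside the other parameters keeps the whole run reproducible from a single versioned source.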

Prerequisites:

  • A laptop with admin privileges to install tooling
  • Basic knowledge of Python syntax and the command line

Tags: Data Engineering, Scientific Python, Data Science & Data Visualisation
Participant

Aris Nivorlis

Aris Nivorlis is a research geophysicist and data steward at Deltares, where he uses data and tooling to answer complex questions about the subsurface. He is passionate about promoting good practices in data management and scientific coding, helping teams build sustainable and reproducible workflows. Outside of work, Aris is actively involved in the European Python community, contributing to the organization and support of conferences and community initiatives. When he’s not at his computer, you’ll likely find him dancing salsa.