Version Everything: From Chaos to Order in Reproducible Python Projects

Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. Sound familiar? In this workshop, you’ll learn the practices and tools to leave these problems behind!

Your analysis works perfectly on your laptop. Three weeks later, it breaks on the server. Your colleague can neither run your code nor reproduce your results. The client’s environment throws mysterious errors. Sound familiar?

This hands-on workshop teaches you to build reproducible workflows using a practical approach that addresses real challenges teams face when sharing code, collaborating on research, or deploying data pipelines.

You’ll learn to:

Lock dependencies and manage isolated Python environments
Version control your code and your data
Externalize parameters using configuration files
Containerize your application for consistent deployment
Apply collaboration practices that scale with your team
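
As a rough illustration of externalizing parameters, the sketch below reads analysis settings from a JSON file instead of hard-coding them. The file name params.json and the keys test_size and random_seed are hypothetical, chosen only for this example; the workshop may use a different configuration format or tool.

```python
import json
import tempfile
from pathlib import Path


def load_params(path):
    """Read analysis parameters from a JSON file instead of hard-coding them."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


# Hypothetical parameter names, used only for illustration.
default_params = {"test_size": 0.2, "random_seed": 42}

# Write the config to a temporary directory so the example is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "params.json"
    config_path.write_text(json.dumps(default_params), encoding="utf-8")
    params = load_params(config_path)

print(params["test_size"], params["random_seed"])
```

Keeping such a file under version control next to the code means a change in parameters shows up in the project history like any other change.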

The workshop is ideal for data scientists, researchers, and Python developers with intermediate experience who are tired of “works on my machine” syndrome. You’ll gain hands-on experience with modern tools and practices that make Python workflows reproducible, maintainable, and easy to share, all while applying them to simple data science tasks.

Starting with a messy but working data analysis project, we’ll systematically add reproducibility layers through guided coding exercises.

Modules:

Modern Dependency Management (20 min): Creating lock files, managing Python versions
Code & Configuration Versioning (30 min): Git for source code, configuration files for parameters
Data Pipeline Versioning (30 min): DVC setup, pipeline definitions, experiment tracking
Hidden Reproducibility Challenges (10 min): Randomness and human error
Production Deployment (30 min): Containerization, artifact registries, deployment reproducibility
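
To make the randomness challenge concrete, here is a minimal sketch, assuming only the Python standard library, of how an explicit seed makes a random sample repeatable. The function name sample_rows and the seed value are hypothetical and not taken from the workshop material.

```python
import random


def sample_rows(n_rows, k, seed):
    """Pick k row indices out of n_rows using a dedicated, seeded generator."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    return rng.sample(range(n_rows), k)


# Two runs with the same seed select the same rows.
first = sample_rows(100, 5, seed=42)
second = sample_rows(100, 5, seed=42)
print(first == second)
```

Passing the seed as an argument, rather than relying on global state, makes it easy to record the value alongside the other run parameters.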

Prerequisites:

A laptop with admin privileges to install tooling
Basic knowledge of Python syntax and the command line

Thursday, May 28

14:40 - 16:40

Aris Nivorlis