Workshop

Measuring AI-Readiness: A Three-Axis Maturity Model for Agent-Optimized Codebases

Friday, May 29

11:00 - 13:00
Room: Piadina
Language: English
Audience level: Intermediate
Seats left: 12
    Elevator pitch

    Your AI coding agent burns tokens, takes wrong turns, and breaks things. Not because the model is weak, but because your repository was written for humans, not for agents. In this 2-hour workshop you won’t watch slides about a framework. You’ll bring your own codebase and walk out with a measured AI-Readiness score, a ranked list of fixes, and the scripts to repeat the measurement in CI.

    Abstract

    AI agents fail inside real repositories for reasons that static lint rules and architecture diagrams don’t capture: context layouts that waste tokens, structures that hide intent, validation loops that never close. “AI-Readiness” is a measurable property of a codebase, and like any measurable property you can audit it, improve it, and track it over time.

    This workshop is not a tour of a pre-packaged scoring framework. It is a working session in which you learn a repeatable four-stage process and apply it, live, to a repository you actually care about.

    The process you will learn

    You will work on your own repository through four stages:

    Scan

    Run a diagnostic on your codebase across a set of weighted dimensions (agent instructions, project navigability, testing and validation, CI/CD, spec-driven workflow, skills and tooling, documentation, agent-specific configuration).

    Output: a 0-100 baseline score and a per-dimension breakdown that shows exactly where the repo is leaking agent effort.
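    To make the scoring idea concrete, here is a minimal sketch of how a weighted 0-100 baseline could be computed from per-dimension scores. The dimension keys mirror the list above, but the weights and the function names (`baseline_score`, `breakdown`) are illustrative assumptions, not the workshop's actual instrument.

```python
# Hypothetical weights: the workshop's real weighting is not given in this abstract.
WEIGHTS = {
    "agent_instructions": 0.20,
    "project_navigability": 0.15,
    "testing_and_validation": 0.20,
    "ci_cd": 0.10,
    "spec_driven_workflow": 0.10,
    "skills_and_tooling": 0.10,
    "documentation": 0.10,
    "agent_specific_configuration": 0.05,
}


def baseline_score(dimension_scores):
    """Combine per-dimension scores (each 0-100) into one weighted 0-100 score."""
    missing = set(WEIGHTS) - set(dimension_scores)
    if missing:
        raise ValueError(f"missing dimensions: {sorted(missing)}")
    total_weight = sum(WEIGHTS.values())
    weighted = sum(WEIGHTS[name] * dimension_scores[name] for name in WEIGHTS)
    return round(weighted / total_weight, 1)


def breakdown(dimension_scores):
    """Return (dimension, points lost) pairs, largest loss first."""
    total_weight = sum(WEIGHTS.values())
    losses = {
        name: WEIGHTS[name] / total_weight * (100 - dimension_scores[name])
        for name in WEIGHTS
    }
    return sorted(losses.items(), key=lambda item: item[1], reverse=True)
```

    Sorting by points lost rather than by raw dimension score keeps a low score on a lightly weighted dimension from crowding out a moderate gap on a heavily weighted one.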

    Report

    Turn the raw scan into a prioritized roadmap.

    • Which dimensions cost the most tokens?
    • Which fixes have the best ROI per hour of engineering work?
    • What is invisible to an agent today that a small change would surface?
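    The ROI question above can be sketched as a simple ranking. The `Fix` record, its fields, and the idea of estimating gain in score points are assumptions made for illustration; the actual report format may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fix:
    """A candidate intervention (hypothetical structure)."""

    name: str
    expected_gain: float  # estimated score points gained
    hours: float  # estimated engineering effort

    def __post_init__(self):
        if self.hours <= 0:
            raise ValueError("hours must be positive")

    @property
    def roi(self):
        """Estimated score points gained per hour of work."""
        return self.expected_gain / self.hours


def rank_fixes(fixes):
    """Order fixes by estimated points per hour, best first."""
    return sorted(fixes, key=lambda fix: fix.roi, reverse=True)
```

    The estimates themselves still come from human judgment; the ranking only makes the trade-off explicit.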

    Fix

    Apply targeted interventions.

    Some are auto-generated:

    • missing CLAUDE.md
    • environment templates
    • CI scaffolding
    • assertion messages
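    As a rough illustration of the auto-generated kind of fix, the sketch below creates placeholder files only where they are missing and never overwrites existing ones. The file list, the stub contents, and the `scaffold_missing` helper are hypothetical; the workshop's generators are not described in this abstract.

```python
from pathlib import Path

# Hypothetical stubs: real templates would be tailored to the repository.
TEMPLATES = {
    "CLAUDE.md": "# Agent instructions\n\nTODO: describe build, test, and layout conventions.\n",
    ".env.example": "# Copy to .env and fill in the values.\n",
}


def scaffold_missing(repo_root, templates=TEMPLATES):
    """Create each template file that does not exist yet; return the paths created."""
    root = Path(repo_root)
    created = []
    for relative_path, content in templates.items():
        target = root / relative_path
        if target.exists():
            continue  # never overwrite files the team already maintains
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(content, encoding="utf-8")
        created.append(relative_path)
    return created
```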

    Others require human judgment:

    • restructuring a spec workflow
    • rewriting an architecture overview

    You will do both, live, on your repo.

    Diff

    Re-run the scan and measure the delta.

    • Quantify the improvement
    • Identify what is left
    • Wire the scan into CI so AI-Readiness becomes a tracked metric, not a one-off audit
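    A minimal sketch of the diff step, assuming each scan yields a mapping from dimension name to a 0-100 score. The function names, the target of 80, and the one-point tolerance are illustrative assumptions, not part of the workshop's scripts.

```python
def score_delta(before, after):
    """Per-dimension change between two scans, for dimensions present in both."""
    return {name: after[name] - before[name] for name in before if name in after}


def remaining_gaps(after, target=80.0):
    """Dimensions still below the target score, in alphabetical order."""
    return sorted(name for name, score in after.items() if score < target)


def ci_exit_code(current, baseline, tolerance=1.0):
    """Return 0 if the score held within tolerance of the baseline, else 1."""
    return 0 if current >= baseline - tolerance else 1
```

    In a CI job, the value of `ci_exit_code` could be passed to `sys.exit` so that a regression in the score fails the build.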

    The three-axis model from the original research (efficiency, navigability, verifiability) still anchors the analysis, but you will work with it through instruments rather than slides.

    What you take home

    • A scored AI-Readiness baseline for your repository.
    • A prioritized intervention plan with expected impact per fix.
    • The measurement scripts (open-source) to re-run the audit on demand or in CI.
    • A reusable protocol you can hand to your team or apply to other repos.

    Prerequisites

    • Laptop with Python 3.10+ and Docker.
    • A repository you can run locally, ideally one your team actually uses, even if small.
    • API credits for one supported LLM provider (Anthropic, OpenAI, GLM, MiniMax, or Kimi).
    • Comfort with the command line and reading Python.

    Why a process, not a framework

    Pre-packaged scores tell you that something is wrong. A process tells you how to find out what is wrong in code the framework’s authors never saw, which is your own. By the end of the workshop you should be able to onboard a new repository to AI-Readiness measurement without me in the room.

    Tags: ML and AI, Code Analysis and Typing
    Participant

    Stefano Maestri

    With over 25 years of experience in enterprise software development and AI engineering, I explore how artificial intelligence is transforming the way we build, learn, and live with technology. I have a long-standing passion for Open Source and a more recent one for AI Engineering.

    ✍️ I write two newsletters (Italian and English). 🎧 I co-host a podcast (Italian only) about AI, agents, and the future. 💻 Maintainer of the official A2A specification implementation and TCK under the Linux Foundation.