Git Metrics as Compliance Evidence, Not Developer Theatre

Everyone wants “engineering metrics” until they actually need them. Then it turns into a week of screenshot archaeology, arguments about what Azure DevOps is showing, and a sad spreadsheet that gets emailed around like it is the truth. It is not the truth. It is a story people tell when they do not have an auditable data source.

We hit that point recently. We needed a way to pull management and compliance signals out of Git across our Azure DevOps estate. Not one repo. Not one project. All of it. And we wanted it to land in the same place as everything else we treat as compliance evidence, in our observability S3, so it can be unpacked into our data lake and reported in Grafana.

This post is the how and the why, because this is the kind of plumbing that nobody celebrates, but everyone depends on.


What problem are we solving

A plain Git repo contains plenty of data that matters: commit history, change size, file churn, hotspots, merge behaviour, revert patterns. This stuff tells you where your risk is, where your maintenance costs live, and whether your delivery is stable or chaotic. It also gives you evidence for change control in environments where audit questions are not friendly.

Azure DevOps can show bits of this, per repo, per project, in a way that looks nice until you try to standardise it across the organisation. Then it becomes inconsistent, hard to compare, and hard to export in a form that can be treated as a source of truth. We wanted something repeatable, centralised, and boring. Boring is good. Boring survives audits.


The pattern we used

We stuck to our house rule for DevOps: keep YAML dumb. No logic in the pipeline. No half-baked bash embedded in YAML. The pipeline calls a template, the template calls scripts, and the scripts do the work. That separation matters because YAML is terrible for logic, terrible for testing, and it encourages quick hacks that become permanent.

At the top level we now have a single pipeline that accepts a list of Azure DevOps projects. That list is the scope of the scan. The pipeline runs on a schedule or on demand, and it produces a single evidence bundle for the run.


How it works, end to end

Step 1 is scope. We specify the projects we want to scan in the pipeline parameters. This matters because in real organisations you do not have one project, you have many. Some are legacy, some are product, some are platform, some exist because someone clicked the wrong thing in 2021. We want all of them.


# Pipeline parameter example
projects: "AppGenie FTP2SF Project, Core Platform, Internal Tools"
                        

Step 2 is repo discovery. The script uses the Azure DevOps REST API to enumerate repos in each project. This includes pagination handling because Azure DevOps will happily hand you partial results and pretend that is complete. We capture the repo list as an artifact as well, because the list of repos scanned is part of the evidence.
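
For illustration, discovery boils down to one authenticated REST call per project plus a loop over continuation tokens. A minimal Python sketch, assuming the pipeline token is passed in as a bearer token; function and parameter names are illustrative, not our actual script:

# Sketch: enumerate the Git repos in one Azure DevOps project (illustrative)
import requests

def list_repos(organization: str, project: str, token: str) -> list[dict]:
    """Enumerate the Git repos in one project.

    Follows the x-ms-continuationtoken response header if the service
    pages the results, so a partial page is never mistaken for the
    complete list.
    """
    url = f"https://dev.azure.com/{organization}/{project}/_apis/git/repositories"
    headers = {"Authorization": f"Bearer {token}"}
    params = {"api-version": "7.1"}
    repos: list[dict] = []
    while True:
        resp = requests.get(url, headers=headers, params=params, timeout=30)
        resp.raise_for_status()
        repos.extend(resp.json().get("value", []))
        continuation = resp.headers.get("x-ms-continuationtoken")
        if not continuation:
            return repos
        params["continuationToken"] = continuation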

Step 3 is cloning. For each repo we clone using the pipeline OAuth token. This is important: the pipeline identity needs read access, and you must enable “Allow scripts to access OAuth token”. Without that, you have a pipeline that looks like it ran but collected nothing useful.
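
The clone itself is stock git with the token injected as an extra HTTP header, which is the usual way to use System.AccessToken from a script. A hedged sketch, with the destination layout left illustrative:

# Sketch: clone one repo read-only with the pipeline OAuth token (illustrative)
import subprocess
from pathlib import Path

def clone_repo(remote_url: str, token: str, dest: Path) -> None:
    """Clone one repo using the pipeline OAuth token.

    The token goes in an Authorization header via -c http.extraheader,
    so it applies only to this invocation and is never written into the
    cloned repo's config or embedded in the URL.
    """
    dest.parent.mkdir(parents=True, exist_ok=True)
    subprocess.run(
        [
            "git",
            "-c", f"http.extraheader=AUTHORIZATION: bearer {token}",
            # --no-checkout skips the working tree; history alone is enough
            # for log-based metrics.
            "clone", "--no-checkout", remote_url, str(dest),
        ],
        check=True,
    )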

Step 4 is collection. For each cloned repo we run a set of Git-only metric queries. No PR data, no work items, no external systems. This is deliberate. If all you have is Git, you still get a lot of truth.
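
Most of the signal comes from a single pass over git log. A sketch of the kind of parsing involved; the record fields are illustrative and this is a simplification, not our actual script:

# Sketch: parse `git log --numstat` into one record per commit (illustrative)
import subprocess
from pathlib import Path

def read_commits(repo: Path) -> list[dict]:
    """Return one record per commit: hash, author, timestamp, whether it
    is a merge (more than one parent), lines added and deleted, and the
    list of files touched."""
    fmt = "@@%H|%ae|%aI|%P"  # marker, hash, author email, ISO date, parent hashes
    out = subprocess.run(
        ["git", "-C", str(repo), "log", "--all", "--numstat", f"--pretty=format:{fmt}"],
        capture_output=True, text=True, check=True,
    ).stdout

    commits: list[dict] = []
    for line in out.splitlines():
        if line.startswith("@@"):
            sha, author, date, parents = line[2:].split("|")
            commits.append({
                "sha": sha, "author": author, "date": date,
                "is_merge": len(parents.split()) > 1,
                "added": 0, "deleted": 0, "files": [],
            })
        elif line.strip() and commits:
            added, deleted, path = line.split("\t", 2)
            # Binary files report "-" for added/deleted; count them as zero churn.
            commits[-1]["added"] += int(added) if added.isdigit() else 0
            commits[-1]["deleted"] += int(deleted) if deleted.isdigit() else 0
            commits[-1]["files"].append(path)
    return commits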

We generate outputs that are designed for downstream analytics: repo-metrics.json for a summary, commit-metrics.csv for time series and distribution analysis, and hotspots.csv for "what files keep changing". If you want to add more later, you add it in the script, not in YAML. That way you follow Mark's coding mantra: write code once, write it right, and reuse it.
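
From per-commit records like the hypothetical read_commits output above, the three files are straightforward aggregation. A sketch with illustrative column names:

# Sketch: write the three per-repo artefacts (illustrative)
import csv
import json
from collections import Counter
from pathlib import Path

def write_outputs(commits: list[dict], out_dir: Path) -> None:
    """Write repo-metrics.json, commit-metrics.csv, and hotspots.csv."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # repo-metrics.json: one summary document per repo.
    summary = {
        "commit_count": len(commits),
        "merge_count": sum(c["is_merge"] for c in commits),
        "author_count": len({c["author"] for c in commits}),
        "lines_added": sum(c["added"] for c in commits),
        "lines_deleted": sum(c["deleted"] for c in commits),
    }
    (out_dir / "repo-metrics.json").write_text(json.dumps(summary, indent=2))

    # commit-metrics.csv: one row per commit, for cadence and change-size analysis.
    with (out_dir / "commit-metrics.csv").open("w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["sha", "author", "date", "is_merge", "added", "deleted"])
        for c in commits:
            w.writerow([c["sha"], c["author"], c["date"], c["is_merge"], c["added"], c["deleted"]])

    # hotspots.csv: how often each path changes, most-churned first.
    churn = Counter(path for c in commits for path in c["files"])
    with (out_dir / "hotspots.csv").open("w", newline="") as f:
        w = csv.writer(f)
        w.writerow(["path", "change_count"])
        w.writerows(churn.most_common())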

Step 5 is bundling. The orchestrator script writes everything into a run folder with a stable structure that includes the project name and repo name. That structure makes it easy for the downstream Lambda to unpack and load into the lake without having to guess what anything is.
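
The bundling is deliberately dull: derive a stable path from run, project, and repo, and put every output under it. A small sketch (the RUN_ID handling is illustrative), with the resulting layout shown below:

# Sketch: per-repo output directory for one run (illustrative)
from pathlib import Path

def repo_out_dir(root: Path, run_id: str, project: str, repo: str) -> Path:
    """Return the per-repo output directory for one run.

    Project and repo names are baked into the path so the downstream
    Lambda can attribute every file without any extra lookups.
    """
    path = root / "git-metrics" / run_id / "projects" / project / "repos" / repo
    path.mkdir(parents=True, exist_ok=True)
    return path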


out/git-metrics/<RUN_ID>/
  summary.json
  repo-list.json
  projects/<PROJECT>/
    project.json
    repos/<REPO>/
      repo-metrics.json
      commit-metrics.csv
      hotspots.csv
                        

Step 6 is evidence ingest. Instead of inventing a new pipeline just for metrics, we reuse the compliance ingest plumbing we already have for everything else: compliance, policy generation, evidence recording. Everything gets logged into the same location, so the pipeline uploads the bundle into our observability S3 using the same approach. That means we get consistent metadata, consistent access patterns, and a consistent audit trail.
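
The upload side looks the same as any other evidence bundle. A hedged boto3 sketch; the bucket, key prefix, and metadata keys are illustrative, and the real pipeline goes through the shared ingest scripts rather than calling S3 directly like this:

# Sketch: upload the run folder to the observability bucket (illustrative)
from pathlib import Path

import boto3

def upload_bundle(bundle_root: Path, run_id: str, bucket: str) -> None:
    """Push the run folder into the bucket, file by file.

    The object key mirrors the local folder structure so the downstream
    Lambda can unpack it into the lake without guessing what anything is.
    """
    s3 = boto3.client("s3")
    for path in bundle_root.rglob("*"):
        if not path.is_file():
            continue
        key = f"git-metrics/{run_id}/{path.relative_to(bundle_root).as_posix()}"
        s3.upload_file(
            str(path),
            bucket,
            key,
            ExtraArgs={"Metadata": {"run-id": run_id, "source": "azure-devops"}},
        )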

Step 7 is reporting. The pipeline output is not the end goal. The end goal is a repeatable data feed that lands in the lake, where Grafana can read it and visualise it. We are not trying to build a dashboard inside Azure DevOps. We are treating Azure DevOps as a data source and S3 as our system of record.


What metrics we get from stock Git

This is where people get excited and then immediately make a mess. The goal is not to measure humans. The goal is to measure flow, risk, and stability. From Git alone we can reliably compute things like commit volume and cadence, active contributor counts, change size distribution, hotspots by churn, and merge behaviour. This is enough to answer real questions.

Which repos are high churn and therefore high operational cost? Which directories are hotspots and therefore the best targets for refactoring or test investment? Which repos show unusual merge behaviour that matches the kind of PR bloat that causes the “Azure DevOps is lying” arguments? Which weeks show big spikes in change size that correlate with release risk?

None of this requires a new tool. It requires discipline and a pipeline that runs consistently.


Why this matters for compliance

In compliance conversations, “we have a process” is not evidence. Evidence is timestamped, repeatable, and independently verifiable. A Git metrics bundle stored in S3 with a known structure and a known ingest path is evidence. It shows change activity, it shows where change concentrates, and it shows how your engineering system behaves over time. That supports change control narratives and it supports continuous monitoring narratives, without requiring people to manually curate stories.

It also gives you a baseline. Once you have a baseline, you can detect drift. Drift is what bites teams: branching models that slowly degrade, repos that quietly become unmaintainable, and “quick fixes” that turn into permanent operational risk.


Closing

What we built is not complicated. That is the point. A simple pipeline kicks off a template. The template calls scripts. The scripts scan every repo in every specified Azure DevOps project, generate consistent outputs, and push the bundle into observability S3 through our compliance ingest. A Lambda unpacks it to the data lake. Grafana reports the truth.

The result is a management and compliance view of engineering that does not depend on opinions, screenshots, or whoever yells loudest in a standup. It is automated, repeatable, and boring. Boring is what you want when someone asks “prove it”.