Co-Change Analysis
Scrubby learns which files change together from your git history and flags incomplete changesets before they become bugs.
Every codebase has files that should change together. When you update a database model, you probably need a migration. When you change an API endpoint, the client code likely needs updating too. These relationships aren’t enforced by the compiler — they live in your team’s collective knowledge.
When someone forgets to update a related file, the result is a bug that passes code review, passes CI, and breaks in production.
Co-change analysis is Scrubby’s defense against that class of bug.
How it works
Scrubby analyzes your git commit history to build a co-change model:
- History ingestion. Scrubby reads your commit history, tracking which files changed in each commit. No author data or other PII is stored, only what changed and when.
- Pattern detection. Files that change together in more than 50% of commits are flagged as co-change pairs. This threshold is high enough to filter out coincidental co-changes.
- Gap detection. When a PR changes one file but not its co-change partner, Scrubby flags the gap.
The 50% threshold is global to changeset analysis — it’s the same value used by both the GitHub App’s PR review and the MCP scrubby_review_changeset tool.
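The pattern-detection and gap-detection steps can be sketched in a few lines of Python. This is an illustrative sketch, not Scrubby's implementation: the function names, the sample file paths, and the reading of "more than 50% of commits" as "more than 50% of the commits that touch file A also touch file B" are all assumptions made for the example.

```python
from collections import Counter
from itertools import permutations

THRESHOLD = 0.5  # the 50% threshold described above


def co_change_partners(commits, threshold=THRESHOLD):
    """Map each file to the files it usually changes with.

    ASSUMPTION: B is a partner of A when more than `threshold` of the
    commits touching A also touch B. Scrubby's exact formula may differ.
    """
    touched = Counter()   # file -> number of commits touching it
    together = Counter()  # (a, b) -> number of commits touching both
    for files in commits:
        unique = sorted(set(files))
        touched.update(unique)
        together.update(permutations(unique, 2))

    partners = {}
    for (a, b), count in together.items():
        if count / touched[a] > threshold:
            partners.setdefault(a, set()).add(b)
    return partners


def find_gaps(changed_files, partners):
    """Return (changed file, missing partner) pairs for one changeset."""
    changed = set(changed_files)
    return sorted(
        (path, partner)
        for path in changed
        for partner in partners.get(path, ())
        if partner not in changed
    )


# Hypothetical history: each entry is the list of files in one commit.
history = [
    ["api/orders.py", "client/orders.ts"],
    ["api/orders.py", "client/orders.ts", "README.md"],
    ["api/orders.py", "client/orders.ts"],
    ["api/orders.py"],
    ["README.md"],
]

partners = co_change_partners(history)
gaps = find_gaps(["api/orders.py"], partners)
print(gaps)
```

In this sample history, `api/orders.py` and `client/orders.ts` change together often enough to pair up, while `README.md` sits at exactly 50% with each of them and therefore stays below the "more than 50%" bar. A changeset that touches only `api/orders.py` is reported as a gap.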
Where you see it
PR reviews
Co-change gaps are the first thing Scrubby checks in every PR review. Findings appear as a list of “files you may have forgotten” in the PR comment. Common catches:
- Model change without migration.
- API endpoint change without client update.
- Code change without test update.
- Controller change without route update.
Changeset review (MCP)
Before committing, you can run scrubby_review_changeset on your changed files. It returns co-change gaps, consistency warnings, and domain crossings — giving you a chance to catch issues before pushing.
In your AI editor:
"Run scrubby_review_changeset on my staged files."
Accuracy and history depth
Co-change detection improves with more git history. Repositories with hundreds of commits have stronger patterns than newly created repos. Scrubby ingests recent history on the first index and continues to absorb new commits incrementally as they land.
The 50% threshold prevents false positives from one-time coincidental changes. It also means new pairs take real evidence to surface — a single co-change won’t flip the relationship.
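A quick calculation shows why one coincidental co-change does not form a pair for a file with established history. The sketch below assumes, purely for illustration, that the ratio is measured against the commits touching one file; `co_changes_needed` is a hypothetical helper, not part of Scrubby.

```python
import math


def co_changes_needed(commits_touching_file, threshold=0.5):
    """Smallest co-change count whose share strictly exceeds the threshold.

    ASSUMPTION: the share is co-changes divided by commits touching the file.
    """
    return math.floor(commits_touching_file * threshold) + 1


for total in (4, 20, 200):
    share_of_one = 1 / total
    print(
        f"{total} commits: one co-change is {share_of_one:.1%}, "
        f"a pair needs {co_changes_needed(total)}"
    )
```

Under this assumption, the more often a file has changed, the more repeated evidence a new partner needs before it crosses the threshold.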
Limitations
- No history, no co-change. A brand-new repo with three commits doesn’t have enough signal to form pairs. Scrubby will still review for conventions and domain crossings, but the co-change layer needs evidence.
- File renames. A rename that shows up in history as a delete plus an add, rather than as a detected rename, breaks the chain of history. Git does not record renames explicitly (even with git mv); it infers them from content similarity. Scrubby also tries to detect renames via similarity, but this is heuristic; very large renames may need a re-index after the dust settles.
- Force-pushed history. Squashing or rewriting history will reset the co-change view of the affected commits on the next index. The model rebuilds; nothing is lost permanently.
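To illustrate how similarity-based rename detection can keep a file's history connected, here is a minimal sketch that follows rename entries in the output of `git log --name-status -M --reverse --pretty=format:`. Git prints a detected rename as `R<score>`, the old path, and the new path, separated by tabs. How Scrubby handles renames internally is not documented here; `canonical_paths` and the sample paths are hypothetical.

```python
def canonical_paths(name_status_lines):
    """Map every historical path to its latest known name.

    Lines must be oldest-first, in git's tab-separated --name-status format.
    Blank lines between commits are ignored.
    """
    latest = {}
    for line in name_status_lines:
        parts = line.rstrip("\n").split("\t")
        status = parts[0]
        if status.startswith("R") and len(parts) == 3:
            old, new = parts[1], parts[2]
            # Re-point every path that currently resolves to `old`.
            for path, target in latest.items():
                if target == old:
                    latest[path] = new
            latest[old] = new
            latest.setdefault(new, new)
        elif status == "A" and len(parts) == 2:
            # A newly added file starts its own chain.
            latest[parts[1]] = parts[1]
        elif len(parts) == 2:
            latest.setdefault(parts[1], parts[1])
    return latest


# Hypothetical log excerpt: a file is renamed twice.
log_lines = [
    "A\tmodels/user.py",
    "",
    "M\tmodels/user.py",
    "",
    "R092\tmodels/user.py\tmodels/account.py",
    "",
    "R100\tmodels/account.py\tdomain/account.py",
]

paths = canonical_paths(log_lines)
print(paths["models/user.py"])
```

If git reports a rename as a separate `D` and `A` instead (for example, because the file was heavily edited in the same commit), no `R` entry appears and the two paths stay unconnected, which is the failure mode described above.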