Scrubby

scrubby_index

Index a repository for Scrubby code review. The first call against a repo does a full scan: file tree, imports, git history, domain discovery, segment analysis, and convention extraction. Subsequent calls can be incremental. In remote mode, indexing runs through the GitHub App.

Parameters

Name | Type | Required | Description
repo_path | string | No | GitHub repo name in owner/repo format (e.g. ScrubbyAI/api). Required if no repo_id is provided.
incremental | boolean | No | If true, only index changes since the last indexed commit. Default: false.
repo_id | number | No | Existing Scrubby repository ID. If omitted, creates a new repository or finds an existing one by name.
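
As a rough illustration of how these parameters fit together, the sketch below builds an argument payload and enforces the rule that repo_path is needed when repo_id is absent. The helper name build_index_args and the owner/repo pattern are assumptions made for this example; they are not part of Scrubby's documented interface.

```python
import re
from typing import Optional

# Assumed pattern for "owner/repo"; Scrubby's real validation rules are not documented here.
_REPO_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def build_index_args(
    repo_path: Optional[str] = None,
    repo_id: Optional[int] = None,
    incremental: bool = False,
) -> dict:
    """Build an argument dict for scrubby_index (hypothetical helper)."""
    # repo_path is required only when no repo_id is given.
    if repo_path is None and repo_id is None:
        raise ValueError("provide repo_path or repo_id")
    if repo_path is not None and not _REPO_RE.match(repo_path):
        raise ValueError(f"repo_path must look like owner/repo, got {repo_path!r}")

    args = {"incremental": incremental}
    if repo_path is not None:
        args["repo_path"] = repo_path
    if repo_id is not None:
        args["repo_id"] = repo_id
    return args
```

Calling build_index_args(repo_path="ScrubbyAI/api") yields a full-index request, while build_index_args(repo_id=42, incremental=True) describes an incremental refresh of an existing repository (the ID 42 is a placeholder).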

Behavior

When called without an existing index, Scrubby:

  1. Scans the file tree, hashing every file and parsing imports.
  2. Sends directory structure, file metadata, and a sample of file contents to the domain classifier.
  3. Builds the domain graph and computes cross-domain connection weights.
  4. Activates relevant global domains (Ruby, React, Testing, etc.) based on dependency files.
  5. Ingests recent git history (no author data, no PII).
  6. Saves a snapshot recording the head SHA, file count, and whether reclassification was triggered.
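
Step 1 can be pictured with a minimal sketch like the one below. It walks a directory, hashes each file, and pulls import names out of Python files with a simple regular expression. The choice of SHA-256, the regex, the .git exclusion, and the scan_tree name are all assumptions for illustration; Scrubby's actual scanner is not described in this page.

```python
import hashlib
import os
import re

# Simplified matcher for Python import lines; a real parser would handle more cases.
IMPORT_RE = re.compile(r"^\s*(?:import|from)\s+([\w.]+)", re.MULTILINE)


def scan_tree(root: str) -> dict:
    """Return {relative_path: {"sha256": ..., "imports": [...]}} for files under root."""
    index = {}
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip git internals (an assumption; the real exclusion rules are unknown).
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            with open(path, "rb") as fh:
                data = fh.read()
            rel = os.path.relpath(path, root).replace(os.sep, "/")
            entry = {"sha256": hashlib.sha256(data).hexdigest(), "imports": []}
            if name.endswith(".py"):
                text = data.decode("utf-8", errors="replace")
                entry["imports"] = sorted(set(IMPORT_RE.findall(text)))
            index[rel] = entry
    return index
```

Keeping a per-file content hash is what makes a later incremental pass cheap: only paths whose hash differs from the stored one need to be looked at again.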

When called with incremental: true, Scrubby re-processes only files whose content hash has changed since the last indexed SHA. Domain reclassification is skipped unless significant changes are detected (e.g. dependency files changed, or more than 20 new files were added).
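
The incremental rule can be sketched as a comparison of two path-to-hash maps. The function below is a hypothetical illustration: the list of dependency file names is an assumption, and only the "more than 20 new files" threshold comes from the text above.

```python
# Assumed examples of dependency files; the real list is not documented here.
DEPENDENCY_FILES = {"Gemfile", "package.json", "requirements.txt"}
NEW_FILE_THRESHOLD = 20  # "more than 20 new files" per the description above


def plan_incremental(previous: dict, current: dict) -> dict:
    """Compare {path: content_hash} maps from the last index and the current tree."""
    # Files that are new or whose hash differs need re-processing.
    changed = sorted(p for p, h in current.items() if previous.get(p) != h)
    added = [p for p in changed if p not in previous]
    removed = sorted(p for p in previous if p not in current)

    # Reclassify if any dependency file was touched or many files were added.
    deps_touched = any(
        p.rsplit("/", 1)[-1] in DEPENDENCY_FILES for p in changed + removed
    )
    reclassify = deps_touched or len(added) > NEW_FILE_THRESHOLD
    return {"reprocess": changed, "removed": removed, "reclassify": reclassify}
```

With this shape, a small edit to one source file yields a one-item reprocess list and no reclassification, while a change to package.json flips reclassify to true.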

Typical usage

In your AI editor:

"Index this repo with Scrubby. It's owner/repo-name."

For an incremental refresh after large changes:

"Re-index incrementally with Scrubby."

When to re-index

Scrubby tracks new commits incrementally on its own once the first index is built. You should explicitly trigger a re-index when:

  • The repository has been heavily restructured (large directory moves, language additions).
  • You want to force convention re-extraction after a refactor sprint.
  • The dashboard shows the index is stale (a warning that appears after a long period of inactivity).

Errors

Code | Meaning
repo_not_found | Either the GitHub App isn’t installed on the repo, or the repo name in repo_path is wrong.
not_authenticated | OAuth session expired. Reconnect from your editor.
index_in_progress | A previous index is still running. Wait a minute and retry.
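
A client that calls scrubby_index programmatically might treat these codes as in the sketch below: retry on index_in_progress after a pause, and surface everything else immediately. The ScrubbyError class, the call_index callable, and the retry counts are assumptions for this example; the only documented guidance is to wait about a minute before retrying.

```python
import time
from typing import Callable


class ScrubbyError(Exception):
    """Hypothetical error type carrying one of the codes listed above."""

    def __init__(self, code: str):
        super().__init__(code)
        self.code = code


def index_with_retry(
    call_index: Callable[[dict], dict],
    args: dict,
    attempts: int = 5,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call the index tool, retrying only while a previous index is running."""
    for attempt in range(1, attempts + 1):
        try:
            return call_index(args)
        except ScrubbyError as err:
            # repo_not_found and not_authenticated need user action, so re-raise.
            if err.code != "index_in_progress" or attempt == attempts:
                raise
            sleep(delay_seconds)
    raise RuntimeError("attempts must be at least 1")
```

Passing sleep as a parameter keeps the helper easy to test without real delays.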
