Cartomancer — Documentation

Quick Start

cargo install cartomancer
export GITHUB_TOKEN=ghp_...

# Review a PR (dry run — prints to stdout, doesn't post)
cartomancer review owner/repo 42 --dry-run

# Review and post comments to GitHub
cartomancer review owner/repo 42

# Local scan (no GitHub, no PR)
cartomancer scan ./src

Prerequisites

Opengrep must be installed and in your PATH.
GITHUB_TOKEN environment variable (for review and serve commands)
Ollama (optional) for local LLM deepening, or ANTHROPIC_API_KEY for Claude

Run cartomancer doctor to verify your setup.

Your First Review

# 1. Install
cargo install cartomancer

# 2. Set your GitHub token
export GITHUB_TOKEN=ghp_your_token_here

# 3. Dry run against a real PR
cartomancer review your-org/your-repo 123 --dry-run

# 4. Check the output — severity, blast radius, escalations
# 5. When satisfied, run without --dry-run to post to GitHub
cartomancer review your-org/your-repo 123

By default, a review clones the repo to a temp directory, runs opengrep, enriches findings with cartog, and posts categorized comments (with --dry-run, the result is printed to stdout instead). Use --work-dir ./repo to reuse an existing checkout.

`cartomancer init [--force]`

Scaffold a commented .cartomancer.toml at the configured config path (defaults to the current directory). Fails if the file already exists; pass --force to overwrite.

cartomancer init                          # writes .cartomancer.toml
cartomancer init --force                   # overwrite existing config

After init, run cartomancer doctor to verify that git, opengrep, and cartog are installed and that the configured LLM provider is reachable.

`cartomancer scan <path>`

Run opengrep on a local directory. No GitHub, no PR — useful for local development or CI pipelines.

cartomancer scan .                       # scan current directory
cartomancer --json scan src/               # machine-readable output

Findings are enriched with cartog blast radius and persisted to the local store. The text output starts with Scan id: N so you can pipe it into dismiss or findings. The --json envelope is:

{
  "scan_id": 42,
  "findings": [ /* ... */ ],
  "summary": { "total": 3, "critical": 1, "error": 1, "warning": 1, "info": 0 }
}
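
In CI, the envelope can be consumed by a small script. The Python sketch below relies only on the fields shown above (scan_id, findings, summary). The helper name should_fail and the policy of blocking on critical or error findings are assumptions for illustration, not part of Cartomancer.

```python
import json

# Sample envelope using the documented fields; real input would come from
# `cartomancer --json scan <path>`.
SAMPLE = (
    '{"scan_id": 42, "findings": [], '
    '"summary": {"total": 3, "critical": 1, "error": 1, "warning": 1, "info": 0}}'
)


def should_fail(envelope: dict, blocking=("critical", "error")) -> bool:
    """Return True when any blocking severity has a non-zero count."""
    summary = envelope.get("summary", {})
    return any(summary.get(level, 0) > 0 for level in blocking)


envelope = json.loads(SAMPLE)
print(f"scan {envelope['scan_id']}: fail={should_fail(envelope)}")
```

In a pipeline, the same function could read the envelope from standard input and exit non-zero when it returns True.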

`cartomancer review <owner/repo> <pr>`

The main command. Runs the full 11-stage pipeline against a GitHub PR.

cartomancer review acme/api 42                    # review and post
cartomancer review acme/api 42 --dry-run           # preview without posting
cartomancer review acme/api 42 --work-dir ./repo   # reuse checkout
cartomancer review acme/api 42 --resume abc123     # resume from last stage
cartomancer --json review acme/api 42              # JSON output

Flags:

--dry-run — output ReviewResult to stdout, skip posting to GitHub
--work-dir <path> — use an existing directory instead of cloning to temp
--resume <scan-id> — resume a previously failed scan from its last completed stage

`cartomancer history`

Browse past scan results stored in the local database. When --json is set, empty results emit [] (valid JSON for shell pipelines); the same applies to findings and dismissed.

cartomancer history                       # all scans
cartomancer history --branch main          # filter by branch
cartomancer --json history                 # JSON output

`cartomancer findings`

Search and filter findings across scans.

cartomancer findings                                # latest scan
cartomancer findings abc123                          # specific scan
cartomancer findings --rule sql-injection            # by rule pattern
cartomancer findings --severity critical             # by severity
cartomancer findings --file "src/api/*"              # by file pattern
cartomancer findings --branch feature/auth           # by branch

`cartomancer dismiss <scan-id> <index>`

Dismiss a finding as a false positive. Dismissed findings are suppressed by fingerprint in future scans.

cartomancer dismiss abc123 2 --reason "intentional pattern"

Dismissals are fingerprint-based (SHA-256 of rule + file + snippet). If the code changes, the fingerprint changes and the finding reappears.

`cartomancer dismissed`

List all active dismissals.

cartomancer dismissed --json

`cartomancer undismiss <dismissal-id>`

Remove a dismissal, allowing the finding to appear again.

cartomancer undismiss d-abc123

`cartomancer serve`

Start a webhook server that receives GitHub pull_request events and runs the review pipeline automatically.

cartomancer serve                         # default port 3000
cartomancer serve --port 8080             # custom port

See the Webhook Server Setup section for GitHub configuration.

`cartomancer doctor`

Validate that all dependencies and configuration are correct. Exit code is non-zero when any check reports an error.

cartomancer doctor                        # human-readable
cartomancer doctor --json                 # structured output

Checks (severity level in parentheses; only error-level checks affect the exit code):

config (error) — AppConfig::validate() passes
git (error) — git is in PATH (required for review / serve)
opengrep (error) — responds to --version within 10s
custom-rules (warn) — opengrep.rules_dir contains at least one rule file
knowledge (warn) — knowledge file is readable and within knowledge.max_chars
cartog (warn) — cartog --version succeeds
cartog-db (warn) — the file at severity.cartog_db_path exists (run cartog index . if missing)
github-token (warn) — GITHUB_TOKEN env var or github.token is set
llm-provider (warn) — configured provider responds to a health check
storage (error) — SQLite store at storage.db_path can be opened

Configuration

Cartomancer works with zero configuration. Optionally, place a .cartomancer.toml at your project root:

# .cartomancer.toml — full reference

[github]
token_env = "GITHUB_TOKEN"           # env var name (default)
webhook_secret = ""                  # for serve command HMAC validation

[opengrep]
rules_dir = ".cartomancer/rules"     # custom YAML rules directory
exclude_patterns = ["test/**", "vendor/**"]
enclosing_context = true             # include function body in LLM prompt
taint_intrafile = true               # cross-function taint analysis
dynamic_timeout = 30                 # per-file timeout in seconds

[llm]
provider = "ollama"                  # "ollama" or "anthropic"
deepening_threshold = "error"        # minimum severity for LLM

[llm.ollama]
base_url = "http://localhost:11434"
model = "llama3.2"

[llm.anthropic]
model = "claude-sonnet-4-20250514"
max_tokens = 4096                    # valid range: 1..=128000

[graph]
blast_radius_threshold = 5           # escalation threshold

[knowledge]
file = ".cartomancer/knowledge.md"   # company context for LLM
system_prompt = "You are reviewing a fintech codebase."
max_chars = 8000                     # truncation limit

[storage]
db_path = ".cartomancer.db"          # SQLite database path

[serve]
max_concurrent_reviews = 4

Opengrep

Cartomancer invokes opengrep as a subprocess. All opengrep-specific features are opt-in:

--taint-intrafile — cross-function taint tracking within a file
--output-enclosing-context — include enclosing function/class body
--dynamic-timeout N — per-file analysis timeout
--baseline-commit <sha> — only report findings new since the base branch
--exclude — skip files matching patterns

Custom rules in .cartomancer/rules/ are auto-discovered and passed to opengrep alongside built-in rules.

LLM Providers

Provider	Config	Use case
Ollama	`[llm] provider = "ollama"`	Local development, air-gapped, free
Anthropic	`[llm] provider = "anthropic"`	Production, higher quality analysis. Requires `ANTHROPIC_API_KEY`

LLM deepening is conditional: it runs only when the finding's severity ≥ deepening_threshold AND its blast radius > 3, or when always_deepen = true is set on the rule.

Knowledge Base

Place a markdown file at .cartomancer/knowledge.md (or configure the path). Its content is injected into every LLM prompt as a ## Company Context section.

# .cartomancer/knowledge.md
## Architecture
- Monolith Rails app migrating to Rust microservices
- Auth service handles OAuth2 + SAML
- Payment service wraps Stripe API

## Security Policies
- All user input must be sanitized at controller level
- SQL queries must use parameterized statements
- PII must not appear in logs

Security: the path is validated against directory traversal, binary files are rejected, and the content is truncated at max_chars.
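
The three safeguards can be illustrated with a short Python sketch. It is not Cartomancer's implementation: the function name load_knowledge, the NUL-byte test for binary content, and the project-root containment check are assumptions chosen for the example.

```python
from pathlib import Path


def load_knowledge(project_root: str, relative_path: str, max_chars: int = 8000) -> str:
    """Read a knowledge file with traversal, binary, and size safeguards."""
    root = Path(project_root).resolve()
    target = (root / relative_path).resolve()
    # Reject paths that resolve outside the project root (e.g. "../secrets").
    if not target.is_relative_to(root):
        raise ValueError("knowledge path escapes the project root")
    raw = target.read_bytes()
    # Assumed heuristic: a NUL byte marks the file as binary.
    if b"\x00" in raw:
        raise ValueError("knowledge file looks binary")
    # Truncate to the configured character budget.
    return raw.decode("utf-8", errors="replace")[:max_chars]
```

Path.is_relative_to requires Python 3.9 or later.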

Per-Rule Overrides

Override severity bounds and LLM behavior per rule ID:

[knowledge.rules.sql-injection]
min_severity = "error"              # floor before escalation
max_severity = "critical"           # ceiling after escalation
always_deepen = true                # always run LLM, skip gates

[knowledge.rules.unused-import]
max_severity = "info"               # cap at info, never escalate
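
One way to read the floor and ceiling comments above is sketched below in Python. The function resolve_severity and the escalate callback are hypothetical; the sketch only assumes the ordering Info < Warning < Error < Critical, and that the floor is applied before escalation and the ceiling after it.

```python
SEVERITY_ORDER = ["info", "warning", "error", "critical"]


def resolve_severity(base, escalate, min_severity=None, max_severity=None):
    """Apply the floor, run escalation, then apply the ceiling."""
    rank = SEVERITY_ORDER.index(base)
    if min_severity is not None:
        # Floor before escalation.
        rank = max(rank, SEVERITY_ORDER.index(min_severity))
    # `escalate` stands in for the graph-based escalation step.
    rank = SEVERITY_ORDER.index(escalate(SEVERITY_ORDER[rank]))
    if max_severity is not None:
        # Ceiling after escalation.
        rank = min(rank, SEVERITY_ORDER.index(max_severity))
    return SEVERITY_ORDER[rank]
```

With the unused-import override above, a finding stays at info even if escalation would have raised it to critical.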

Storage

All scans and findings are persisted to a SQLite database (.cartomancer.db by default). The schema has 3 tables:

scans — scan metadata, pipeline stage, error messages
findings — per-finding data with fingerprint, severity, suggested fix, agent prompt
dismissals — false positive suppression by fingerprint

Persistence is best-effort: if the database is unavailable, the pipeline continues and logs a warning.

Pipeline Stages

1. Resolve GitHub token (env or config)
2. Fetch PR metadata (head SHA, base SHA)
3. Prepare work dir (clone or reuse --work-dir)
4. Fetch + parse unified diff
5. Opengrep scan (--baseline-commit, --exclude, custom rules)
6. Enrich with cartog (impact, refs, callers, domain detection)
7. Escalate severity (blast radius + domain + per-rule overrides)
8. LLM deepen (conditional: severity + blast_radius gates)
9. Regression check (fingerprint comparison with base branch)
10. Dismiss filter (remove dismissed findings)
11. Persist scan + Post review (or --dry-run output)

Stages 4-7 persist findings to the store after each stage for resumability. Use --resume <scan-id> to restart from the last completed stage.
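
Conceptually, resuming means skipping everything up to the last recorded stage. The Python sketch below uses the PipelineStage order listed under Key Types; the helper remaining_stages is hypothetical and does not show how Cartomancer stores or replays stages internally.

```python
# Stage order as listed for PipelineStage under Key Types.
STAGES = [
    "Pending", "Prepared", "Scanned", "Enriched",
    "Escalated", "Deepened", "Completed",
]


def remaining_stages(last_completed: str) -> list:
    """Return the stages that still need to run after `last_completed`."""
    return STAGES[STAGES.index(last_completed) + 1:]


print(remaining_stages("Enriched"))
```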

Escalation Matrix

Base Severity	Blast Radius	Domain	Callers	Final Severity
any	≥ threshold×4	any	any	Critical
any	≥ threshold	any	any	Error (minimum)
any	any	auth	any	Critical
any	any	payment	any	Critical
any	any	any	≥ 10	Error (minimum)
any	< threshold	none	< 10	unchanged

Default blast_radius_threshold = 5 (configurable).
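
The matrix can be read as a small decision function. The Python sketch below assumes that when several rows match, the highest resulting severity wins, and that "Error (minimum)" never lowers a finding that is already Critical. The function name escalate and its parameters are illustrative.

```python
SEVERITY_ORDER = ["info", "warning", "error", "critical"]
CRITICAL_DOMAINS = {"auth", "payment"}


def escalate(base, blast_radius, domains, callers, threshold=5):
    """Return the final severity for one finding per the matrix above."""
    # Rows that force Critical: very large blast radius or a sensitive domain.
    if blast_radius >= threshold * 4 or CRITICAL_DOMAINS & set(domains):
        return "critical"
    # Rows that raise the severity to at least Error.
    if blast_radius >= threshold or callers >= 10:
        return max(base, "error", key=SEVERITY_ORDER.index)
    # No row matched: severity is unchanged.
    return base
```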

LLM Deepening

Triggered when: severity ≥ deepening_threshold AND blast_radius > 3, or always_deepen = true.
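
Expressed as code, the gate looks roughly like the Python sketch below. The function name should_deepen is an assumption; the constant 3 and the severity ordering come from the rule stated above.

```python
SEVERITY_ORDER = ["info", "warning", "error", "critical"]


def should_deepen(severity, blast_radius, deepening_threshold="error", always_deepen=False):
    """Decide whether a finding is sent to the LLM."""
    if always_deepen:
        # The per-rule override skips both gates.
        return True
    severe_enough = SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(deepening_threshold)
    return severe_enough and blast_radius > 3
```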

The LLM prompt includes:

Finding details (rule, message, severity, file, code snippet)
Enclosing function/class body (truncated to 2000 chars)
Structural context from cartog (symbol name, blast radius, callers, domain tags)
Company knowledge from .cartomancer/knowledge.md
Custom system prompt from config

Response parsing extracts:

Analysis — 2-3 sentence explanation of real-world impact
Suggested fix — unified diff extracted from ```diff fence
Agent prompt — self-contained prompt for AI agent remediation
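
Extracting the suggested fix amounts to finding the first diff fence in the response text. The Python sketch below shows one way to do that with a regular expression; the function name and the exact response layout are assumptions, since the real prompt and response format are not documented here.

```python
import re

FENCE = "`" * 3  # three backticks, built at runtime

# Match the body of the first fenced block tagged "diff".
DIFF_BLOCK = re.compile(
    re.escape(FENCE) + r"diff\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_suggested_fix(response: str):
    """Return the unified diff inside the first diff fence, or None."""
    match = DIFF_BLOCK.search(response)
    return match.group(1).strip("\n") if match else None
```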

Regression Detection

Each finding is fingerprinted with SHA-256 of rule_id:file_path:snippet_content. During review, fingerprints are compared against the latest scan on the base branch:

New finding (is_new: true) — not in base branch baseline
Existing finding (is_new: false) — already present before this PR

Line numbers are excluded from the fingerprint because they shift with unrelated edits.
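
A minimal Python sketch of the fingerprint and the new-versus-existing comparison follows. The colon-joined layout is taken from the description above; the UTF-8 encoding, the hex digest, the absence of snippet normalization, and the dictionary field names are assumptions.

```python
import hashlib


def fingerprint(rule_id: str, file_path: str, snippet: str) -> str:
    """SHA-256 over rule_id:file_path:snippet_content (line numbers excluded)."""
    payload = f"{rule_id}:{file_path}:{snippet}".encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def mark_new(findings, baseline_fingerprints):
    """Tag each finding with is_new by comparing against the base branch."""
    tagged = []
    for finding in findings:
        fp = fingerprint(finding["rule_id"], finding["file_path"], finding["snippet"])
        tagged.append({**finding, "is_new": fp not in baseline_fingerprints})
    return tagged
```

Because the snippet is part of the hash, editing the flagged code produces a different fingerprint, which is also why a dismissed finding reappears after the code changes.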

Comment Formatting

Cartomancer posts three types of GitHub comments:

Inline comments — on diff lines. Severity badge, category (Actionable/Nitpick), blast radius, LLM analysis, collapsible fix + agent prompt
Off-diff comments — for findings outside the visible diff. Wrapped in a [!CAUTION] banner
Summary comment — actionable count, severity breakdown, top escalated findings, scan metadata

Classification: a finding is Actionable if it has a suggested fix or its severity is Error or higher. Otherwise it is a Nitpick.
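
The classification rule is small enough to state as a function. This Python sketch uses a hypothetical classify helper and treats an empty or missing suggested fix as "no fix".

```python
SEVERITY_ORDER = ["info", "warning", "error", "critical"]


def classify(severity: str, suggested_fix=None) -> str:
    """Return "Actionable" or "Nitpick" for a finding."""
    at_least_error = SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index("error")
    return "Actionable" if suggested_fix or at_least_error else "Nitpick"
```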

Webhook Server Setup

cartomancer serve --port 3000

Endpoints:

POST /webhook — receives GitHub pull_request events
GET /health — health check (returns 200)

GitHub Configuration

In your GitHub repository settings:

Payload URL: https://your-server:3000/webhook
Content type: application/json
Secret: set in .cartomancer.toml as github.webhook_secret
Events: select Pull requests
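
When a secret is configured, GitHub signs each delivery and sends the result in the X-Hub-Signature-256 header as sha256= followed by the hex HMAC-SHA256 of the raw request body. The Python sketch below shows the check a receiver typically performs; the function name is illustrative and this is not Cartomancer's own code.

```python
import hashlib
import hmac


def verify_signature(secret: str, body: bytes, signature_header: str) -> bool:
    """Validate a GitHub webhook delivery against the shared secret."""
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    expected = "sha256=" + digest
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(expected, signature_header)
```

The signature must be computed over the raw bytes of the request, before any JSON parsing or re-serialization.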

Concurrency

Concurrent reviews are bounded by serve.max_concurrent_reviews (default: 4) via a tokio Semaphore. Each review runs in a background task with its own temp dir, GitHub client, and store connection. The server shuts down gracefully on SIGTERM/SIGINT.
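
The bounding pattern can be illustrated with Python's asyncio.Semaphore, which plays the same role as the tokio Semaphore here. This is an analogue for explanation only: run_reviews and the sleep that stands in for the review pipeline are hypothetical.

```python
import asyncio


async def run_reviews(pr_numbers, max_concurrent=4):
    """Run one task per PR while capping how many execute at once."""
    semaphore = asyncio.Semaphore(max_concurrent)
    active = 0
    peak = 0

    async def review(pr):
        nonlocal active, peak
        async with semaphore:  # waits here when the cap is reached
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.01)  # stand-in for the real review work
            active -= 1
        return pr

    results = await asyncio.gather(*(review(pr) for pr in pr_numbers))
    return results, peak
```

The returned peak value makes it easy to confirm that no more than max_concurrent reviews ran at the same time.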

Architecture

cartomancer-server (binary)
  |-- cartomancer-core     pure domain types (Finding, Severity, config)
  |-- cartomancer-graph    cartog enricher + severity escalator
  |-- cartomancer-github   GitHub API client + webhook types
  |-- cartomancer-store    SQLite persistence (scans, findings, dismissals)

Error handling: anyhow throughout the workspace. Per-finding errors are logged and skipped (partial results are better than no results).

Key Types

Type	Crate	Purpose
`Finding`	core	Opengrep finding + graph context + LLM analysis + fix + agent prompt
`GraphContext`	core	Blast radius, callers, domain tags from cartog
`Severity`	core	Info < Warning < Error < Critical
`ReviewResult`	core	Final output posted to GitHub
`PipelineStage`	core	Pending → Prepared → Scanned → Enriched → Escalated → Deepened → Completed / Failed
`AppConfig`	core	Deserialized from `.cartomancer.toml`
`LlmProvider`	server	Async trait: Ollama and Anthropic implementations
`CartogEnricher`	graph	Wraps `cartog::db::Database`; single `enrich_batch_optimized` method deduplicates queries per file and symbol
`SeverityEscalator`	graph	Blast radius + domain → severity upgrade
`Store`	store	SQLite persistence: CRUD, dismissals, baselines
`PrMetadata`	github	PR head/base SHA, refs, title

Environment Variables

Variable	Required	Purpose
`GITHUB_TOKEN`	for review/serve	GitHub API authentication
`ANTHROPIC_API_KEY`	for Anthropic LLM	Claude API authentication
`RUST_LOG`	no	Log level (e.g. `cartomancer=debug`)