Mismatch Auditor

Mismatch Auditor is a security plugin I built for Claude Code (Anthropic's official CLI assistant). It intercepts every bash command an AI agent wants to run, sends it to an external LLM along with the agent's stated intent, and blocks execution when the command diverges from what the agent claimed it would do.

Example

●Agent says: "I'll read package.json"
●Agent runs: rm -rf node_modules
●Plugin: BLOCKED (score = 0.92)

How it works

●Hooks into Claude Code's PreToolUse event — fires before every shell command
●Parses transcript.jsonl to extract the agent's reasoning
●Three-tier filter: static denylist (sudo, curl | bash, rm -rf /) → allowlist (ls, git status, pwd) → SHA-256 LRU cache with TTL → LLM mismatch score
●Blocks when score exceeds configurable threshold

Technical highlights

●Zero production dependencies — pure Node.js built-ins (fs, crypto, os, global fetch)
●Multi-provider LLM layer with priority rotation and file-based token-bucket rate limiting (Groq, OpenRouter, any OpenAI-compatible endpoint — Ollama, LM Studio, vLLM)
●Prompt injection defense: system/user role separation, block markers, explicit instructions to ignore embedded commands
●Fail-open strategy — on any internal failure (API down, rate limit, parse error) the command is allowed. The plugin must never break the user's workflow.
●Three-tier performance: allowlist (instant), cache (<20ms), LLM (100–400ms). Most commands never reach the LLM.
●54 unit + integration tests via node:test, 100% pass rate

Distributed as a self-hosted Claude Code plugin marketplace on GitHub — install via /plugin marketplace add + /plugin install, no cloning needed. 1,200 lines of pure ESM JS. MIT license.

Node.jsJavaScript (ESM)Claude Code Plugin APIGroqOpenRouternode:test

GitHub