AGENTS.md

What this is

auto-reverse is a conversational CLI where an LLM agent and a human collaborate to reverse-engineer a website's API. The agent drives a headed browser; an embedded mitmproxy captures all traffic; endpoints are documented into OpenAPI + markdown in real time.

The critical design point: the agent and user work together in a loop. The agent doesn't run autonomously — it pursues intent, reports findings, and asks the user what to do next. The user steers, redirects, or takes over the browser at any time.

Agent-user interaction model

The REPL loop in repl.py is the interface. Lines that do not start with / go to the Claude agent (agent.py); lines that start with / are REPL commands handled locally. The agent has three tool groups:

browser_* (navigate, click, type, snapshot) — act on the page, return compact snapshots (url + title + up to 40 interactive elements, not raw HTML)
flows_search — query the FlowStore for captured endpoints
doc_document — enrich an endpoint with summary/description/tag, triggers spec rewrite
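To make the "compact snapshot" idea concrete, here is a minimal sketch of the shape such a snapshot might take. The `Snapshot` class and `compact_snapshot` function are hypothetical names for illustration, not the project's actual code; only the 40-element cap comes from the text above.

```python
from dataclasses import dataclass, field

MAX_ELEMENTS = 40  # cap quoted in this document


@dataclass
class Snapshot:
    """Hypothetical compact page snapshot: no raw HTML, only a short summary."""
    url: str
    title: str
    elements: list = field(default_factory=list)


def compact_snapshot(url, title, elements, limit=MAX_ELEMENTS):
    """Keep the url, the title and at most `limit` interactive elements."""
    return Snapshot(url=url, title=title, elements=list(elements[:limit]))
```

Truncating the element list keeps each tool result small, which matters when a single turn can make many browser calls.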

Each agent turn runs up to 25 tool iterations (MAX_ITERATIONS in agent.py). The agent should: pursue intent → inspect traffic → enrich endpoints → summarize → ask user what next.
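The bounded turn described above can be sketched as a plain loop. This is an illustration under stated assumptions, not the code in agent.py: `run_turn` and `next_step` are hypothetical names, and `next_step` stands in for the model call so the sketch runs without any API client.

```python
MAX_ITERATIONS = 25  # value quoted above; the real constant lives in agent.py


def run_turn(next_step, tools, max_iterations=MAX_ITERATIONS):
    """Run one agent turn with a hard cap on tool iterations.

    next_step(history) is a hypothetical stand-in for the model call. It returns
    ("tool", name, args) to request a tool, or ("final", text) to end the turn.
    """
    history = []
    for _ in range(max_iterations):
        step = next_step(history)
        if step[0] == "final":
            return step[1], history
        _, name, args = step
        result = tools[name](**args)  # dispatch to the registered tool
        history.append((name, args, result))
    # The cap was hit before the model produced a final answer.
    return "iteration limit reached", history
```

The cap guarantees that control returns to the user even if the model keeps requesting tools.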

The system prompt is in cli.py:27-34. It's intentionally brief — the tools shape behavior more than the prompt.

Commands

uv sync                                  # install deps
uv run playwright install chromium       # one-time browser install
uv run pytest -q                         # run all tests
uv run pytest tests/test_store.py -q     # run one test file
uv run pytest -k test_name -q            # run one test
uv run auto-reverse https://example.com  # run the tool

Lint/typecheck (from nix shell or install separately):

ruff check src/
pyright src/

Architecture (read before editing)

One Python process plus a Node subprocess, four concurrent roles:

Thread or process	Role	Key files
Main	REPL + Claude tool-use loop	`repl.py`, `agent.py`
Main (startup)	CLI arg parsing, wiring	`cli.py`, `config.py`
mitmproxy thread	Embedded DumpMaster, capture addon	`proxy.py`
Doc Worker thread(s)	Schema inference + LLM enrichment on new signatures	`doc/engine.py`, `doc/schema.py`
Node subprocess	Playwright browser	`browser.py`

Data flow: user intent → agent acts via browser → proxy captures → FlowStore deduplicates by signature → DocWorker writes openapi.yaml + API.md → agent reads flows and reports back.

Signature dedup: (method, host, path_template, status_class). Path templating collapses numeric IDs, UUIDs, and hex tokens to {id}. A flow whose signature is already known triggers no new LLM call; its sample is merged into the existing entry. LLM cost therefore scales with the number of distinct endpoints, not with request volume.
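A minimal sketch of signature-based dedup follows. It is not the code in store.py: the regexes, the 16-character minimum for a "hex token", the "2xx"-style status class, and the `TinyStore` class are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import NamedTuple

NUMERIC_RE = re.compile(r"\d+")
UUID_RE = re.compile(
    r"[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-[0-9a-fA-F]{12}"
)
HEX_RE = re.compile(r"[0-9a-fA-F]{16,}")  # assumed minimum length for a hex token


class Signature(NamedTuple):
    method: str
    host: str
    path_template: str
    status_class: str


def template_path(path):
    """Replace ID-like path segments with {id}; expects a path without a query string."""
    out = []
    for seg in path.split("/"):
        is_id = any(rx.fullmatch(seg) for rx in (NUMERIC_RE, UUID_RE, HEX_RE))
        out.append("{id}" if is_id else seg)
    return "/".join(out)


def signature(method, host, path, status):
    """Build the dedup key; status 204 and 200 both map to the assumed class '2xx'."""
    return Signature(method.upper(), host.lower(), template_path(path), f"{status // 100}xx")


class TinyStore:
    """Hypothetical stand-in for FlowStore: groups samples by signature."""

    def __init__(self):
        self.samples = defaultdict(list)

    def add(self, method, host, path, status, body):
        sig = signature(method, host, path, status)
        is_new = sig not in self.samples
        self.samples[sig].append(body)
        return is_new  # only a new signature would need LLM enrichment
```

With this shape, a thousand requests to /users/1 through /users/1000 produce one signature and, under the rule above, a single enrichment call.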

Scope filter (store.py): only same-host XHR/fetch requests are in scope. Static assets and analytics hosts (google-analytics, segment, stripe-js, sentry, doubleclick) are dropped by default. Every flow, including dropped ones, is still written to the raw archive (archive.log).
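The scope rule can be sketched as a single predicate. This is an assumption-laden illustration rather than the real ScopeFilter: `in_scope`, the `resource_type` parameter, and the static suffix list are hypothetical, and only the analytics host names come from the text above.

```python
from urllib.parse import urlsplit

# Host fragments named in this document.
ANALYTICS_MARKERS = ("google-analytics", "segment", "stripe-js", "sentry", "doubleclick")
# Assumed list of static asset suffixes; the real list is not given here.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".svg", ".woff2", ".ico")


def in_scope(url, target_host, resource_type):
    """Return True when a captured request should reach the documented flow set.

    resource_type is a hypothetical label such as "xhr", "fetch" or "script".
    """
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if any(marker in host for marker in ANALYTICS_MARKERS):
        return False  # analytics hosts are dropped by default
    if host != target_host.lower():
        return False  # same-host only
    if resource_type not in ("xhr", "fetch"):
        return False  # page loads, scripts, images and so on
    if parts.path.lower().endswith(STATIC_SUFFIXES):
        return False  # static assets fetched programmatically
    return True
```

A caller would archive every flow first and apply this check only to decide what gets deduplicated and documented.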

Testing quirks

tests/fixture_site.py — stdlib ThreadingHTTPServer with JSON endpoints. Used by conftest.py as fixture_site fixture. No external deps.
Browser tests (test_browser.py, test_e2e_smoke.py) skip if Playwright/Chromium isn't installed. This is expected in CI without browsers.
Agent tests (test_agent.py) use a mocked Anthropic client with scripted responses. Never call the real API in tests.
test_proxy.py tests flow_from_mitm with SimpleNamespace doubles — no real proxy needed.
test_e2e_smoke.py spins up the fixture site, a real proxy, and a real browser. It skips if the proxy or the browser is unavailable.
asyncio_mode = "auto" in pytest config — write async def test_* directly, no decorator needed.
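For orientation, a stdlib-only fixture site of the kind described above might look like the sketch below. The route table, handler, and helper names are hypothetical and not copied from tests/fixture_site.py.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class JsonHandler(BaseHTTPRequestHandler):
    # Hypothetical route table; the real fixture's endpoints may differ.
    ROUTES = {"/api/users": [{"id": 1, "name": "ada"}]}

    def do_GET(self):
        payload = self.ROUTES.get(self.path)
        status = 200 if payload is not None else 404
        body = json.dumps(payload if payload is not None else {"error": "not found"}).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_fixture_site():
    """Bind to an ephemeral port and serve from a daemon thread."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), JsonHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"


def fetch_json(url):
    """GET a URL and decode JSON, ignoring any proxy settings in the environment."""
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(url, timeout=5) as resp:
        return json.loads(resp.read())
```

In a conftest.py, a fixture could call `start_fixture_site()`, yield the base URL, and then call `server.shutdown()` and `server.server_close()` on teardown.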

Runtime constraint

Standard CPython 3.14, not free-threaded. mitmproxy's aioquic and mitmproxy-rs dependencies ship only abi3 wheels, which the free-threaded build rejects. The workload is I/O-bound, so giving up free-threading costs little. Don't target 3.14t.
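If a guard against accidentally running on a free-threaded interpreter is wanted, one possible sketch is below. The function names are hypothetical and nothing here is claimed to exist in the project; the check relies on the `Py_GIL_DISABLED` build variable that CPython exposes through `sysconfig`.

```python
import sysconfig


def is_free_threaded_build():
    """True when the interpreter was built with the GIL disabled (for example 3.14t)."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def require_standard_build():
    """Fail early with a readable message instead of a wheel resolution error."""
    if is_free_threaded_build():
        raise RuntimeError(
            "free-threaded CPython detected; use a standard build, "
            "since the abi3 wheels required by mitmproxy will not install"
        )
```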

Nix environment

flake.nix provides a dev shell with uv, ruff, pyright, google-chrome. The shell hook sets:

PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH — points to system Chrome
PLAYWRIGHT_SKIP_VALIDATE_HOST_REQUIREMENTS=true
UV_PYTHON_PREFERENCE=managed

Use direnv allow or nix develop to enter the shell.

Outputs

All outputs go to ./auto-reverse-out/<host>-<timestamp>/:

openapi.yaml — incremental OpenAPI 3.1 spec
API.md — human-readable endpoint docs
archive.log — raw captured traffic (one line per flow)
client/ — optional typed httpx client (with --gen-client)
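The layout above can be sketched with pathlib. The timestamp format is an assumption (the document does not specify one), and `run_dir` and `artifact_paths` are hypothetical helper names.

```python
from datetime import datetime
from pathlib import Path

# File names listed in this document; client/ is optional and omitted here.
ARTIFACTS = ("openapi.yaml", "API.md", "archive.log")


def run_dir(host, when, root=Path("auto-reverse-out")):
    """Build <root>/<host>-<timestamp>; the strftime format is an assumed example."""
    stamp = when.strftime("%Y%m%d-%H%M%S")
    return root / f"{host}-{stamp}"


def artifact_paths(directory):
    """Map each expected artifact name to its path inside one run directory."""
    return {name: directory / name for name in ARTIFACTS}
```

A script that post-processes a run could use `artifact_paths` to check which files exist before reading them.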

REPL commands (local, not sent to LLM)

<text>      state intent in natural language
/flows [q]  list/search discovered endpoints
/spec       show spec path + endpoint count
/help       help
/quit       exit

The design spec describes /take, /stop, /save — these are planned but not yet implemented in repl.py.
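The split between local commands and agent input can be sketched as a small dispatcher. This is a hypothetical illustration, not the code in repl.py: `dispatch`, the `commands` mapping, and the error message are invented for the example.

```python
def dispatch(line, commands, send_to_agent):
    """Route one REPL line: /commands run locally, everything else goes to the agent.

    commands maps a command name (without the slash) to a handler taking the
    argument string. send_to_agent is a callable standing in for the agent turn.
    """
    text = line.strip()
    if not text:
        return None  # ignore blank input
    if text.startswith("/"):
        name, _, arg = text[1:].partition(" ")
        handler = commands.get(name)
        if handler is None:
            return f"unknown command: /{name} (try /help)"
        return handler(arg.strip())
    return send_to_agent(text)
```

With this shape, a planned command such as /take reports as unknown until a handler is registered, and it is never forwarded to the LLM.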

Key files to read first

cli.py — entrypoint, wiring, system prompt, all startup logic
agent.py — Claude tool-use loop (small, read the whole thing)
store.py — FlowStore, ScopeFilter, path templating
tools/__init__.py — tool registry assembly
doc/engine.py — DocEngine, the event-driven doc writer
docs/superpowers/specs/2026-05-31-auto-reverse-design.md — full design spec
docs/superpowers/plans/2026-05-31-auto-reverse.md — implementation plan with task breakdown
