AGENTS.md
What this is
auto-reverse is a conversational CLI where an LLM agent and a human collaborate to reverse-engineer a website's API. The agent drives a headed browser; an embedded mitmproxy captures all traffic; endpoints are documented into OpenAPI + markdown in real time.
The critical design point: the agent and user work together in a loop. The agent doesn't run autonomously — it pursues intent, reports findings, and asks the user what to do next. The user steers, redirects, or takes over the browser at any time.
Agent-user interaction model
The REPL loop in repl.py is the interface. Non-/ lines go to the Claude agent (agent.py). The agent has three tool groups:
- `browser_*` (navigate, click, type, snapshot): act on the page and return compact snapshots (url + title + up to 40 interactive elements, not raw HTML)
- `flows_search`: query the FlowStore for captured endpoints
- `doc_document`: enrich an endpoint with summary/description/tag; triggers a spec rewrite
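To make the "compact snapshot" idea concrete, here is a minimal sketch of capping a page summary at 40 interactive elements. Only the 40-element limit and the url/title/elements shape come from this file; `Element`, `compact_snapshot`, and the output format are hypothetical and may not match what the browser tools actually return.

```python
from dataclasses import dataclass

MAX_ELEMENTS = 40  # limit stated above; the constant name is an assumption


@dataclass
class Element:
    """Hypothetical stand-in for one interactive element on the page."""
    role: str
    name: str


def compact_snapshot(url: str, title: str, elements: list[Element],
                     limit: int = MAX_ELEMENTS) -> str:
    """Render url, title, and at most `limit` elements as short text lines."""
    lines = [f"url: {url}", f"title: {title}"]
    for index, element in enumerate(elements[:limit], start=1):
        lines.append(f"[{index}] {element.role} {element.name!r}")
    omitted = len(elements) - limit
    if omitted > 0:
        # Tell the model that the list was truncated rather than silently dropping items.
        lines.append(f"... {omitted} more elements omitted")
    return "\n".join(lines)
```

Keeping the snapshot small and textual is what lets the agent take many browser steps per turn without filling its context with raw HTML.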
Each agent turn runs up to 25 tool iterations (MAX_ITERATIONS in agent.py). The agent should: pursue intent → inspect traffic → enrich endpoints → summarize → ask user what next.
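The bounded loop described above can be sketched as follows. This is not the code in agent.py: only `MAX_ITERATIONS` and its value of 25 come from this file, while `run_turn`, the `model_step` callable, and the tuple-based step format are assumptions made to keep the example self-contained and free of any real API client.

```python
from typing import Callable

MAX_ITERATIONS = 25  # name and value from this file; the rest is hypothetical


def run_turn(model_step: Callable[[list], tuple],
             tools: dict[str, Callable],
             user_text: str,
             max_iterations: int = MAX_ITERATIONS) -> str:
    """Run one agent turn: call tools until the model answers or the cap is hit.

    `model_step(history)` returns ("tool", name, kwargs) or ("final", text).
    """
    history: list = [("user", user_text)]
    for _ in range(max_iterations):
        step = model_step(history)
        if step[0] == "final":
            return step[1]  # summary plus a question for the user
        _, name, kwargs = step
        result = tools[name](**kwargs)
        history.append(("tool_result", name, result))
    # The cap was reached: hand control back to the user instead of continuing.
    return "Stopped: iteration limit reached; ask the user how to proceed."
```

A scripted `model_step` like the one used in tests can drive this loop without calling a real model, which mirrors how the agent tests avoid the real API.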
The system prompt is in cli.py:27-34. It's intentionally brief — the tools shape behavior more than the prompt.
Commands
uv sync # install deps
uv run playwright install chromium # one-time browser install
uv run pytest -q # run all tests
uv run pytest tests/test_store.py -q # run one test file
uv run pytest -k test_name -q # run one test
uv run auto-reverse https://example.com # run the tool
Lint/typecheck (from nix shell or install separately):
ruff check src/
pyright src/
Architecture (read before editing)
A single Python process plus a Node subprocess for the browser, in four concurrent roles:
| Thread / process | Role | Key files |
|---|---|---|
| Main | REPL + Claude tool-use loop | repl.py, agent.py |
| Main (startup) | CLI arg parsing, wiring | cli.py, config.py |
| mitmproxy thread | Embedded DumpMaster, capture addon | proxy.py |
| Doc Worker thread(s) | Schema inference + LLM enrichment on new signatures | doc/engine.py, doc/schema.py |
| Node subprocess | Playwright browser | browser.py |
Data flow: user intent → agent acts via browser → proxy captures → FlowStore deduplicates by signature → DocWorker writes openapi.yaml + API.md → agent reads flows and reports back.
Signature dedup: (method, host, path_template, status_class). Path templating collapses numeric IDs, UUIDs, hex tokens to {id}. Same signature = no new LLM call, just merges samples. LLM cost scales with distinct endpoints, not request volume.
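A minimal sketch of the signature computation, assuming a simple per-segment rule. The tuple shape and the `{id}` placeholder come from the description above; the function names, the regexes, the 16-character minimum for hex tokens, and the `"2xx"` spelling of the status class are assumptions, and store.py may differ on all of them.

```python
import re
from urllib.parse import urlsplit

_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$", re.I
)
_HEX = re.compile(r"^[0-9a-f]{16,}$", re.I)  # assumed minimum token length


def template_path(path: str) -> str:
    """Replace ID-like path segments with the {id} placeholder."""
    parts = []
    for segment in path.split("/"):
        if _NUMERIC.match(segment) or _UUID.match(segment) or _HEX.match(segment):
            parts.append("{id}")
        else:
            parts.append(segment)
    return "/".join(parts)


def signature(method: str, url: str, status: int) -> tuple[str, str, str, str]:
    """Build (method, host, path_template, status_class) for one captured flow."""
    split = urlsplit(url)
    status_class = f"{status // 100}xx"  # e.g. 200 and 204 both become "2xx"
    return (method.upper(), split.hostname or "", template_path(split.path), status_class)


def count_samples(flows: list[tuple[str, str, int]]) -> dict[tuple, int]:
    """Group flows by signature; each key would cost at most one LLM call."""
    samples: dict[tuple, int] = {}
    for method, url, status in flows:
        key = signature(method, url, status)
        samples[key] = samples.get(key, 0) + 1
    return samples
```

Under these rules, requests for `/users/42/orders` and `/users/7/orders` on the same host share one signature, so the second request only adds a sample.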
Scope filter (store.py): same-host XHR/fetch only. Static assets and analytics hosts (google-analytics, segment, stripe-js, sentry, doubleclick) dropped by default. Everything still goes to raw archive.
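The filtering rule can be sketched like this. The analytics host names are taken from the list above; `in_scope`, `record`, the substring matching, the static-suffix list, and the idea that the caller supplies a `resource_type` string are all assumptions for illustration and are not the ScopeFilter API.

```python
from urllib.parse import urlsplit

# Host markers from the list above; matching by substring is an assumption.
ANALYTICS_MARKERS = ("google-analytics", "segment", "stripe-js", "sentry", "doubleclick")
# Assumed set of static-asset suffixes; the real list is not given in this file.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".svg", ".woff2", ".ico")


def in_scope(url: str, target_host: str, resource_type: str) -> bool:
    """Return True for same-host XHR/fetch traffic that is not a static asset."""
    split = urlsplit(url)
    host = split.hostname or ""
    if any(marker in host for marker in ANALYTICS_MARKERS):
        return False
    if host != target_host:
        return False
    if resource_type not in ("xhr", "fetch"):
        return False
    return not split.path.lower().endswith(STATIC_SUFFIXES)


def record(url: str, target_host: str, resource_type: str,
           archive: list[str], store: list[str]) -> None:
    """Archive every flow, but only pass in-scope flows on to the store."""
    archive.append(url)
    if in_scope(url, target_host, resource_type):
        store.append(url)
```

Writing to the archive before filtering is what keeps dropped traffic recoverable if the default scope turns out to be too narrow.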
Testing quirks
- `tests/fixture_site.py`: stdlib `ThreadingHTTPServer` with JSON endpoints. Used by `conftest.py` as the `fixture_site` fixture. No external deps.
- Browser tests (`test_browser.py`, `test_e2e_smoke.py`) skip if Playwright/Chromium isn't installed. This is expected in CI without browsers.
- Agent tests (`test_agent.py`) use a mocked Anthropic client with scripted responses. Never call the real API in tests.
- `test_proxy.py` tests `flow_from_mitm` with `SimpleNamespace` doubles; no real proxy needed.
- `test_e2e_smoke.py` spins up fixture site + real proxy + real browser. Skips if either is unavailable.
- `asyncio_mode = "auto"` in pytest config: write `async def test_*` directly, no decorator needed.
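For orientation, a stdlib-only fixture site of the kind described in the first bullet might look like the sketch below. It is not a copy of tests/fixture_site.py: the `/api/users` endpoint, its payload, and the names `JsonHandler`, `start_fixture_site`, and `get_json` are hypothetical.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import urlopen


class JsonHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/api/users":  # hypothetical endpoint
            body = json.dumps([{"id": 1, "name": "ada"}]).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep test output quiet


def start_fixture_site():
    """Start the server on a free port in a daemon thread; return (server, base_url)."""
    server = ThreadingHTTPServer(("127.0.0.1", 0), JsonHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address[:2]
    return server, f"http://{host}:{port}"


def get_json(url: str):
    """Fetch a URL and return (status, decoded JSON body)."""
    with urlopen(url) as response:
        return response.status, json.loads(response.read())
```

A pytest fixture could wrap `start_fixture_site()`, yield the base URL, and call `server.shutdown()` and `server.server_close()` on teardown. Binding to port 0 lets the OS pick a free port, which avoids collisions when tests run in parallel.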
Runtime constraint
Standard CPython 3.14, not free-threaded. mitmproxy's aioquic and mitmproxy-rs deps only ship abi3 wheels that the free-threaded build rejects. The workload is I/O-bound anyway, so this doesn't matter. Don't target 3.14t.
Nix environment
flake.nix provides a dev shell with uv, ruff, pyright, google-chrome. The shell hook sets:
- `PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH`: points to system Chrome
- `PLAYWRIGHT_SKIP_VALIDATE_HOST_REQUIREMENTS=true`
- `UV_PYTHON_PREFERENCE=managed`
Use direnv allow or nix develop to enter the shell.
Outputs
All outputs go to ./auto-reverse-out/<host>-<timestamp>/:
- `openapi.yaml`: incremental OpenAPI 3.1 spec
- `API.md`: human-readable endpoint docs
- `archive.log`: raw captured traffic (one line per flow)
- `client/`: optional typed httpx client (with `--gen-client`)
REPL commands (local, not sent to LLM)
<text> state intent in natural language
/flows [q] list/search discovered endpoints
/spec show spec path + endpoint count
/help help
/quit exit
The design spec describes /take, /stop, /save — these are planned but not yet implemented in repl.py.
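The split between local commands and agent input can be sketched as a small dispatcher. This is an illustration only: `dispatch`, the handler signatures, and the "unknown command" message are assumptions, not the code in repl.py.

```python
from typing import Callable


def dispatch(line: str,
             commands: dict[str, Callable[[str], str]],
             send_to_agent: Callable[[str], str]) -> str:
    """Route one REPL line: slash commands run locally, everything else goes to the agent."""
    text = line.strip()
    if not text:
        return ""  # ignore blank input
    if not text.startswith("/"):
        return send_to_agent(text)
    name, _, argument = text.partition(" ")
    handler = commands.get(name)
    if handler is None:
        # Planned commands such as /take would land here until they are implemented.
        return f"unknown command {name}; try /help"
    return handler(argument.strip())
```

For example, with `commands = {"/flows": search}` the line `/flows users` calls `search("users")` locally, while `log in and list orders` is passed to the agent unchanged.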
Key files to read first
- `cli.py`: entrypoint, wiring, system prompt, all startup logic
- `agent.py`: Claude tool-use loop (small, read the whole thing)
- `store.py`: FlowStore, ScopeFilter, path templating
- `tools/__init__.py`: tool registry assembly
- `doc/engine.py`: DocEngine, the event-driven doc writer
- `docs/superpowers/specs/2026-05-31-auto-reverse-design.md`: full design spec
- `docs/superpowers/plans/2026-05-31-auto-reverse.md`: implementation plan with task breakdown