docs: add AGENTS.md for agent-user interaction model and repo conventions
@@ -0,0 +1,106 @@
# AGENTS.md

## What this is

`auto-reverse` is a **conversational CLI** where an LLM agent and a human collaborate to reverse-engineer a website's API. The agent drives a headed browser; an embedded mitmproxy captures all traffic; endpoints are documented into OpenAPI + markdown in real time.

The critical design point: **the agent and user work together in a loop**. The agent doesn't run autonomously: it pursues the user's intent, reports findings, and asks the user what to do next. The user can steer, redirect, or take over the browser at any time.

## Agent-user interaction model

The REPL in `repl.py` is the interface. Lines that don't start with `/` go to the Claude agent (`agent.py`). The agent has three tool groups:

- **`browser_*`** (navigate, click, type, snapshot) — act on the page and return compact snapshots (URL + title + up to 40 interactive elements, not raw HTML)
- **`flows_search`** — query the FlowStore for captured endpoints
- **`doc_document`** — enrich an endpoint with a summary, description, and tag; triggers a spec rewrite

Each agent turn runs up to 25 tool iterations (`MAX_ITERATIONS` in `agent.py`). Within a turn the agent should: pursue intent → inspect traffic → enrich endpoints → summarize → ask the user what to do next.

The system prompt is in `cli.py:27-34`. It's intentionally brief — the tools shape behavior more than the prompt.

## Commands

```bash
uv sync                                  # install deps
uv run playwright install chromium       # one-time browser install
uv run pytest -q                         # run all tests
uv run pytest tests/test_store.py -q     # run one test file
uv run pytest -k test_name -q            # run one test
uv run auto-reverse https://example.com  # run the tool
```

Lint/typecheck (run from the Nix shell, or install `ruff` and `pyright` separately):

```bash
ruff check src/
pyright src/
```

## Architecture (read before editing)

One Python process plus a Node subprocess for the browser. Four roles run concurrently; the main thread is listed twice because it handles startup wiring before it runs the REPL:

| Thread / process | Role | Key files |
|------------------|------|-----------|
| Main | REPL + Claude tool-use loop | `repl.py`, `agent.py` |
| Main (startup) | CLI arg parsing, wiring | `cli.py`, `config.py` |
| mitmproxy thread | Embedded DumpMaster, capture addon | `proxy.py` |
| Doc Worker thread(s) | Schema inference + LLM enrichment on new signatures | `doc/engine.py`, `doc/schema.py` |
| Node subprocess | Playwright browser | `browser.py` |

Data flow: user intent → agent acts via browser → proxy captures → FlowStore deduplicates by signature → DocWorker writes `openapi.yaml` + `API.md` → agent reads flows and reports back.

**Signature dedup**: `(method, host, path_template, status_class)`. Path templating collapses numeric IDs, UUIDs, and hex tokens to `{id}`. A flow whose signature is already known triggers no new LLM call; its samples are merged into the existing endpoint. LLM cost therefore scales with the number of distinct endpoints, not with request volume.
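
To make the signature idea concrete, here is a small sketch of path templating and signature construction. It is an assumption-laden illustration, not the code in `store.py`: the regular expressions, the 16-character minimum for hex tokens, the `"2xx"` style encoding of `status_class`, and the function names `path_template` and `signature` are all hypothetical.

```python
import re
from urllib.parse import urlsplit

# Assumed segment patterns; the real rules in store.py may differ.
_NUMERIC = re.compile(r"^\d+$")
_UUID = re.compile(r"^[0-9a-fA-F]{8}(-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}$")
_HEX = re.compile(r"^[0-9a-fA-F]{16,}$")  # assumed minimum length of 16


def path_template(path: str) -> str:
    """Replace ID-like path segments with the literal placeholder {id}."""
    out = []
    for seg in path.split("/"):
        if _NUMERIC.match(seg) or _UUID.match(seg) or _HEX.match(seg):
            out.append("{id}")
        else:
            out.append(seg)
    return "/".join(out)


def signature(method: str, url: str, status: int) -> tuple[str, str, str, str]:
    """Build (method, host, path_template, status_class) for one flow."""
    parts = urlsplit(url)
    status_class = f"{status // 100}xx"  # assumed encoding of status_class
    return (method.upper(), parts.hostname or "", path_template(parts.path), status_class)


# Two requests for different users collapse into one signature,
# so only the first would need an LLM call.
seen: dict[tuple[str, str, str, str], int] = {}
for url, status in [
    ("https://api.example.com/users/42/orders", 200),
    ("https://api.example.com/users/7/orders", 201),
]:
    sig = signature("GET", url, status)
    seen[sig] = seen.get(sig, 0) + 1
print(seen)
```

Keying a dictionary on the tuple is one simple way to get the "merge samples, skip the LLM" behavior: a lookup hit means the endpoint is already documented.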

**Scope filter** (`store.py`): only same-host XHR/fetch requests are kept. Static assets and analytics hosts (google-analytics, segment, stripe-js, sentry, doubleclick) are dropped by default. Every flow is still written to the raw archive.
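
A scope check along these lines might look like the sketch below. It is only an illustration of the rule stated above: the function name `in_scope`, the `resource_type` argument, substring matching on hostnames, and the list of static-asset suffixes are assumptions, and the real `ScopeFilter` may decide differently.

```python
from urllib.parse import urlsplit

# Names taken from the list above; matching them as hostname substrings is an assumption.
ANALYTICS_MARKERS = ("google-analytics", "segment", "stripe-js", "sentry", "doubleclick")
# Hypothetical list of static-asset suffixes.
STATIC_SUFFIXES = (".js", ".css", ".png", ".jpg", ".svg", ".woff2", ".ico")


def in_scope(url: str, target_host: str, resource_type: str) -> bool:
    """Return True if a captured request should reach the documented endpoint set."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    if any(marker in host for marker in ANALYTICS_MARKERS):
        return False  # analytics traffic
    if host != target_host:
        return False  # third-party host
    if resource_type not in ("xhr", "fetch"):
        return False  # page loads, images, and so on
    return not parts.path.lower().endswith(STATIC_SUFFIXES)


print(in_scope("https://example.com/api/users", "example.com", "xhr"))
```

A flow rejected here would still be appended to the raw archive; the filter only controls what gets documented.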

## Testing quirks

- `tests/fixture_site.py` — stdlib `ThreadingHTTPServer` with JSON endpoints. Exposed by `conftest.py` as the `fixture_site` fixture. No external deps.
- Browser tests (`test_browser.py`, `test_e2e_smoke.py`) **skip** if Playwright/Chromium isn't installed. This is expected in CI without browsers.
- Agent tests (`test_agent.py`) use a **mocked Anthropic client** with scripted responses. Never call the real API in tests.
- `test_proxy.py` tests `flow_from_mitm` with `SimpleNamespace` doubles — no real proxy needed.
- `test_e2e_smoke.py` spins up the fixture site + a real proxy + a real browser. Skips if the proxy or the browser is unavailable.
- `asyncio_mode = "auto"` in pytest config — write `async def test_*` directly, no decorator needed.
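
As an illustration of the `SimpleNamespace` approach mentioned in the list above, the sketch below builds a fake flow and feeds it to a converter. `summarize_flow` and `make_fake_flow` are hypothetical stand-ins written for this example; the actual signature and return type of `flow_from_mitm` are not shown here and may differ. The attribute names (`request.method`, `request.pretty_url`, `response.status_code`) mirror those on mitmproxy's HTTP flow objects.

```python
from types import SimpleNamespace


def summarize_flow(flow) -> dict:
    """Hypothetical stand-in for a converter such as flow_from_mitm."""
    return {
        "method": flow.request.method,
        "url": flow.request.pretty_url,
        "status": flow.response.status_code,
    }


def make_fake_flow(method: str, url: str, status: int) -> SimpleNamespace:
    """Build an object with just the attributes the converter reads."""
    return SimpleNamespace(
        request=SimpleNamespace(method=method, pretty_url=url),
        response=SimpleNamespace(status_code=status),
    )


def test_summarize_flow_reads_request_and_response():
    fake = make_fake_flow("GET", "http://127.0.0.1:8000/api/items", 200)
    assert summarize_flow(fake) == {
        "method": "GET",
        "url": "http://127.0.0.1:8000/api/items",
        "status": 200,
    }
```

Because the double only carries the attributes under test, the test needs neither a running proxy nor a network connection.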

## Runtime constraint

**Standard CPython 3.14, not free-threaded.** mitmproxy's `aioquic` and `mitmproxy-rs` deps only ship `abi3` wheels, which the free-threaded build rejects. The workload is I/O-bound anyway, so the GIL is not a practical limitation here. Don't target `3.14t`.
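
If a guard against accidentally running on a free-threaded interpreter is wanted, one possible check is sketched below. The function names are hypothetical and nothing here is taken from the repository; the only real API used is `sysconfig.get_config_var("Py_GIL_DISABLED")`, which is truthy on free-threaded builds.

```python
import sysconfig


def is_free_threaded_build() -> bool:
    """True when the interpreter was compiled with the GIL disabled (e.g. 3.14t)."""
    return bool(sysconfig.get_config_var("Py_GIL_DISABLED"))


def check_runtime() -> None:
    """Fail early with a clear message instead of a wheel-install error later."""
    if is_free_threaded_build():
        raise RuntimeError(
            "Free-threaded CPython detected; use the standard build, "
            "because abi3 wheels cannot be installed on it."
        )


print(is_free_threaded_build())
```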

## Nix environment

`flake.nix` provides a dev shell with uv, ruff, pyright, google-chrome. The shell hook sets:
- `PLAYWRIGHT_CHROMIUM_EXECUTABLE_PATH` — points to system Chrome
- `PLAYWRIGHT_SKIP_VALIDATE_HOST_REQUIREMENTS=true`
- `UV_PYTHON_PREFERENCE=managed`

Use `direnv allow` or `nix develop` to enter the shell.

## Outputs

All outputs go to `./auto-reverse-out/<host>-<timestamp>/`:
- `openapi.yaml` — incremental OpenAPI 3.1 spec
- `API.md` — human-readable endpoint docs
- `archive.log` — raw captured traffic (one line per flow)
- `client/` — optional typed httpx client (with `--gen-client`)

## REPL commands (local, not sent to LLM)

```
<text>       state intent in natural language (the only input sent to the agent)
/flows [q]   list/search discovered endpoints
/spec        show spec path + endpoint count
/help        help
/quit        exit
```

The design spec also describes `/take`, `/stop`, and `/save`; these are planned but not yet implemented in `repl.py`.
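
The split between local commands and agent input could be expressed as in the sketch below. This is a hypothetical illustration, not the dispatch code in `repl.py`: the function name `route_line`, the tuple it returns, and the handling of unknown commands are assumptions. Only the four command names come from the list above.

```python
LOCAL_COMMANDS = {"/flows", "/spec", "/help", "/quit"}


def route_line(line: str) -> tuple[str, str, str]:
    """Classify one REPL line as (kind, command, payload).

    kind is "local" for implemented slash commands, "unknown" for any other
    slash command, "agent" for natural-language text, or "empty".
    """
    text = line.strip()
    if not text:
        return ("empty", "", "")
    if not text.startswith("/"):
        return ("agent", "", text)  # forwarded to the LLM agent
    command, _, argument = text.partition(" ")
    kind = "local" if command in LOCAL_COMMANDS else "unknown"
    return (kind, command, argument.strip())


print(route_line("/flows login"))
print(route_line("find the login API"))
```

Under this sketch a planned command such as `/take` would be reported as unknown instead of being forwarded to the agent, which keeps slash-prefixed input from ever reaching the LLM.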

## Key files to read first

- `cli.py` — entrypoint, wiring, system prompt, all startup logic
- `agent.py` — Claude tool-use loop (small, read the whole thing)
- `store.py` — FlowStore, ScopeFilter, path templating
- `tools/__init__.py` — tool registry assembly
- `doc/engine.py` — DocEngine, the event-driven doc writer
- `docs/superpowers/specs/2026-05-31-auto-reverse-design.md` — full design spec
- `docs/superpowers/plans/2026-05-31-auto-reverse.md` — implementation plan with task breakdown