docs: add AGENTS.md for leetcode extractor and update active context

2026-06-01 02:08:45 +08:00
parent b4f25ab87b
commit 142f2469ec
2 changed files with 90 additions and 3 deletions
+5 -3
@@ -32,11 +32,13 @@ code
 ## Self-Improvement
 Periodically review this file and suggest improvements to the user if you notice gaps, inconsistencies, or missing conventions.
+## Subdirectories
+- `leetcode/` — NeetCode roadmap extractor (dependency graph + problem list). See `leetcode/AGENTS.md`.
 ## Active Context
 <!-- AI assistant maintains this section. Keep under 20 lines. -->
 <!-- Updated automatically by /self-improve. Remove stale entries. -->
-- Branch: `master`, 1 commit ahead of origin (unpushed)
-- Untracked files: `org/cpp/dsa/` and `org/cpp/ufds.org` (not yet committed)
-- Current work: UFDS flashcards (402-line proper card set) + DSA subdirectory
+- Branch: `master`, up to date with origin
+- Current work: `leetcode/out/roadmap.org` — NeetCode 150 roadmap as org-mode checklists (199 problems, 18 topics, Python/C++ solution links)
 - Inbox items: binary search, `using` keyword — need cards created
 - Possible cleanup: `org/cpp/dsa/udfs.org` may be a superseded draft of `org/cpp/ufds.org`
+85
@@ -0,0 +1,85 @@
# AGENTS.md — leetcode/
## What This Is
An idempotent extractor that pulls the NeetCode roadmap dependency graph
and problem list from the live site (neetcode.io). Outputs structured
JSON, Graphviz DOT, and Emacs org-mode files.
## How It Works
NeetCode is an Angular SPA. The data we need is split across lazy-loaded
JS chunks:
1. **HTML** (`/roadmap`) — contains the `<script>` tags pointing to the
runtime and main bundle filenames (content-hashed).
2. **Runtime JS** — maps chunk IDs to content hashes:
`7669:"fc6133d290d8d0ad"`.
3. **Main bundle** (`main.*.js`) — contains all ~965 problems with
fields: `problem`, `pattern`, `link`, `difficulty`, `code`, flags
(`neetcode150`, `blind75`, `neetcode250`, `premium`).
4. **Chunk 7669** — contains the **graph nodes** (`id`, `name`,
`parentId[]`) and course-to-topic mappings. The `parentId` array
is the edge list — each entry points to a prerequisite topic.
The script (`extract.mjs`) resolves the hashed filenames at runtime,
downloads the chunks, and regex-extracts the data structures.
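The filename resolution described above could look roughly like the sketch below. It is not the code in `extract.mjs`: the helper names `findChunkHash` and `findMainBundle` are hypothetical, and the sample strings are made up apart from the `7669:"fc6133d290d8d0ad"` entry quoted above. The exact shape of the runtime map and of the `<script>` tags is an assumption.

```javascript
// Sketch only: helper names and sample inputs are hypothetical.

// Find the content hash the runtime maps to a chunk ID,
// e.g. 7669:"fc6133d290d8d0ad".
function findChunkHash(runtimeSource, chunkId) {
  // \b keeps 7669 from matching inside a longer ID such as 17669.
  const pattern = new RegExp(`\\b${chunkId}:"([0-9a-f]+)"`);
  const match = runtimeSource.match(pattern);
  if (!match) {
    throw new Error(`chunk ${chunkId} not found in runtime JS`);
  }
  return match[1];
}

// Find the content-hashed main bundle filename in the /roadmap HTML.
function findMainBundle(html) {
  const match = html.match(/src="(main\.[0-9a-f]+\.js)"/);
  if (!match) {
    throw new Error("main bundle <script> tag not found");
  }
  return match[1];
}

// Made-up inputs that imitate the assumed structure.
const runtimeSample =
  'r.u=e=>e+"."+{7669:"fc6133d290d8d0ad",812:"0a1b2c3d4e5f6a7b"}[e]+".js"';
const htmlSample =
  '<script src="runtime.1a2b3c4d5e6f7a8b.js" type="module"></script>' +
  '<script src="main.9f8e7d6c5b4a3210.js" type="module"></script>';

console.log(findChunkHash(runtimeSample, 7669)); // fc6133d290d8d0ad
console.log(findMainBundle(htmlSample)); // main.9f8e7d6c5b4a3210.js
```

Throwing on a missed match, rather than returning `undefined`, makes a change in the site's bundle layout show up as an immediate, named failure.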
## Running
```bash
node extract.mjs # writes to ./out/
node extract.mjs --stdout # prints full JSON to stdout
node extract.mjs --cache /tmp/nc # custom cache directory
```
Downloads are cached in `.cache/` (gitignored). Re-runs are instant
and produce byte-identical output.
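For orientation, the two flags shown above could be parsed along these lines. This is a minimal sketch under assumptions: `parseArgs` is a hypothetical helper, not taken from `extract.mjs`, and the only default it relies on is the `.cache/` directory mentioned above.

```javascript
// Sketch only: parseArgs is hypothetical; the real script may differ.
function parseArgs(argv) {
  const options = { stdout: false, cacheDir: ".cache" };
  for (let i = 0; i < argv.length; i++) {
    if (argv[i] === "--stdout") {
      options.stdout = true;
    } else if (argv[i] === "--cache") {
      const value = argv[i + 1];
      if (value === undefined || value.startsWith("--")) {
        throw new Error("--cache requires a directory argument");
      }
      options.cacheDir = value;
      i++; // skip the directory we just consumed
    } else {
      throw new Error(`unknown argument: ${argv[i]}`);
    }
  }
  return options;
}

console.log(parseArgs([]));
console.log(parseArgs(["--cache", "/tmp/nc", "--stdout"]));
```

In a real script the input would be `process.argv.slice(2)`.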
## Output Files
| File | Contents |
|------|----------|
| `out/roadmap.json` | Full data: graph, all 965 problems, courses |
| `out/roadmap-neetcode150.json` | NeetCode 150 only (199 problems) |
| `out/roadmap.dot` | Graphviz DOT (render with `dot -Tsvg`) |
| `out/roadmap.org` | Org-mode with `TODO` checklists, Python/C++ links |
| `neetcode-roadmap-graph.json` | Standalone edge list (manual copy) |
| `neetcode-roadmap.dot` | Standalone DOT (manual copy) |
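As an illustration of how the DOT output relates to the graph nodes, the sketch below turns nodes with `id`, `name`, and `parentId[]` into a Graphviz digraph. The function `toDot`, the sample nodes, and their numeric IDs are hypothetical; the actual layout of `out/roadmap.dot` may differ.

```javascript
// Sketch only: toDot and the sample nodes are hypothetical. The field
// names (id, name, parentId) follow the chunk 7669 description.
function toDot(nodes) {
  const nameById = new Map(nodes.map((n) => [n.id, n.name]));
  // Quote a label for DOT, escaping embedded double quotes.
  const quote = (s) => `"${s.replace(/"/g, '\\"')}"`;
  const lines = ["digraph roadmap {"];
  for (const node of nodes) {
    lines.push(`  ${quote(node.name)};`);
    for (const parent of node.parentId) {
      if (!nameById.has(parent)) {
        throw new Error(`unknown parentId ${parent} on ${node.name}`);
      }
      // Edge direction: prerequisite -> dependent topic.
      lines.push(`  ${quote(nameById.get(parent))} -> ${quote(node.name)};`);
    }
  }
  lines.push("}");
  return lines.join("\n");
}

const dotSample = [
  { id: 1, name: "Arrays & Hashing", parentId: [] },
  { id: 2, name: "Two Pointers", parentId: [1] },
  { id: 3, name: "Stack", parentId: [1] },
];

console.log(toDot(dotSample));
```

Declaring every node before its edges keeps topics without edges visible in the rendered graph.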
## The Dependency Graph
18 topics, 21 edges, topologically ordered:
```
Arrays & Hashing
├── Two Pointers
│ ├── Sliding Window
│ ├── Linked List → Trees
│ └── Binary Search → Trees
│ ├── Tries
│ ├── Heap / Priority Queue → Intervals, Greedy, Advanced Graphs
│ └── Backtracking
│ ├── Graphs → Advanced Graphs, 2-D DP, Math & Geometry
│ └── 1-D Dynamic Programming → 2-D DP, Bit Manipulation
└── Stack
```
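One way to obtain a topological order from the `parentId` edge list is Kahn's algorithm, sketched below. Whether `extract.mjs` uses this approach is not stated here, so treat it as an illustration. The function name `topoSort` and the string IDs in the sample are hypothetical.

```javascript
// Sketch only: Kahn's algorithm over nodes shaped like
// { id, name, parentId: [...] }. Names and sample IDs are hypothetical.
function topoSort(nodes) {
  const byId = new Map(nodes.map((n) => [n.id, n]));
  const pending = new Map(nodes.map((n) => [n.id, n.parentId.length]));
  const children = new Map(nodes.map((n) => [n.id, []]));
  for (const node of nodes) {
    for (const parent of node.parentId) {
      if (!children.has(parent)) {
        throw new Error(`unknown parentId ${parent} on ${node.name}`);
      }
      children.get(parent).push(node.id);
    }
  }
  // Start from topics that have no prerequisites.
  const queue = nodes.filter((n) => n.parentId.length === 0).map((n) => n.id);
  const ordered = [];
  while (queue.length > 0) {
    const id = queue.shift();
    ordered.push(byId.get(id).name);
    for (const child of children.get(id)) {
      const left = pending.get(child) - 1;
      pending.set(child, left);
      if (left === 0) queue.push(child); // all prerequisites emitted
    }
  }
  if (ordered.length !== nodes.length) {
    throw new Error("dependency graph contains a cycle");
  }
  return ordered;
}

// Deliberately shuffled input, using a few topics from the tree above.
const topoSample = [
  { id: "trees", name: "Trees", parentId: ["linked-list", "binary-search"] },
  { id: "two-pointers", name: "Two Pointers", parentId: ["arrays"] },
  { id: "arrays", name: "Arrays & Hashing", parentId: [] },
  { id: "linked-list", name: "Linked List", parentId: ["two-pointers"] },
  { id: "binary-search", name: "Binary Search", parentId: ["two-pointers"] },
];

console.log(topoSort(topoSample).join(" -> "));
// Arrays & Hashing -> Two Pointers -> Linked List -> Binary Search -> Trees
```

A topic with two prerequisites, such as Trees, is emitted only after both of them, which is the property the flattened tree above relies on.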
## Org-Mode Format
Each topic is a `* TODO` heading with a `[/]` cookie for progress.
Problems are `- [ ] TODO` items with difficulty tags (`:easy:`,
`:medium:`, `:hard:`). Python and C++ solution links are nested
`- [ ] TODO` sub-items. LeetCode and video links are plain list items.
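The format described above might be rendered as in the sketch below. The exact layout of `out/roadmap.org` is not shown here, so several details are assumptions: the position of the `[/]` cookie, the nesting of the LeetCode and video links under each problem, and the input field names (`title`, `pythonUrl`, `cppUrl`, `leetcodeUrl`, `videoUrl`). The URLs are placeholders.

```javascript
// Sketch only: renderTopic and the input shape are hypothetical.
function orgLink(url, label) {
  return `[[${url}][${label}]]`; // org-mode link syntax
}

function renderTopic(topic) {
  const lines = [`* TODO ${topic.name} [/]`];
  for (const p of topic.problems) {
    lines.push(`- [ ] TODO ${p.title} :${p.difficulty.toLowerCase()}:`);
    // Solution links are checkable sub-items.
    lines.push(`  - [ ] TODO ${orgLink(p.pythonUrl, "Python")}`);
    lines.push(`  - [ ] TODO ${orgLink(p.cppUrl, "C++")}`);
    // LeetCode and video links are plain list items.
    lines.push(`  - ${orgLink(p.leetcodeUrl, "LeetCode")}`);
    lines.push(`  - ${orgLink(p.videoUrl, "Video")}`);
  }
  return lines.join("\n");
}

const orgSample = {
  name: "Arrays & Hashing",
  problems: [
    {
      title: "Two Sum",
      difficulty: "Easy",
      pythonUrl: "https://example.com/two-sum.py",
      cppUrl: "https://example.com/two-sum.cpp",
      leetcodeUrl: "https://example.com/leetcode/two-sum",
      videoUrl: "https://example.com/video/two-sum",
    },
  ],
};

console.log(renderTopic(orgSample));
```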
## Updating
Re-run `node extract.mjs`. It fetches data from the site and caches
the downloads locally. If NeetCode changes its chunk structure, the
regexes in `extractGraphNodes()` and `extractProblems()` will need
updating.
## Dependencies
None. Uses only Node.js built-ins (`fs`, `path`, `url`, `fetch`).
Requires Node 18+ for native `fetch`.