


ArchMapper is a fully automated, resumable, bottom-up reverse-engineering pipeline that turns any codebase into a structured architectural document, ready for human or LLM analysis.

In spring 2026, while working on Sophia.ai — my personal R&D project on agentic AI for software development — I needed to truly understand how OpenClaw, one of the most capable open-source agentic-AI ecosystems, was architected. The repos involved were too large to read manually, and I needed a structured big picture I could feed into an LLM for deeper reasoning. ArchMapper was born from that need: a fully automated, resumable, bottom-up reverse-engineering pipeline that turns any codebase into a clean architectural document, ready for human or LLM analysis.
What started as a tool to support a personal R&D effort evolved into a universal solution: ArchMapper works on any repository, in any language, regardless of scale — including codebases of more than 10,000 files. It is the first project in which I formalized my engineering methodology into a complete standards-driven framework, marking a turning point in the way I build software in the era of agentic AI.
ArchMapper is a TypeScript-based, fully automated reverse-engineering pipeline that ingests any source-code repository and produces a complete structured analysis: a per-file description (purpose, exports, imports, key abstractions, design patterns), a per-folder synthesis (module purpose, public API, cross-cutting concerns), and a top-level architectural summary. The output is a single canonical `architecture.json` plus one Markdown spec card per module, designed to be both human-readable and directly consumable by Large Language Models for deeper architectural reasoning.
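To make that output shape concrete, here is an illustrative sketch of the two record types in TypeScript. The fields mirror the description above, but the exact identifiers in the real schema may differ:

```typescript
// Illustrative sketch: field names beyond purpose/exports/imports/patterns
// are assumptions, not the project's exact schema.
interface FileAnalysis {
  path: string;          // relative path, unique within the repository
  purpose: string;       // 2-4 sentences, specific to this file
  exports: string[];
  imports: string[];
  keyAbstractions: string[];
  patterns: string[];    // controlled vocabulary, e.g. "factory", "adapter"
}

interface ModuleSynthesis {
  path: string;
  purpose: string;
  publicApi: string[];
  crossCuttingConcerns: string[];
  children: string[];    // child files and sub-modules synthesized earlier
}
```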
The pipeline is built around a bottom-up wave synthesis: every source file is analyzed first, then leaf folders are synthesized from their child files, then parent folders from their child modules, all the way up to the project root. Each level is built on verified facts from the previous level, never on guesses. This methodology guarantees that the final architectural understanding is grounded in real evidence at every step, not in LLM hallucinations.
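One straightforward way to implement that ordering is to group folders by depth and synthesize the deepest wave first. A minimal sketch (not the project's actual code):

```typescript
import path from "node:path";

// Group folders by path depth; the deepest group is the first wave,
// so every parent only ever sees already-synthesized children.
function waveOrder(folders: string[]): string[][] {
  const byDepth = new Map<number, string[]>();
  for (const folder of folders) {
    const depth = folder.split(path.sep).length;
    const group = byDepth.get(depth) ?? [];
    group.push(folder);
    byDepth.set(depth, group);
  }
  return [...byDepth.entries()]
    .sort(([a], [b]) => b - a) // deepest first, project root last
    .map(([, group]) => group);
}
```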
ArchMapper is also resumable: a single `progress.json` checkpoint, written atomically after every meaningful step, allows any interrupted run to continue from where it stopped — even on codebases of 10,000+ files. Combined with Anthropic's Message Batches API (50% cost reduction) and prompt caching (~90% reduction on repeated system prompts), it makes large-scale reverse-engineering economically viable for individual developers and small teams.
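To illustrate, here is a minimal sketch of how a resume step could work, assuming a simplified shape for `progress.json` (the real checkpoint tracks more state):

```typescript
import { existsSync, readFileSync } from "node:fs";

// Hypothetical, simplified shape of progress.json. Arrays, not Sets,
// so the structure round-trips through JSON safely.
interface Progress {
  analyzedFiles: string[]; // relative paths already analyzed
  failed: string[];        // per-item failures, isolated from the run
  completedWaves: number;  // folder-synthesis waves finished so far
}

function loadProgress(file = "progress.json"): Progress {
  if (!existsSync(file)) {
    return { analyzedFiles: [], failed: [], completedWaves: 0 };
  }
  return JSON.parse(readFileSync(file, "utf8")) as Progress;
}

// On resume, skip everything already recorded in the checkpoint.
function remainingFiles(allFiles: string[], progress: Progress): string[] {
  const done = new Set(progress.analyzedFiles); // a Set is fine in memory, never persisted
  return allFiles.filter((f) => !done.has(f));
}
```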
The output is intentionally LLM-friendly. The `architecture.json` document and the spec cards are not designed merely to be archived — they are designed to be passed back into an LLM as context for follow-up analysis: comparing two architectures, identifying tech debt, suggesting refactors, drafting onboarding documentation, or feeding a coding agent with the precise structural map it needs to act safely on a codebase it has never seen. ArchMapper produces the substrate; the LLM does the reasoning.
Public release: spring 2026, MIT-licensed. ArchMapper has already been used to analyze multiple production-grade open-source ecosystems, including OpenClaw, with results that directly informed architectural decisions on parallel projects.
ArchMapper is a fully solo project: I owned every layer, from the methodology framework down to the last unit test. Every decision, from the standards that frame the entire build to the smallest implementation choice, was driven by one principle: produce a tool reliable enough to be trusted on production-grade codebases, and disciplined enough to be extended safely by AI agents.
ArchMapper was not born from a market study or a startup pitch. It was born from a very practical, very personal need: I was building Sophia.ai, my R&D project on agentic AI for software development, and I needed to deeply understand how OpenClaw — one of the most capable agentic-AI ecosystems in the open source world — had been architected. I wanted to know what they did, how they did it, and most importantly, why they made the technical choices they made, so that I could take what was worth keeping and adapt it to my own work.
The challenge: the repositories that compose the OpenClaw ecosystem are massive. Some of them exceed 10,000 files. Reading them by hand was simply not realistic, and feeding them raw into an LLM was both expensive and ineffective — the model would lose context, hallucinate connections, and miss the structural shape entirely. I needed a way to produce a clean, structured big picture of any codebase, in a format an LLM could actually reason over.
From that need, ArchMapper emerged. But I made one decision early on: I did not want to build a one-off tool tailored to OpenClaw. I wanted to build something I could point at any repository, in any language, at any time. Because the truth is, this kind of structural understanding is rarely a one-time need. Every time you join a new codebase, every time you audit an open-source dependency, every time you onboard an AI agent on a project, you face the same problem. So I built the universal solution — and that is exactly what ArchMapper became.
More than two years of full-stack development, with an early focus on AI integration and automation, gave me a clear mental model of what the implementation needed to look like and what kind of result would actually be useful in practice. ArchMapper is the direct product of that experience: a tool I would have wanted from day one, now available to anyone who needs it.
ArchMapper is the first project in which I rigorously applied TDD together with the SOLID, DRY, KISS, and YAGNI principles, and the first for which I authored a complete operational framework before writing a line of pipeline code. The framework lives in `CLAUDE.md` (the project's mandatory read-on-every-session document) and in a set of dedicated standards files under `standards/`, each governing one specific concern: workflow, code quality, TypeScript, environment, pipeline architecture, reverse-engineering methodology, unit testing, integration testing, LLM security, supply chain security, git branches, git commits, documentation, debugging, and analysis output reading.
Each ticket follows the same disciplined sequence, every single time. Read `CLAUDE.md` and load the relevant standards. Run the baseline test suite to confirm a green starting point. Produce an outcome map: every possible exit path, with its log entry designed in advance. Write a failing test that captures the exact requirement, implement the minimum code to make it pass, run the type checker, and refactor if needed. Update every affected document in the same commit, run the LLM security checklist, and review the full diff. Only then write the commit: one commit per ticket, prefixed with the issue ID, with a co-author line when an AI assisted.
This level of methodological rigor is not bureaucratic — it is what makes ArchMapper safely extensible by both humans and AI agents. When an LLM-powered coding assistant operates on the project, it does not need to guess my conventions. It reads `CLAUDE.md`, loads the relevant standard, and works inside an explicit, testable framework. The standards are the consensus that makes human-AI collaboration possible at the level of quality I expect for production-grade software.
On the strategic side, every architectural choice was driven by a small set of non-negotiable principles: arrays not Sets in any persisted JSON structure (because Sets silently serialize to `{}`), atomic writes for the resume checkpoint (tmp file + rename, never partial), bottom-up wave order strictly enforced (a parent module is never synthesized before its children), batch results matched by `custom_id` and never by index (because Anthropic does not guarantee result order), and per-item failure tolerance (one bad file moves to a `failed[]` array; the pipeline never aborts on a single error).
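The atomic-write principle, for example, comes down to a few lines. A minimal sketch of the tmp-file-plus-rename pattern it describes (not the project's exact code):

```typescript
import { renameSync, writeFileSync } from "node:fs";

// Write to a temp file in the same directory, then rename. On POSIX
// filesystems the rename is atomic, so a crash mid-write never leaves
// a partial progress.json: readers see either the old file or the new one.
function writeCheckpoint(path: string, data: unknown): void {
  const tmp = `${path}.tmp`;
  writeFileSync(tmp, JSON.stringify(data, null, 2), "utf8");
  renameSync(tmp, path);
}
```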
The methodology was iterative and pragmatic: ship the smallest piece that fully satisfies one acceptance criterion, validate it on a real codebase (OpenClaw, then Claude Code, then progressively larger targets), capture every learning in a standard or in `CLAUDE.md`, and only then move to the next ticket. By the time ArchMapper had analyzed three real-world ecosystems, the standards had converged into a stable, reusable methodology applicable to any future tool I build.
TypeScript was chosen for its strict type system, which makes the boundary between unknown LLM output and typed internal data structures explicit. Anthropic's Claude family was selected for its tool-use API (which enforces structured JSON output) and its Message Batches API, which makes large-scale analysis economically viable. Zod was chosen as the validation layer because it produces field-level error details that integrate cleanly with structured logging and per-item failure tracking. Vitest was preferred over Jest for its native TypeScript support, fast execution, and minimal configuration overhead.
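As a sketch of how that validation layer fits together, assuming an illustrative schema and logging shape:

```typescript
import { z } from "zod";

// Parse untyped LLM output with Zod and surface field-level errors for
// structured logging. Schema fields are illustrative, not the real schema.
const FileAnalysisSchema = z.object({
  purpose: z.string().min(1),
  exports: z.array(z.string()),
  imports: z.array(z.string()),
});

function validate(raw: unknown, filePath: string) {
  const result = FileAnalysisSchema.safeParse(raw);
  if (!result.success) {
    // flatten() yields per-field error messages, ready for a log entry
    console.error({ file: filePath, errors: result.error.flatten().fieldErrors });
    return null; // caller pushes filePath onto the failed[] list
  }
  return result.data; // fully typed from here on
}
```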
Cost optimization is a first-class concern. Every Anthropic API call sets `max_tokens` explicitly, every system prompt is wrapped in `cache_control: ephemeral` to enable prompt caching, and large analyses run through the Message Batches API to halve the per-token cost. Combined, these choices make a complete reverse-engineering analysis of a 10,000-file repository affordable on a personal Anthropic account.
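A sketch of what such a call could look like with the Anthropic TypeScript SDK; the model name, token budget, and prompts are placeholders:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const SYSTEM_PROMPT = "..."; // long analysis instructions, identical for every file
const fileSource = "...";    // contents of the file currently being analyzed

// Explicit max_tokens on every call, and the large, stable system prompt
// marked ephemeral so it is cached across thousands of per-file requests.
const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 2048,
  system: [
    {
      type: "text",
      text: SYSTEM_PROMPT,
      cache_control: { type: "ephemeral" },
    },
  ],
  messages: [{ role: "user", content: fileSource }],
});
```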
ArchMapper is a CLI pipeline tool — it has no GUI, no web interface, and no streaming dashboard. That choice is deliberate, not a limitation. The tool's primary user is not a human reading a dashboard but an LLM consuming a structured document. The real UX is the shape of the output: a single canonical `architecture.json` engineered to be the cleanest possible substrate for downstream LLM reasoning, plus a folder of Markdown spec cards for human inspection.
Every output field is designed for clarity and grounding. The `purpose` of every file is bounded to 2–4 sentences and must be specific enough to distinguish it from any other file in the codebase. The `patterns` field uses a controlled vocabulary of 19 named design patterns (factory, singleton, dependency-injection, middleware, adapter, facade, observer, strategy, builder, repository, validator, pipeline, config-module, barrel-export, type-guard, error-boundary, template-method, command, registry) — no invented synonyms, no woolly adjectives. The `crossCuttingConcerns` field uses a similarly controlled vocabulary at the module level.
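That vocabulary is exactly the kind of constraint a Zod enum enforces mechanically. A sketch (the real schema lives in the project's standards files):

```typescript
import { z } from "zod";

// The controlled pattern vocabulary, enforced at validation time: any
// value outside this list is rejected before an artifact is written.
const DesignPattern = z.enum([
  "factory", "singleton", "dependency-injection", "middleware", "adapter",
  "facade", "observer", "strategy", "builder", "repository", "validator",
  "pipeline", "config-module", "barrel-export", "type-guard",
  "error-boundary", "template-method", "command", "registry",
]);

type DesignPattern = z.infer<typeof DesignPattern>; // union of the 19 literals
```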
The biggest UX challenge was preventing LLM hallucination at scale. With thousands of analysis calls per run, even a 1% hallucination rate would corrupt the architecture document with phantom imports, invented patterns, or fabricated module relationships. The solution was layered: structured-output enforcement via Anthropic's tool-use API, Zod schema validation as a hard gate before any artifact is written, an explicit anti-injection instruction in every system prompt to neutralize adversarial content embedded in analyzed source files, a copyright constraint that prevents the agent from reproducing verbatim source code, and per-item failure tracking so a single bad analysis is logged and isolated rather than poisoning the whole run.
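To show the first of those layers concretely, here is a sketch of forcing structured output through tool use, with an illustrative tool name and schema:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();
const fileSource = "..."; // contents of the file being analyzed

// Force the model to answer through a tool call, so the reply is always
// JSON matching the declared input_schema.
const response = await client.messages.create({
  model: "claude-sonnet-4-5",
  max_tokens: 2048,
  tools: [
    {
      name: "record_file_analysis",
      description: "Record the structured analysis of one source file.",
      input_schema: {
        type: "object",
        properties: {
          purpose: { type: "string" },
          exports: { type: "array", items: { type: "string" } },
          patterns: { type: "array", items: { type: "string" } },
        },
        required: ["purpose", "exports", "patterns"],
      },
    },
  ],
  tool_choice: { type: "tool", name: "record_file_analysis" },
  messages: [{ role: "user", content: fileSource }],
});

// The tool input is still untrusted data: it goes through the Zod gate next.
const block = response.content.find((b) => b.type === "tool_use");
const raw: unknown = block?.type === "tool_use" ? block.input : undefined;
```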
Discoverability and trust are reinforced by the spec cards: one Markdown file per module, in a stable format (Purpose / Public API / Design Patterns / Cross-Cutting Concerns / Children / Notes), so any human reader — and any LLM downstream consumer — knows exactly where to look for what. The result is a tool that produces output a developer can trust, share, version-control, and feed back into another agent without manual cleanup.
ArchMapper is distributed as a self-contained CLI: clone the repo, install dependencies with `npm install`, configure `.env` with an Anthropic API key and a project name, drop the codebase to analyze under `project/`, and run `npm run analyze`. There is no cloud component, no managed service, no billable infrastructure beyond the user's own Anthropic API usage. This is deliberate: the tool runs locally, on the user's machine, with their credentials, against codebases they control.
Scalability is handled through three mechanisms. First, full resumability: the atomic `progress.json` checkpoint allows any interrupted run — whether by a crash, a network timeout, or a manual stop — to continue from the last successful step, without recomputing anything. Second, per-item failure tolerance: a single Zod parse failure or a single Anthropic API error moves that file to a `failed[]` list and the pipeline continues. Third, batching: when `USE_BATCHES=true`, the pipeline submits requests through the Message Batches API in groups of up to 10,000, polls with exponential backoff, and matches results back by stable `custom_id` (the relative file path) to handle very large repositories economically.
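A condensed sketch of that batch flow with the Anthropic SDK, with illustrative stand-ins (`files`, `handle`, `failed`) for the pipeline's real state:

```typescript
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic();

// Hypothetical stand-ins for the pipeline's real state and handlers.
declare const files: { path: string; source: string }[];
declare function handle(customId: string, message: unknown): void;
const failed: string[] = [];

// Submit one request per file, keyed by a stable custom_id.
const batch = await client.messages.batches.create({
  requests: files.map((file) => ({
    custom_id: file.path, // stable per-file key (the relative path)
    params: {
      model: "claude-sonnet-4-5",
      max_tokens: 2048,
      messages: [{ role: "user", content: file.source }],
    },
  })),
});

// Poll with exponential backoff until the batch has ended.
let status = batch;
let delay = 5_000;
while (status.processing_status === "in_progress") {
  await new Promise((resolve) => setTimeout(resolve, delay));
  delay = Math.min(delay * 2, 120_000); // back off, capped at two minutes
  status = await client.messages.batches.retrieve(batch.id);
}

// Match results by custom_id, never by index; isolate per-item failures.
for await (const entry of await client.messages.batches.results(batch.id)) {
  if (entry.result.type === "succeeded") {
    handle(entry.custom_id, entry.result.message);
  } else {
    failed.push(entry.custom_id);
  }
}
```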
ArchMapper has been validated against repositories of more than 10,000 files. The hard limit is not the tool itself but the user's Anthropic rate limits and budget, both of which can be tuned via batch mode and prompt caching. Because all output is plain JSON and Markdown, there is no proprietary format and no lock-in: the analysis can be diffed, versioned, archived, or transformed into any downstream representation the user needs.
In short, the pipeline is local-first, resumable by design, and built to run unattended on production-scale codebases.
ArchMapper is MIT-licensed: anyone — developer, researcher, agent — is free to use, fork, and extend it. Contributions are welcome and follow the same standards-driven workflow as the core project.
ArchMapper is not a customer-facing SaaS — it is an engineering tool, evaluated on the quality of the analyses it produces and on its impact on parallel projects. By that measure, the outcomes have been clear and significant.
ArchMapper has been used to analyze multiple production-grade open-source ecosystems, including OpenClaw and Claude Code — some of which exceed 10,000 source files. The resulting `architecture.json` documents have been used directly as context in LLM conversations, and the format proved exactly as effective as designed: the model parses the document without hesitation, answers architectural questions with grounded specificity, and surfaces non-obvious connections between modules. That alone is a strong validation of the bottom-up wave methodology and of the schema design.
On a more personal level, the analyses produced by ArchMapper have already informed the architecture of parallel projects (notably Sophia.ai), helped me discover technical choices and patterns I had not encountered before, and accelerated my understanding of complex agentic-AI ecosystems by an order of magnitude. The tool also reinforced my expertise in AI integration, automation, prompt engineering, and large-scale structured-output workflows — capabilities that compound directly into every future project.
Public feedback is still pending — the tool was open-sourced very recently. But the internal results were strong enough that ArchMapper's methodology is now the template for every new project I start, and is the direct progenitor of UXMapper, its sibling tool focused on runtime UX analysis.
ArchMapper taught me, more than any project before it, how to set up a modern repository on which agentic AI is going to work alongside me. Today, removing AI from a developer's workflow is no longer a realistic option. The work itself is shifting: from monitoring what AI is producing to genuinely collaborating with it. And like any real collaboration, that requires consensus — explicit standards, an agreed methodology, a clearly defined workflow, and a framework the agent can rely on rather than improvise around. ArchMapper is the first project where I built that consensus from the ground up, and the difference in productivity, accuracy, and reliability was immediate and undeniable.
Beyond the methodology, ArchMapper marked the first project in which I rigorously applied TDD and the SOLID, DRY, KISS, and YAGNI principles end to end. Once the project was clearly defined, once the standards were locked in, the rest became disciplined homework: design the right architecture, build it the right way, integrate carefully, ship one ticket at a time. The results spoke for themselves. My code became measurably more accurate, more maintainable, and more resilient. My work ethic was propelled to a new level. And the projects I have started since have all benefited from this discipline — both in their code and in how they collaborate with AI.
On the technical side, I learned an enormous amount about reverse engineering at scale: how to design an analysis schema that grounds the LLM in verifiable facts, how to choose the right format for results so they remain useful both immediately and as substrate for further LLM reasoning, and how to build a pipeline that survives interruptions, errors, and adversarial input without compromising integrity. I also deepened my mastery of advanced Anthropic API features — Message Batches for cost-efficient large-scale runs and prompt caching for repeated context — which I now consider standard tooling for any AI-integration work I do.
Strategically, this project clarified my conviction that the most valuable engineering tools today are not those that hide complexity but those that surface it cleanly. ArchMapper does not solve software complexity — it makes complexity legible to humans and to LLMs alike. That is the kind of leverage I want to keep building.
Finally, ArchMapper reinforced something I now consider foundational: by being able to reverse-engineer successful open-source projects quickly and professionally, we give ourselves the means to improve on them, to build better products, and ultimately to ship a brighter future. The faster we can understand great work, the faster we can stand on its shoulders. That is the philosophy that ties every Wise Duck Dev project together — useful, innovative tools that compound on themselves and amplify what humans and AI can build, side by side.
Key takeaway: define your standards before you write the code, build the framework that AI will collaborate with you on, and then iterate relentlessly. The right discipline turns AI from an unpredictable assistant into a reliable engineering partner — and that is the most important shift in the way we build software today.