
Building a VS Code Extension for AI Context Management: Architecture, APIs, and Limitations

TL;DR

VS Code's Extension API provides six critical endpoints for building AI context management: window.tabGroups, window.activeTextEditor, window.visibleTextEditors, workspace.textDocuments, workspace.fs, and languages.getDiagnostics. Together, they give you complete read access to the developer's IDE state — open files, focused editor, visible panes, loaded documents, file system, and current errors. Building a context management extension means wiring these APIs into a real-time state machine that continuously packages your session context for AI consumption.

The Extension API Nobody Uses

VS Code's extension API is one of the most well-documented, most accessible, and most underutilized APIs in developer tooling. It exposes your complete IDE state — every open tab, every visible editor, every active diagnostic — through clean, event-driven TypeScript interfaces.

Yet the vast majority of AI coding extensions use exactly one API call: window.activeTextEditor.document.getText(). They read the current file, send it to an LLM, and display the response. The other five critical endpoints sit unused — containing exactly the context data that would prevent hallucinations.

The data your AI needs to stop hallucinating is one API call away. The overwhelming majority of extensions don't make that call.

The Six Critical API Endpoints

Here are the VS Code APIs that matter for AI context management, what they expose, and how to use them:


window.tabGroups.all

Returns every open tab across all tab groups. Each tab includes: the file URI, whether it's active, whether it's pinned, and its position. This is the definitive signal of the developer's working set — they opened these files because they're relevant.
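Flattening the tab groups into a ranked working set can be sketched as a pure function. The `TabLike`/`TabGroupLike` interfaces below are simplified stand-ins for the real `vscode.Tab` and `vscode.TabGroup` types, kept minimal so the logic runs (and tests) outside the extension host:

```typescript
// Simplified stand-ins for vscode.Tab / vscode.TabGroup (assumption: we only
// need these three fields for context ranking).
interface TabLike { uri: string; isActive: boolean; isPinned: boolean }
interface TabGroupLike { tabs: TabLike[] }

// Flatten every group into one working set, with active tabs sorted first —
// they are the strongest relevance signal.
function workingSet(groups: TabGroupLike[]): TabLike[] {
  const tabs: TabLike[] = [];
  for (const group of groups) tabs.push(...group.tabs);
  return [...tabs].sort((a, b) => Number(b.isActive) - Number(a.isActive));
}
```

In the extension itself you would build the `TabLike` records from `vscode.window.tabGroups.all`, checking each tab's `input` first: only `TabInputText` tabs carry a document URI, so diff views and webviews get skipped.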


window.activeTextEditor

Returns the currently focused editor instance. Includes: document contents, cursor position, visible range, selections, and the file URI. This is the highest-priority context — the file the developer is actively editing.
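Packaging the active editor for the model can be sketched as follows. The `ActiveEditorSnapshot` shape is an assumption: a subset of what you would read off `vscode.window.activeTextEditor` (`editor.selection.active.line`, `editor.visibleRanges[0]`, `editor.document.getText()`):

```typescript
// Hypothetical snapshot of the fields read from vscode.window.activeTextEditor.
interface ActiveEditorSnapshot {
  uri: string;
  cursorLine: number;   // zero-based, from editor.selection.active.line
  visibleStart: number; // editor.visibleRanges[0].start.line
  visibleEnd: number;   // editor.visibleRanges[0].end.line
  text: string;         // editor.document.getText()
}

// Package the highest-priority context: the file under the cursor, trimmed to
// the visible window so the model sees what the developer sees.
function packageActiveFile(snap: ActiveEditorSnapshot): string {
  const lines = snap.text.split('\n');
  const visible = lines.slice(snap.visibleStart, snap.visibleEnd + 1).join('\n');
  return [`// file: ${snap.uri} (cursor at line ${snap.cursorLine + 1})`, visible].join('\n');
}
```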


workspace.textDocuments

Returns all loaded text documents in memory. This includes open files, files referenced by the language server, and any documents loaded by other extensions. It's a larger set than tabGroups exposes, which makes it useful for discovering implicitly relevant files.
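The documents that are loaded but not open as tabs are the interesting ones: the language server pulled them in to resolve types or imports, so they are implicitly relevant. A minimal sketch of that set difference, taking URI strings rather than real `vscode.TextDocument` objects so it runs standalone:

```typescript
// Files loaded in memory (workspace.textDocuments) but not in the tab working
// set (window.tabGroups) — pulled in implicitly by the language server.
function implicitlyLoaded(allDocs: string[], openTabs: string[]): string[] {
  const tabs = new Set(openTabs);
  return allDocs.filter((uri) => !tabs.has(uri));
}
```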

The Event-Driven State Machine

A context management extension isn't a request-response tool. It's a real-time state machine that updates continuously as the developer works. Here's the event architecture:

// Core event subscriptions:
window.onDidChangeActiveTextEditor  → fires on every tab switch
workspace.onDidOpenTextDocument     → fires when a new file opens
workspace.onDidCloseTextDocument    → fires when a file closes
workspace.onDidChangeTextDocument   → fires on every edit
workspace.onDidSaveTextDocument     → fires on every save

// State object (updated by events):
{
  activeFile: string,
  openTabs: string[],
  visibleEditors: string[],
  recentEdits: { file, range, timestamp }[],
  diagnostics: { file, severity, message }[]
}

This state object is the ground truth of the developer's current session. Every AI completion should receive it as mandatory context. The challenge is keeping it updated with sub-100ms latency.
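The update logic can be sketched as a pure reducer over IDE events. This covers a subset of the state object above (`visibleEditors` and `diagnostics` omitted for brevity), and the event tags are hypothetical internal names; in the extension, each `vscode` subscription simply forwards into `reduce`:

```typescript
interface SessionState {
  activeFile: string | null;
  openTabs: string[];
  recentEdits: { file: string; timestamp: number }[];
}

// Internal event tags (assumption) — one per vscode subscription.
type IdeEvent =
  | { kind: 'activeEditorChanged'; file: string }
  | { kind: 'documentOpened'; file: string }
  | { kind: 'documentClosed'; file: string }
  | { kind: 'documentChanged'; file: string; timestamp: number };

function reduce(state: SessionState, ev: IdeEvent): SessionState {
  switch (ev.kind) {
    case 'activeEditorChanged':
      return { ...state, activeFile: ev.file };
    case 'documentOpened':
      return state.openTabs.includes(ev.file)
        ? state
        : { ...state, openTabs: [...state.openTabs, ev.file] };
    case 'documentClosed':
      return { ...state, openTabs: state.openTabs.filter((f) => f !== ev.file) };
    case 'documentChanged':
      // Bound the edit history so each update stays cheap.
      return {
        ...state,
        recentEdits: [...state.recentEdits, { file: ev.file, timestamp: ev.timestamp }].slice(-50),
      };
  }
}
```

Wiring is one line per event, e.g. `vscode.window.onDidChangeActiveTextEditor(e => { if (e) state = reduce(state, { kind: 'activeEditorChanged', file: e.document.uri.toString() }); })`.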

The Extension Host Performance Ceiling

Here's the hard truth about VS Code extensions: they run in the Extension Host process, which is a single Node.js runtime shared by every installed extension.

The key metric: 100ms, the latency budget for context assembly before completion pipelines degrade.

The Extension Host process handles: IntelliSense, linting, formatting, Git integration, and every third-party extension — all in one JavaScript runtime. Heavy file parsing (AST analysis, import resolution) in this process competes directly with your editor's responsiveness. If your context extension's event handler takes 150ms, the editor stutters. This is why advanced context tools offload heavy processing to a companion binary (written in Rust or Go) that communicates with the extension via IPC.
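One way to stay under that budget is to make each event handler do nothing but buffer, and forward to the companion process in coalesced batches. A sketch, with an explicit `flushNow` so the batching logic is testable without waiting on timers (the IPC `forward` callback is an assumption):

```typescript
// Coalesce bursts of IDE events into one forwarded batch per window, so the
// per-event work inside the Extension Host stays trivially cheap.
function makeCoalescer<T>(forward: (batch: T[]) => void, windowMs = 25) {
  let batch: T[] = [];
  let timer: ReturnType<typeof setTimeout> | null = null;

  const flushNow = () => {
    if (timer) { clearTimeout(timer); timer = null; }
    if (batch.length > 0) {
      const out = batch;
      batch = [];
      forward(out); // e.g. write to the companion process over IPC
    }
  };

  return {
    push(item: T) {
      batch.push(item);
      if (!timer) timer = setTimeout(flushNow, windowMs); // one forward per window
    },
    flushNow,
  };
}
```

A rapid typing burst then costs one IPC round-trip every `windowMs` instead of one per keystroke.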

The Companion Process Architecture

The most effective VS Code context extensions use a split architecture:

Step 01

Extension Layer (TypeScript)

Lightweight event listener. Subscribes to window and workspace events. Sends state change notifications to the companion process via IPC. Does NOT perform heavy computation — just observes and forwards.

Step 02

Companion Process (Rust/Go)

Receives state notifications from the extension. Performs import graph resolution, file content packaging, and context optimization. Returns structured context blocks to the extension.

Step 03

MCP Server Interface

The companion process also serves as an MCP server. AI clients (Cursor, Claude, Windsurf) query the MCP server directly for context, bypassing the extension entirely for read operations.

Step 04

Shared State Store

Both the extension and companion process share a lightweight state file or local database. The extension writes IDE events; the companion process reads them and computes context. This decouples observation from processing.

Step 05

Health Monitoring

The extension monitors the companion process health. If the process crashes, the extension notifies the developer via status bar and restarts it automatically. Zero-downtime context injection.
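Step 04's shared store can be as simple as an append-only NDJSON file: the extension appends one JSON line per event, the companion tails the file. A sketch of the codec under that assumption (the record envelope, including the `v` version field, is hypothetical):

```typescript
// One event per line (NDJSON). Versioned so extension and companion can
// evolve independently. Envelope shape is an assumption for illustration.
interface IdeEventRecord { v: 1; ts: number; kind: string; file: string }

function encodeEvent(rec: IdeEventRecord): string {
  return JSON.stringify(rec) + '\n';
}

// Decode a chunk read from the file. A trailing partial line is returned as
// `rest` so the caller can prepend it to the next read — appends from the
// extension may race with reads from the companion.
function decodeEvents(chunk: string): { events: IdeEventRecord[]; rest: string } {
  const lines = chunk.split('\n');
  const rest = lines.pop() ?? '';
  return {
    events: lines.filter((l) => l.length > 0).map((l) => JSON.parse(l) as IdeEventRecord),
    rest,
  };
}
```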

What You Can't Do from an Extension

The extension sandbox has real limitations that affect context management:

No system-level file watching: VS Code's file watcher is workspace-scoped, and heavy directories like node_modules are excluded from watching by default (files.watcherExclude). If a dependency updates, you won't see the change until VS Code's file indexer catches up.

No unrestricted native execution: in untrusted workspaces (Restricted Mode), most extensions don't run at all until the user grants trust, and a native parser can't live inside the extension itself — it must ship as a separate per-platform binary that the extension spawns. This is a reasonable security posture, but it rules out running an optimized Rust parser in-process.

No guaranteed background execution: extensions activate lazily on declared activation events and only live as long as a VS Code window is open. A context engine must observe continuously, so the practical approach is to declare the onStartupFinished activation event and hold long-lived event subscriptions for the life of the window.

Beyond the Extension: The Full Architecture

The ideal AI context management architecture is not a VS Code extension. It's a system that uses the extension as a sensor and a separate companion process as the brain. The extension observes; the companion process thinks.

🔧 Extension + companion. Observation + intelligence.

Context Snipe uses a lightweight VS Code extension for IDE state observation paired with a Rust companion process for context assembly and MCP serving. The extension stays under 5ms per event. The Rust process handles the heavy lifting. Start free — no credit card →