Instrumentation & Integrations
How dexcost auto-instruments LLM providers and HTTP transports, and how to record costs manually or via framework integrations.
Automatic LLM capture
init() (or a direct new CostTracker()) iterates ALL_SUPPORTED_INSTRUMENTS and, for each provider whose package is installed, dynamically imports the corresponding instrument module and monkey-patches the method that executes LLM calls. The patch stores the original method so uninstrument() can restore it exactly. All call sites in your code continue to call the same functions unmodified — the patch is transparent.
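The patch-and-restore pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions, not dexcost's actual instrument code: `FakeCompletions`, `calls`, and the local `instrument` / `uninstrument` functions are hypothetical stand-ins.

```typescript
// Hypothetical stand-in for a provider SDK class; not the real OpenAI client.
class FakeCompletions {
  create(prompt: string): string {
    return `echo:${prompt}`;
  }
}

type CreateFn = (this: FakeCompletions, prompt: string) => string;

const calls: string[] = [];         // stand-in for the cost-event buffer
let original: CreateFn | undefined; // saved so the patch can be undone exactly

function instrument(): void {
  if (original) return; // idempotent: already patched
  const saved: CreateFn = FakeCompletions.prototype.create;
  original = saved;
  FakeCompletions.prototype.create = function (this: FakeCompletions, prompt: string): string {
    const result = saved.call(this, prompt); // delegate to the original method
    calls.push(prompt);                      // record only after the call succeeds
    return result;
  };
}

function uninstrument(): void {
  if (!original) return; // safe to call when nothing is patched
  FakeCompletions.prototype.create = original;
  original = undefined;
}
```

Because the wrapper delegates with the same `this` and arguments and returns the original result, existing call sites observe no difference while the patch is active.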
Provider matrix
| Provider | Package | Patched method |
|---|---|---|
| OpenAI | openai | OpenAI.Chat.Completions.prototype.create (non-streaming + streaming) |
| Anthropic | anthropic | anthropic.messages.create (non-streaming + streaming) |
| Vercel AI | ai | Vercel AI SDK generateText / streamText pipeline hooks |
| Google Gemini | @google/generative-ai | GenerativeModel.generateContent |
| AWS Bedrock | @aws-sdk/client-bedrock-runtime | BedrockRuntimeClient.send — InvokeModel and InvokeModelWithResponseStream commands |
| Cohere | cohere-ai | CohereClient.chat / chatStream |
| MCP | @modelcontextprotocol/sdk | Client.callTool |
Each instrument module self-registers via registerInstrument(name, instrumentFn, uninstrumentFn) at import time, so the registry is populated as a side-effect of loading src/core/tracker.ts.
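A minimal sketch of such a self-registering registry follows. Only the `registerInstrument(name, instrumentFn, uninstrumentFn)` signature comes from the text above; the `Map`-based storage, the `state` object, and the sample registration are assumptions made for illustration.

```typescript
type InstrumentFn = () => void | Promise<void>;
type UninstrumentFn = () => void;

interface InstrumentEntry {
  instrument: InstrumentFn;
  uninstrument: UninstrumentFn;
}

// Module-level registry, filled as instrument modules are loaded.
const registry = new Map<string, InstrumentEntry>();

function registerInstrument(name: string, instrument: InstrumentFn, uninstrument: UninstrumentFn): void {
  registry.set(name, { instrument, uninstrument });
}

// What a hypothetical instrument module might run at import time.
const state = { patched: false };
registerInstrument(
  'openai',
  () => { state.patched = true; },  // would apply the monkey-patch
  () => { state.patched = false; }, // would restore the original method
);
```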
Streaming support
All six LLM providers capture streamed responses. For a streaming call the wrapper returns an AsyncIterable that yields chunks unchanged. Token counts (and therefore cost) are accumulated as usage chunks arrive; the llm_call event — with model, input/output tokens, and latency — is recorded once the stream completes.
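The streaming behaviour can be illustrated with an async generator that passes chunks through and records a single event at the end. This is a sketch, not the library's wrapper: the `Chunk` shape, the `record` callback, and the assumption that each usage chunk carries incremental (not cumulative) counts are all hypothetical.

```typescript
interface Chunk {
  text: string;
  usage?: { inputTokens: number; outputTokens: number }; // hypothetical chunk shape
}

interface LlmCallEvent {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

async function* wrapStream(
  model: string,
  source: AsyncIterable<Chunk>,
  record: (event: LlmCallEvent) => void,
): AsyncGenerator<Chunk> {
  const startedAt = Date.now();
  let inputTokens = 0;
  let outputTokens = 0;
  for await (const chunk of source) {
    if (chunk.usage) {
      // Assumption: usage chunks report increments, so they are summed.
      inputTokens += chunk.usage.inputTokens;
      outputTokens += chunk.usage.outputTokens;
    }
    yield chunk; // pass the chunk through untouched
  }
  // Reached only after the source is exhausted: one event per stream.
  record({ model, inputTokens, outputTokens, latencyMs: Date.now() - startedAt });
}
```

Note that in this sketch a consumer that stops iterating early never triggers the final `record` call; a production wrapper would need to decide how to handle abandoned streams.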
Controlling which providers are instrumented
By default all providers in ALL_SUPPORTED_INSTRUMENTS are instrumented. Pass an explicit list to autoInstrument to limit scope:
import { init } from 'dexcost';
// Instrument only OpenAI and Anthropic
init({ apiKey: 'dx_live_...', autoInstrument: ['openai', 'anthropic'] });
Pass an empty array to disable all auto-instrumentation:
init({ apiKey: 'dx_live_...', autoInstrument: [] });
Valid names are: "openai", "anthropic", "vercel-ai", "gemini", "bedrock", "cohere", "mcp".
Instrument / uninstrument at runtime
CostTracker.instrument(name) and CostTracker.uninstrument(name) let you add or remove a provider after construction:
import { CostTracker } from 'dexcost';
const tracker = new CostTracker({ autoInstrument: [] });
// Add a provider later
await tracker.instrument('openai');
// Restore the original, un-patched method
tracker.uninstrument('openai');
instrument() is async because it dynamically imports the provider package. It is idempotent: calling it twice has no effect. uninstrument() is synchronous and safe to call even if the instrument is not currently active.
HTTP & non-LLM cost capture
When trackHttp is true (the default), CostTracker calls trackHttp(buffer) from the HTTP adapter, which patches two transport layers:
| Layer | What is patched |
|---|---|
| Global fetch | globalThis.fetch is replaced with a wrapper that calls the original and then records a cost event |
| Node.js http / https | http.request, http.get, https.request, https.get are wrapped via CommonJS require to capture calls made by SDKs that bypass the global fetch (e.g., AWS SDK v2) |
Both layers are patched by a single trackHttp() call. untrackHttp() restores every patched entry point: globalThis.fetch and the four http / https functions. The HTTP adapter never crashes your code: any failure in the patch is swallowed and HTTP tracking is silently skipped.
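The "call the original, then record, never break the caller" behaviour can be sketched against a simplified fetch-like function. Everything here is a hypothetical illustration: `FetchLike`, `HttpEvent`, `hostnameOf`, and `wrapFetch` are not dexcost exports, and the real adapter patches globalThis.fetch and the Node.js http / https modules, not a local function.

```typescript
type FetchLike = (url: string) => Promise<{ status: number }>;

interface HttpEvent {
  hostname: string;
  status: number;
}

// Extract the hostname with plain string handling to keep the sketch dependency-free.
function hostnameOf(url: string): string {
  const afterScheme = url.replace(/^[a-z]+:\/\//i, '');
  return afterScheme.split(/[/:?#]/)[0];
}

function wrapFetch(original: FetchLike, record: (event: HttpEvent) => void): FetchLike {
  return async (url) => {
    const response = await original(url); // errors from the real call propagate unchanged
    try {
      record({ hostname: hostnameOf(url), status: response.status });
    } catch {
      // Tracking failures are swallowed so they never reach the caller.
    }
    return response;
  };
}
```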
Service catalog
The bundled service catalog maps external service domains to pricing rules. When an HTTP call's hostname matches a catalog entry, an external_cost event is recorded automatically — no registration needed.
To refresh the catalog from a remote URL at startup:
import { init } from 'dexcost';
init({
  apiKey: 'dx_live_...',
  serviceCatalogUrl: 'https://catalog.example.com/dexcost-catalog.json',
});
CostTracker calls catalog.refreshFromUrl(url) during init. The in-process catalog is replaced with the downloaded data.
Domain rate registry
For services not in the built-in catalog, register a per-request rate with registerDomainRate. User-registered rates take precedence over catalog entries:
import { registerDomainRate } from 'dexcost';
registerDomainRate('api.example.com', 0.01); // $0.01 per request
After registration, every HTTP call whose hostname is api.example.com records an external_cost event with costUsd: 0.01. The optional third argument sets the per label, that is, the unit the rate is charged per (default "request").
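The precedence rule (user-registered rate first, catalog second) amounts to a two-level lookup. The sketch below is a local stand-in, not the dexcost implementation: the `Rate` shape, the catalog entry and its price, and the lower-casing of hostnames are assumptions.

```typescript
interface Rate {
  costUsd: number;
  per: string;
}

// Hypothetical catalog entry with a made-up price, for illustration only.
const catalog = new Map<string, Rate>([
  ['api.example.com', { costUsd: 0.02, per: 'request' }],
]);
const userRates = new Map<string, Rate>();

// Local stand-in mirroring the documented call shape; not the dexcost export.
function registerDomainRate(hostname: string, costUsd: number, per = 'request'): void {
  userRates.set(hostname.toLowerCase(), { costUsd, per });
}

function resolveRate(hostname: string): Rate | undefined {
  const key = hostname.toLowerCase();
  return userRates.get(key) ?? catalog.get(key); // a user rate wins over the catalog
}
```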
Cost attribution
The HTTP adapter attributes costs to the currently active task (getCurrentTask()). When no task is active, the adapter creates a lightweight auto-task so costs are never silently lost.
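One way to picture "current task, else auto-task" is a lookup in Node's AsyncLocalStorage with a fallback. This is a sketch under assumptions: the `TaskContext` shape, the ID scheme, and `runInTask` / `attributeCost` are hypothetical, and the local `getCurrentTask` only mimics the name mentioned above.

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

interface TaskContext {
  taskId: string;
  parentTaskId: string | undefined;
  costs: number[];
}

const storage = new AsyncLocalStorage<TaskContext>();
let nextId = 1;

function getCurrentTask(): TaskContext | undefined {
  return storage.getStore();
}

// Run fn inside a new task whose parent is whichever task is active right now.
function runInTask<T>(fn: (task: TaskContext) => T): T {
  const task: TaskContext = {
    taskId: `task-${nextId++}`,
    parentTaskId: getCurrentTask()?.taskId,
    costs: [],
  };
  return storage.run(task, () => fn(task));
}

// Attribute a cost to the active task, or to a fresh auto-task if none is active.
function attributeCost(costUsd: number): TaskContext {
  const task: TaskContext =
    getCurrentTask() ?? { taskId: `auto-${nextId++}`, parentTaskId: undefined, costs: [] };
  task.costs.push(costUsd);
  return task;
}
```

The same context lookup is what lets a nested task pick up its parent's ID without the ID being passed explicitly.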
Manual recording
Use TrackedTask methods when auto-instrumentation does not cover your cost source or you need fine-grained control.
Tasks
Open a task with track() (also available as tracker.track() on a CostTracker instance). All costs recorded inside the callback are grouped under that task:
import { init, track } from 'dexcost';
init({ apiKey: 'dx_live_...' });
await track({ taskType: 'generate_report', customerId: 'acme-corp' }, async (task) => {
  // Auto-captured LLM calls land here automatically
  task.recordCost('pdf_renderer', 0.002);
  task.recordCost('cloud_storage', 0.0001, { bucket: 'reports' }, 'compute_cost');
});
When the callback resolves, the task status is set to "success" and all events are flushed to the local SQLite buffer. If the callback throws, the task is marked "failed" but events are still persisted.
Nesting is supported via AsyncLocalStorage. Any track() call inside an active task inherits the outer task's ID as parentTaskId:
await track({ taskType: 'pipeline' }, async () => {
  await track({ taskType: 'step_one' }, async (inner) => {
    // inner.task.parentTaskId === outer task's taskId
  });
});
Manual start/end via startTask() is available for architectures where callbacks do not fit (e.g., event-driven workers):
import { CostTracker } from 'dexcost';
const tracker = new CostTracker();
const task = tracker.startTask({ taskType: 'worker_job', customerId: 'acme' });
try {
  task.recordCost('queue', 0.0005);
  task.end('success');
} catch (err) {
  task.end('failed');
  throw err;
}
The caller must call task.end() exactly once. Calling it more than once throws an Error.
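The exactly-once contract is easiest to honour when end() has a single call site. The sketch below shows one possible arrangement with try/finally; `ManualTask` and `runJob` are hypothetical helpers that only imitate the documented behaviour of throwing on a second end().

```typescript
type TaskStatus = 'success' | 'failed';

// Hypothetical stand-in that imitates the documented end-once rule.
class ManualTask {
  private status: TaskStatus | undefined;

  get finalStatus(): TaskStatus | undefined {
    return this.status;
  }

  end(status: TaskStatus): void {
    if (this.status !== undefined) {
      throw new Error(`task already ended with status "${this.status}"`);
    }
    this.status = status;
  }
}

// Run a job so that end() is reached exactly once on every path.
function runJob(task: ManualTask, job: () => void): void {
  let status: TaskStatus = 'failed';
  try {
    job();
    status = 'success';
  } finally {
    task.end(status); // the only call site; errors from job() still propagate
  }
}
```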
recordCost
Record any non-LLM cost as a dollar amount:
task.recordCost(
  'google_maps_api',
  0.005,
  { endpoint: '/geocode', region: 'us-east-1' }, // details
  'external_cost', // eventType: "external_cost" | "compute_cost"
  'exact', // costConfidence
  'manual', // pricingSource
);
eventType must be "external_cost" or "compute_cost"; any other value throws an Error.
recordUsage
Compute cost from the rate registry. Register the rate once, then call recordUsage anywhere:
tracker.registerRate('maps.googleapis.com', 'request', 0.005);
await tracker.track({ taskType: 'route_calculation' }, async (task) => {
  task.recordUsage('maps.googleapis.com', 3); // records 3 × $0.005 = $0.015
});
recordUsage throws if no rate is registered for the given service name.
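The arithmetic behind recordUsage is a rate lookup followed by a multiplication. The sketch below is a local stand-in under assumptions: keying rates by service and unit, and defaulting the unit to "request", are guesses made for illustration.

```typescript
// Rates keyed by "service|unit", value in USD per unit (assumed storage layout).
const rates = new Map<string, number>();

function registerRate(service: string, unit: string, costPerUnit: number): void {
  rates.set(`${service}|${unit}`, costPerUnit);
}

function usageCost(service: string, quantity: number, unit = 'request'): number {
  const rate = rates.get(`${service}|${unit}`);
  if (rate === undefined) {
    throw new Error(`no rate registered for ${service} (${unit})`);
  }
  return rate * quantity; // cost = registered rate × quantity
}
```

Because the result is a floating-point product, comparisons in tests are better done with a small tolerance than with strict equality.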
recordLlmCall
Manually record an LLM call — useful for providers not in the auto-instrument list, local models, or testing:
task.recordLlmCall(
  'openai', // provider
  'gpt-4o', // model
  800, // inputTokens
  150, // outputTokens
  undefined, // cost: left undefined so it is auto-computed via PricingEngine
  200, // cachedTokens
  420, // latencyMs
  { errorType: 'rate_limit' }, // options
);
When cost is undefined, it is auto-computed by the PricingEngine from the bundled LiteLLM pricing data. Pass errorType in options to mark the event for retry heuristic detection; it is stored in details.error_type.
Retry tracking
Flag a retry explicitly with markRetry. This creates a retry_marker event and increments the task's retryCount and retryCostUsd aggregates:
try {
  const response = await openai.chat.completions.create({ ... });
} catch (err) {
  task.markRetry('rate_limit', 0.001); // reason, optional cost
  throw err;
}
markNotRetry reverses a false positive. When called without arguments it un-flags the most recent retry event; pass an eventId string to target a specific event:
task.markNotRetry(); // un-flag the most recent retry event
task.markNotRetry(event.eventId); // un-flag a specific event
Heuristic retry detection is opt-in via enableRetryHeuristics: true. When enabled, recordLlmCall checks recent events in the same task and automatically sets isRetry: true if a prior call for the same model failed with a transient error (rate_limit, timeout, 5xx, server_error, connection_error) within the configured sliding window.
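The heuristic described above reduces to a predicate over recent events. The following is a sketch, not the library's detector: the `LlmEvent` shape, the `looksLikeRetry` name, and the window length used in any call are assumptions; only the list of transient error types is taken from the text.

```typescript
interface LlmEvent {
  model: string;
  timestampMs: number;
  errorType?: string;
}

// Transient error types as listed in the documentation above.
const TRANSIENT_ERRORS = new Set(['rate_limit', 'timeout', '5xx', 'server_error', 'connection_error']);

function looksLikeRetry(recent: LlmEvent[], model: string, nowMs: number, windowMs: number): boolean {
  return recent.some(
    (event) =>
      event.model === model &&                 // same model as the new call
      event.errorType !== undefined &&
      TRANSIENT_ERRORS.has(event.errorType) && // prior call failed transiently
      nowMs - event.timestampMs <= windowMs,   // and it is inside the sliding window
  );
}
```

A call that follows a non-transient failure, a failure on a different model, or a failure older than the window would not be flagged under this rule.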
Framework integrations
Express middleware
createExpressMiddleware wraps each incoming request in a tracked task and attaches the TrackedTask instance to req.dexcostTask so downstream route handlers can call recording methods directly:
import express from 'express';
import { CostTracker, createExpressMiddleware } from 'dexcost';
const app = express();
const tracker = new CostTracker({ apiKey: 'dx_live_...' });
app.use(createExpressMiddleware(tracker, {
  customerIdFrom: 'user.orgId', // dot-path into req
  projectIdFrom: 'headers.x-project-id',
  skip: (req) => (req as { path: string }).path === '/health',
}));
app.post('/chat', (req, res) => {
  // Cast through unknown: Express's Request type does not declare dexcostTask
  const task = (req as unknown as { dexcostTask: import('dexcost').TrackedTask }).dexcostTask;
  task.recordCost('vector_db', 0.001);
  res.json({ ok: true });
});
The task type defaults to "METHOD /path". The task is ended automatically when the response emits finish: status is "success" for HTTP status codes below 400 and "failed" for 400 and above.
ExpressMiddlewareOptions fields:
| Field | Type | Description |
|---|---|---|
| customerIdFrom | string | Dot-path into req to extract a customer ID (e.g. "user.orgId"). |
| projectIdFrom | string | Dot-path into req to extract a project ID. |
| taskType | (req) => string | Custom function to derive the task type. Defaults to "METHOD /path". |
| skip | (req) => boolean | Return true to skip tracking for a request (e.g. health checks). |
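The option semantics above (dot-path extraction, the default task type, and the status mapping) can be sketched as three small pure functions. These helpers are hypothetical and written only to illustrate the described behaviour; in particular, returning undefined for missing or non-string values is an assumption.

```typescript
// Walk a dot-separated path such as "user.orgId" through a nested object.
function getByPath(source: unknown, path: string): string | undefined {
  let current: unknown = source;
  for (const key of path.split('.')) {
    if (typeof current !== 'object' || current === null) return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return typeof current === 'string' ? current : undefined;
}

// Default task type in the documented "METHOD /path" form.
function defaultTaskType(req: { method: string; path: string }): string {
  return `${req.method} ${req.path}`;
}

// Map the final HTTP status code to a task status.
function statusFor(httpStatus: number): 'success' | 'failed' {
  return httpStatus < 400 ? 'success' : 'failed';
}
```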
LangChain
DexcostCallbackHandler is a duck-typed LangChain callback handler — it matches BaseCallbackHandler's interface without inheriting from it, so @langchain/core is not a required dependency:
import { CostTracker, track, DexcostCallbackHandler } from 'dexcost';
import { ChatOpenAI } from '@langchain/openai';
const tracker = new CostTracker({ apiKey: 'dx_live_...' });
const handler = new DexcostCallbackHandler(tracker);
const llm = new ChatOpenAI({ model: 'gpt-4o', callbacks: [handler] });
await track({ taskType: 'lc_chain', customerId: 'acme-corp' }, async () => {
  await llm.invoke('Summarise this document');
});
The handler implements three lifecycle methods:
| Method | When called | What it records |
|---|---|---|
handleLLMStart(serialized, prompts, runId) | LLM invocation starts | Stores start time; extracts model from serialized.kwargs.model_name or the last element of serialized.id. |
| handleLLMEnd(output, runId) | LLM invocation completes | Records llm_call event with tokens, computed cost, and latency. Token counts come from output.llmOutput.tokenUsage.{promptTokens, completionTokens}. |
handleLLMError(error, runId) | LLM invocation fails | Records llm_call event with costUsd: 0, costConfidence: "unknown", and error_type in details. |
The handler requires an active task context when handleLLMEnd fires. If no task is active, the event is silently skipped.
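A duck-typed handler needs nothing more than correctly named methods and a map from runId to start state. The class below is an illustrative sketch, not DexcostCallbackHandler itself: the `sink` callback, the injectable clock, the `"unknown"` fallback model, and the choice to ignore an end without a matching start are assumptions. The field paths follow the table above.

```typescript
interface RecordedCall {
  model: string;
  inputTokens: number;
  outputTokens: number;
  latencyMs: number;
}

interface Serialized {
  id: string[];
  kwargs?: { model_name?: string };
}

interface LlmResult {
  llmOutput?: { tokenUsage?: { promptTokens?: number; completionTokens?: number } };
}

// Duck-typed: no base class, only the method names LangChain looks for.
class SketchCallbackHandler {
  readonly name = 'sketch_callback_handler';
  private readonly runs = new Map<string, { model: string; startedAt: number }>();
  private readonly sink: (call: RecordedCall) => void;
  private readonly now: () => number;

  constructor(sink: (call: RecordedCall) => void, now: () => number = Date.now) {
    this.sink = sink;
    this.now = now;
  }

  handleLLMStart(serialized: Serialized, _prompts: string[], runId: string): void {
    const lastId = serialized.id[serialized.id.length - 1];
    const model = serialized.kwargs?.model_name ?? lastId ?? 'unknown';
    this.runs.set(runId, { model, startedAt: this.now() });
  }

  handleLLMEnd(output: LlmResult, runId: string): void {
    const run = this.runs.get(runId);
    if (!run) return; // assumption: an end without a matching start is ignored
    this.runs.delete(runId);
    const usage = output.llmOutput?.tokenUsage;
    this.sink({
      model: run.model,
      inputTokens: usage?.promptTokens ?? 0,
      outputTokens: usage?.completionTokens ?? 0,
      latencyMs: this.now() - run.startedAt,
    });
  }
}
```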
Playwright browser adapter
trackBrowser wraps a block of Playwright work, measures wall-clock time, and records a compute_cost event proportional to session duration:
import { track, trackBrowser } from 'dexcost';
import { chromium } from 'playwright';
const browser = await chromium.launch();
const page = await browser.newPage();
await track({ taskType: 'scrape', customerId: 'acme' }, async () => {
  await trackBrowser(page, async () => {
    await page.goto('https://example.com');
    await page.waitForSelector('h1');
  }, { ratePerMinute: 0.01 });
});
await browser.close();
trackBrowser accepts any object with a .url property; it does not import playwright directly, so no hard dependency is introduced.
TrackBrowserOptions:
| Field | Type | Default | Description |
|---|---|---|---|
| ratePerMinute | number | 0.01 | Cost in USD per minute of browser session time. |
The compute_cost event is always recorded, even if the callback throws. The event details include wall_clock_seconds, rate_per_minute, and page_url at the time of recording.
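Recording even when the callback throws is what a try/finally block gives you. The sketch below illustrates that shape under assumptions: `trackBrowserSketch`, the `record` callback, and the injectable clock are hypothetical, the page is assumed to expose url() as a method the way Playwright's Page does, and the cost formula (minutes × ratePerMinute) is inferred from "proportional to session duration".

```typescript
interface PageLike {
  url(): string; // assumed to match Playwright's Page.url()
}

interface BrowserCostEvent {
  costUsd: number;
  wall_clock_seconds: number;
  rate_per_minute: number;
  page_url: string;
}

async function trackBrowserSketch<T>(
  page: PageLike,
  work: () => Promise<T>,
  record: (event: BrowserCostEvent) => void,
  ratePerMinute = 0.01,
  now: () => number = Date.now,
): Promise<T> {
  const startedAt = now();
  try {
    return await work();
  } finally {
    // Runs on success and on failure, so the session time is always billed.
    const seconds = (now() - startedAt) / 1000;
    record({
      costUsd: (seconds / 60) * ratePerMinute,
      wall_clock_seconds: seconds,
      rate_per_minute: ratePerMinute,
      page_url: page.url(), // URL at the time of recording, not at start
    });
  }
}
```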