dexcost

Multi-tenant SaaS

Attribute AI costs to individual customers using per-request context.

In a multi-tenant application, every LLM call and service cost must be attributed to the customer that triggered it. dexcost uses per-request context — set once in middleware and inherited by all downstream operations in that request.

How it works

  1. A request arrives carrying a customer identifier (JWT claim, header, session).
  2. Middleware calls set_context / setContext with that customer's ID.
  3. All track(), record_cost(), and auto-instrumented LLM calls in that request inherit the context automatically.
  4. No customer ID needs to be threaded through your business logic.

The context is scoped to the current async task / thread — concurrent requests never leak context to each other.

Framework examples

FastAPI

import dexcost
from fastapi import FastAPI, Request

dexcost.init(api_key="dx_live_...")

app = FastAPI()

@app.middleware("http")
async def cost_context(request: Request, call_next):
    # Extract customer from JWT or session
    customer_id = request.state.user.customer_id  # your auth layer
    dexcost.set_context(
        customer_id=customer_id,
        project_id="api-v2",
    )
    try:
        return await call_next(request)
    finally:
        # Clear the context even if the handler raises
        dexcost.clear_context()

Django

import dexcost

class CostContextMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response

    def __call__(self, request):
        if hasattr(request, "user") and request.user.is_authenticated:
            dexcost.set_context(
                customer_id=str(request.user.customer_id),
                project_id="django-app",
            )
        try:
            return self.get_response(request)
        finally:
            # Clear the context even if the view raises
            dexcost.clear_context()

Add to MIDDLEWARE in settings.py:

MIDDLEWARE = [
    ...
    "myapp.middleware.CostContextMiddleware",
]

Express (TypeScript)

import express from "express";
import { init, setContext, clearContext } from "dexcost";

init({ apiKey: "dx_live_..." });

const app = express();

app.use((req, res, next) => {
  const customerId = (req as any).user?.customerId;
  setContext({ customerId, projectId: "express-api" });
  // "close" also fires when the client disconnects before the response finishes
  res.on("close", () => clearContext());
  next();
});

Gin (Go)

func DexcostMiddleware() gin.HandlerFunc {
    return func(c *gin.Context) {
        customerID := c.GetString("customer_id") // from your auth middleware
        ctx := dexcost.SetContext(c.Request.Context(), customerID, "gin-api")
        c.Request = c.Request.WithContext(ctx)
        c.Next()
    }
}

// Register:
// r := gin.Default()
// r.Use(DexcostMiddleware())

Axum (Rust)

pub async fn cost_middleware(mut req: Request, next: Next) -> Response {
    let customer_id = req
        .extensions()
        .get::<AuthUser>()
        .map(|u| u.customer_id.clone());

    let ctx = set_context(DexcostContext {
        customer_id,
        project_id: Some("axum-api".into()),
        ..Default::default()
    }).await;

    req.extensions_mut().insert(ctx);
    next.run(req).await
}

Grouping costs into tasks

Wrap each user-triggered operation in a task so the dashboard shows per-operation cost breakdowns, not just raw events:

@app.post("/tickets/{ticket_id}/resolve")
async def resolve_ticket(ticket_id: str, request: Request):
    # context already set by middleware
    with dexcost.task("resolve_ticket") as t:
        context = await fetch_context(ticket_id)       # auto-task event
        t.record_cost("pinecone", 0.004)               # explicit event
        response = await openai.chat.completions.create(
            model="gpt-4o",
            messages=build_messages(context),
        )                                              # auto-captured event
    return {"reply": response.choices[0].message.content}

Async safety

Context is backed by contextvars in Python, AsyncLocalStorage in Node.js, and context.Context propagation in Go. This means:

  • Each request handler runs in its own context copy.
  • Background tasks and spawned coroutines inherit the context at spawn time.
  • Two concurrent requests for different customers never share or overwrite each other's context.

import asyncio
import dexcost

async def handle_customer_a():
    dexcost.set_context(customer_id="customer-a")
    await asyncio.sleep(0.1)  # yield to event loop
    ctx = dexcost.get_context()
    assert ctx.customer_id == "customer-a"  # always true

async def handle_customer_b():
    dexcost.set_context(customer_id="customer-b")
    await asyncio.sleep(0.1)
    ctx = dexcost.get_context()
    assert ctx.customer_id == "customer-b"  # never "customer-a"

async def main():
    # Both run concurrently with no cross-contamination
    await asyncio.gather(handle_customer_a(), handle_customer_b())

asyncio.run(main())
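
Inheritance at spawn time can be seen with plain contextvars, independent of dexcost. In this sketch the `customer` variable and both function names are hypothetical. A task created with `asyncio.create_task` receives a copy of the context as it was when the task was created, so later changes in the parent are not visible to the child, and changes in the child do not reach the parent.

```python
import asyncio
import contextvars

# Hypothetical context variable standing in for the per-request customer.
customer = contextvars.ContextVar("customer_id", default=None)


async def background_job(seen):
    seen.append(customer.get())       # value inherited when the task was created
    customer.set("changed-in-child")  # affects only this task's copy


async def handler():
    seen = []
    customer.set("customer-a")
    job = asyncio.create_task(background_job(seen))  # context is copied here
    customer.set("customer-b")  # set after the spawn, so the child does not see it
    await job
    return seen[0], customer.get()


if __name__ == "__main__":
    print(asyncio.run(handler()))  # ('customer-a', 'customer-b')
```

One practical consequence: set the context before spawning background work, since a task spawned earlier keeps the values it was created with.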

Viewing per-customer costs

Once customer IDs are flowing through context:

  • Dashboard → Customers shows cost, tokens, and retry cost per customer.
  • Dashboard → Profitability shows margin per customer, ranked by spend, once you've configured revenue.

See the Dashboard guide and API Reference for how to query this data programmatically.
