Compute & GPU
Capture serverless and GPU compute costs on a tracked task, plus a standalone Lambda cost estimator.
Beyond LLM and HTTP costs, the Go SDK can capture the compute your agent burns — serverless invocations and GPU seconds — and attribute it to a tracked task. These costs land as compute_cost, gpu_cost, and gpu_utilization_signal events, with amounts in exact-precision decimal.Decimal.
How it works
Compute and GPU capture is opt-in and context-scoped. Each Wrap*Handler function takes a handler func(context.Context, T) (R, error) and returns a wrapped handler of the same shape. At call time it reads the task from the context (GetCurrentTask(ctx)): if a task is present it times the handler, attaches an accountant, and persists the event with a deferred (cost_pending) cost that is back-filled when the task ends; if no task is in the context, the wrapper is a transparent pass-through and records nothing. The event is recorded on both the success and error paths.
So the pattern is always: start a task (which returns a task-carrying context), then invoke the wrapped handler with that context.
Serverless compute
Five wrappers cover the serverless platforms, each emitting one compute_cost event per invocation:
| Platform | Wrapper |
|---|---|
| AWS Lambda | dexcost.WrapLambdaHandler |
| Google Cloud Run | dexcost.WrapCloudRunHandler |
| Google Cloud Functions | dexcost.WrapCloudFunctionsHandler |
| Azure Functions | dexcost.WrapAzureFunctionsHandler |
| Vercel | dexcost.WrapVercelHandler |
```go
import (
	"context"

	dexcost "github.com/DexwoxBusiness/dexcost-sdk/go"
)

// Your business logic: a handler of the shape the wrapper expects.
func handler(ctx context.Context, event MyEvent) (MyResult, error) {
	// LLM/API costs recorded on the task in ctx are grouped with the compute cost
	return MyResult{OK: true}, nil
}

func main() {
	dexcost.Init(dexcost.Config{APIKey: "dx_live_..."})
	defer dexcost.Close()

	wrapped := dexcost.WrapLambdaHandler(handler)

	ctx, task := dexcost.StartTask(context.Background(), "process_order",
		dexcost.WithCustomer("acme-corp"))
	defer task.End(dexcost.StatusSuccess)

	// Pass the task-carrying ctx so the compute_cost event is attributed to the task.
	result, err := wrapped(ctx, MyEvent{ /* ... */ })
	_ = result
	_ = err
}
```

The Lambda wrapper reads the Lambda environment (memory size, region, init type) to enrich the event; the pricing engine recognizes the other runtimes and back-fills cost at task finalize.
GPU
Three wrappers capture GPU usage for serverless GPU platforms, each emitting one gpu_cost event plus N gpu_utilization_signal events:
| Platform | Wrapper |
|---|---|
| Modal | dexcost.WrapModalGPUHandler |
| RunPod | dexcost.WrapRunpodGPUHandler |
| Replicate | dexcost.WrapReplicateGPUHandler |
```go
wrapped := dexcost.WrapModalGPUHandler(gpuHandler)

ctx, task := dexcost.StartTask(context.Background(), "image_generation",
	dexcost.WithCustomer("acme-corp"))
defer task.End(dexcost.StatusSuccess)

result, err := wrapped(ctx, payload)
```

gpu_cost carries the priced GPU-seconds; gpu_utilization_signal is observability only (SM/memory utilization, VRAM) and always has a cost of 0, so it is never summed into totals.
Enabling GPU measurement (NVMLBackend)
The Go SDK does not read GPU hardware out of the box. To stay dependency-free and CGO-free, it talks to NVML through a pluggable interface, core.NVMLBackend, whose default implementation is a no-op that reports "no GPU." With the default backend the GPU wrappers run but record nothing.
To get real measurements you register a backend that implements core.NVMLBackend:
```go
import "github.com/DexwoxBusiness/dexcost-sdk/go/core"

core.SetNVMLBackend(myNVMLBackend) // your implementation of the NVMLBackend interface
```

Your backend supplies device count, product name, per-process utilization, and memory info (e.g. by wrapping a Go NVML binding or parsing nvidia-smi). This design keeps the SDK building and testing on GPU-less hosts with no NVIDIA dependency, and is the main way the Go GPU story differs from the Python and Rust SDKs.
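For the nvidia-smi route, the data-gathering half of a backend might look like the sketch below. It only parses the CSV that `nvidia-smi --query-gpu=... --format=csv,noheader,nounits` prints; `gpuSample` and `parseNvidiaSMI` are hypothetical names, the values are device-level rather than per-process, and how they map onto the methods of `core.NVMLBackend` depends on that interface's actual signatures, which this sketch does not assume.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// gpuSample is a hypothetical struct for this sketch, not an SDK type.
type gpuSample struct {
	Name        string
	UtilPercent int
	MemUsedMiB  int
	MemTotalMiB int
}

// parseNvidiaSMI parses the output of:
//
//	nvidia-smi --query-gpu=name,utilization.gpu,memory.used,memory.total --format=csv,noheader,nounits
//
// Each output line describes one device. Empty output yields zero samples,
// which a backend could report as "no GPU".
func parseNvidiaSMI(out string) ([]gpuSample, error) {
	var samples []gpuSample
	for _, line := range strings.Split(strings.TrimSpace(out), "\n") {
		if strings.TrimSpace(line) == "" {
			continue
		}
		fields := strings.Split(line, ",")
		if len(fields) != 4 {
			return nil, fmt.Errorf("unexpected field count %d in %q", len(fields), line)
		}
		nums := make([]int, 3)
		for i, f := range fields[1:] {
			n, err := strconv.Atoi(strings.TrimSpace(f))
			if err != nil {
				return nil, fmt.Errorf("parsing %q: %w", f, err)
			}
			nums[i] = n
		}
		samples = append(samples, gpuSample{
			Name:        strings.TrimSpace(fields[0]),
			UtilPercent: nums[0],
			MemUsedMiB:  nums[1],
			MemTotalMiB: nums[2],
		})
	}
	return samples, nil
}

func main() {
	// Sample text in the format nvidia-smi prints for the query above.
	samples, err := parseNvidiaSMI("NVIDIA A100-SXM4-40GB, 37, 1024, 40960\n")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%d device(s), first: %+v\n", len(samples), samples[0])
}
```

Keeping the parser a pure function of the command output means it can be tested on hosts without a GPU, in the same spirit as the pluggable backend itself.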
Standalone Lambda cost estimator
To estimate an AWS Lambda invocation's cost without running a wrapped handler, adapters.LambdaCost is a pure function:
```go
import "github.com/DexwoxBusiness/dexcost-sdk/go/adapters"

res, err := adapters.LambdaCost(1500, 512, "us-east-1") // durationMs, memoryMb, region
// res.CostUSD (decimal.Decimal), res.Details.GBSeconds, res.Details.DurationCostUSD, ...

regions, _ := adapters.GetSupportedLambdaRegions()
```

It returns an error for an unknown region, a negative durationMs, or a non-positive memoryMb. It uses its own bundled Lambda rate table and is independent of the WrapLambdaHandler capture path.
Browser sessions
For agents that drive a headless browser, adapters.StartBrowserSession / adapters.TrackBrowser record session wall-clock time as a compute_cost event (default rate $0.01/minute). They are documented in full under Instrumentation → Browser adapter.
Configuration & limitations
- Opt-in and context-scoped. Nothing here runs unless you wrap a handler and call it with a task-carrying context.
- `Config.ComputeBillingOverrides` is a `map[string]string`; the pricing engine recognizes `{"cloud_run": "instance"}` to switch Cloud Run to instance-based pricing.
- `Config.K8sNodeAware` is reserved for Kubernetes node-aware accounting and currently has no effect.
- GPU is no-op by default. Register a `core.NVMLBackend` to capture real GPU data; there is no built-in NVIDIA dependency.
- Always-on runtime auto-capture (EC2, Fargate, GCE, Kubernetes pods) is not wired; the serverless and GPU wraps plus the browser adapter are the capture paths available today.
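As a configuration fragment, the Cloud Run override might be set at initialization as sketched below. The field name and the `{"cloud_run": "instance"}` value come from the list above; combining it with `APIKey` in a single `Init` call is an assumption based on the earlier example.

```go
package main

import (
	dexcost "github.com/DexwoxBusiness/dexcost-sdk/go"
)

func main() {
	dexcost.Init(dexcost.Config{
		APIKey: "dx_live_...",
		// Switch Cloud Run to instance-based pricing.
		ComputeBillingOverrides: map[string]string{"cloud_run": "instance"},
	})
	defer dexcost.Close()
}
```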
Next steps
- Configuration: `Config` fields including `ComputeBillingOverrides` and `K8sNodeAware`.
- API Reference: event types and the full public API.