dexcost Documentation

Capture serverless (AWS Lambda) and GPU compute costs on a tracked task, plus a standalone Lambda cost estimator.

Beyond LLM and HTTP costs, the Rust SDK can capture the compute your agent burns — serverless invocations and GPU seconds — and attribute it to a tracked task. These costs land as compute_cost, gpu_cost, and gpu_utilization_signal events, with amounts in exact-precision Decimal.

How it works

Compute and GPU capture is opt-in and task-scoped. Each wrapper takes a &mut TrackedTask directly (there is no global wrap) and does three things: it times your async handler, attaches an accountant to the task, and persists the event with a deferred (cost_pending) cost that is back-filled when the task ends. The handler's Result is returned unchanged. The event is recorded on both the Ok and Err paths, because the compute was consumed either way.
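The wrapper behavior described above can be illustrated with a simplified, synchronous sketch. Everything here (MockTask, ComputeEvent, wrap_timed) is a hypothetical stand-in written for this page, not part of the dexcost API; the real wrappers are async and attach an accountant to a TrackedTask.

```rust
use std::time::{Duration, Instant};

/// Hypothetical stand-in for a recorded event; not dexcost's real event type.
#[derive(Debug)]
struct ComputeEvent {
    duration: Duration,
    handler_failed: bool,
    cost_pending: bool, // the amount would be back-filled when the task ends
}

/// Hypothetical stand-in for `TrackedTask`.
#[derive(Default)]
struct MockTask {
    events: Vec<ComputeEvent>,
}

/// Times `handler`, records one event whether it succeeds or fails,
/// and hands the handler's `Result` back untouched.
fn wrap_timed<T, E>(
    task: &mut MockTask,
    handler: impl FnOnce() -> Result<T, E>,
) -> Result<T, E> {
    let started = Instant::now();
    let result = handler();
    task.events.push(ComputeEvent {
        duration: started.elapsed(),
        handler_failed: result.is_err(),
        cost_pending: true,
    });
    result
}

fn main() {
    let mut task = MockTask::default();
    let ok: Result<u32, String> = wrap_timed(&mut task, || Ok(42));
    let err: Result<u32, String> = wrap_timed(&mut task, || Err("boom".to_string()));
    println!("{ok:?} {err:?}");
    for e in &task.events {
        println!("{:?} failed={} pending={}", e.duration, e.handler_failed, e.cost_pending);
    }
}
```

Recording after the handler returns, rather than only inside the success branch, is what keeps failed invocations visible in the task's cost.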

Serverless compute (AWS Lambda)

wrap_lambda_handler wraps an async Lambda-style handler and emits one compute_cost event per invocation:

use dexcost::{start_task, TaskOptions, TaskStatus};
use dexcost::adapters::compute_wrap::wrap_lambda_handler;

let mut task = start_task("process_order", TaskOptions {
    customer_id: Some("acme-corp".into()),
    ..Default::default()
}).await?;

let result: Result<MyOutput, MyError> = wrap_lambda_handler(
    &mut task,
    event,                       // your Lambda event payload (T)
    serde_json::json!({}),       // request context as JSON
    |event, ctx| async move {
        // your handler work; LLM/API costs recorded on `task` are grouped under the same task
        Ok(my_output)
    },
).await;

task.end(TaskStatus::Success).await?;

The accountant reads the Lambda environment (memory size, region, init type) to enrich the event. AWS Lambda is the only serverless runtime with a wrap handler in this release; the pricing engine recognizes other runtimes (Cloud Run, Cloud Functions, Azure Functions) but no wrap is shipped for them yet.
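For orientation, the sketch below shows one way such environment enrichment could be read. It assumes the standard variables that the AWS Lambda runtime sets (AWS_LAMBDA_FUNCTION_MEMORY_SIZE, AWS_REGION, AWS_LAMBDA_INITIALIZATION_TYPE). Whether the dexcost accountant reads exactly these variables is an assumption, and LambdaEnv and read_lambda_env are hypothetical names.

```rust
use std::env;

/// Hypothetical summary of the Lambda runtime; not a dexcost type.
#[derive(Debug, PartialEq)]
struct LambdaEnv {
    memory_mb: Option<u32>,
    region: Option<String>,
    init_type: Option<String>,
}

/// Builds a `LambdaEnv` from any key -> value lookup, so the logic can be
/// exercised without touching the real process environment.
fn read_lambda_env(lookup: impl Fn(&str) -> Option<String>) -> LambdaEnv {
    LambdaEnv {
        // A missing or non-numeric value becomes `None` instead of an error.
        memory_mb: lookup("AWS_LAMBDA_FUNCTION_MEMORY_SIZE").and_then(|v| v.parse().ok()),
        region: lookup("AWS_REGION"),
        init_type: lookup("AWS_LAMBDA_INITIALIZATION_TYPE"),
    }
}

fn main() {
    // Outside Lambda these variables are normally unset, so every field is `None`.
    let info = read_lambda_env(|key| env::var(key).ok());
    println!("{info:?}");
}
```

Treating every field as optional mirrors the idea of enrichment: a missing variable leaves the event less detailed rather than failing the invocation.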

GPU

Three wrappers capture GPU usage for serverless GPU platforms, each emitting one gpu_cost event plus N gpu_utilization_signal events:

| Platform | Wrapper (`dexcost::adapters::gpu_wrap`) |
| --- | --- |
| Modal | `wrap_modal_handler` |
| RunPod | `wrap_runpod_handler` |
| Replicate | `wrap_replicate_handler` |

use dexcost::adapters::gpu_wrap::wrap_modal_handler;

let result = wrap_modal_handler(&mut task, payload, |payload| async move {
    // your GPU workload
    Ok(output)
}).await;

gpu_cost carries the priced GPU-seconds; gpu_utilization_signal is observability only (SM/memory utilization, VRAM) and always has a cost of 0, so it is never summed into totals.
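To make the totals rule concrete, here is a minimal sketch of summing a mixed event list. The Event enum and its fields are hypothetical, and amounts are integer micro-USD only to keep the example dependency-free; dexcost itself reports amounts as Decimal.

```rust
/// Hypothetical event model for illustration; not dexcost's event types.
#[derive(Debug)]
enum Event {
    GpuCost { cost_micro_usd: u64 },
    GpuUtilizationSignal { sm_util_pct: u8 },
}

impl Event {
    /// Signals are observability-only, so they always price at zero.
    fn cost_micro_usd(&self) -> u64 {
        match self {
            Event::GpuCost { cost_micro_usd } => *cost_micro_usd,
            Event::GpuUtilizationSignal { .. } => 0,
        }
    }
}

/// Sums the priced amount of every event; signals contribute nothing.
fn total_micro_usd(events: &[Event]) -> u64 {
    events.iter().map(Event::cost_micro_usd).sum()
}

fn main() {
    let events = vec![
        Event::GpuCost { cost_micro_usd: 1_250 },
        Event::GpuUtilizationSignal { sm_util_pct: 87 },
        Event::GpuUtilizationSignal { sm_util_pct: 91 },
    ];
    println!("events: {events:?}");
    println!("total (micro-USD): {}", total_micro_usd(&events));
}
```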

Enabling real GPU measurement

The GPU wrappers compile and run on the default build, but actual NVML device readout requires the gpu Cargo feature, which pulls in the nvml-wrapper crate:

[dependencies]
dexcost = { version = "0.3", features = ["gpu"] }

Without the gpu feature, NVML reads are stubbed and the wrappers record no GPU measurement. At runtime, real capture additionally needs the NVIDIA driver and at least one NVIDIA GPU; on a host without them, nothing is recorded (the wrappers do not error). GPU capture is NVIDIA-only.

Standalone Lambda cost estimator

To estimate an AWS Lambda invocation's cost without running a wrapped handler, call lambda_cost, which is a pure function:

use dexcost::adapters::lambda::{lambda_cost, supported_regions};

let cost = lambda_cost(1_500, 512, "us-east-1")?;  // duration_ms, memory_mb, region
println!("{}", cost.cost_usd);            // Decimal — total invocation cost
// cost.gb_seconds, cost.duration_cost_usd, cost.request_cost_usd, cost.rate_per_gb_second

let regions: Vec<String> = supported_regions();

lambda_cost returns Err(LambdaCostError) for zero memory or an unknown region. It uses its own bundled Lambda rate table and is independent of the wrap_lambda_handler capture path.
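The arithmetic behind such an estimate can be sketched as follows. This is not dexcost's implementation: the two rates are assumed from AWS's published x86 on-demand pricing for us-east-1 and may differ from the bundled table, f64 stands in for Decimal, and Estimate and estimate are hypothetical names whose fields merely echo the ones listed above.

```rust
/// Assumed public x86 on-demand rates for us-east-1; dexcost's bundled table may differ.
const RATE_PER_GB_SECOND: f64 = 0.0000166667;
const RATE_PER_REQUEST: f64 = 0.20 / 1_000_000.0;

/// Hypothetical result type; f64 is used only to avoid a Decimal dependency.
#[derive(Debug)]
struct Estimate {
    gb_seconds: f64,
    duration_cost_usd: f64,
    request_cost_usd: f64,
    cost_usd: f64,
}

/// Duration charge (GB-seconds times rate) plus the flat per-request charge.
fn estimate(duration_ms: u64, memory_mb: u64) -> Result<Estimate, String> {
    if memory_mb == 0 {
        return Err("memory_mb must be greater than zero".to_string());
    }
    let gb_seconds = (duration_ms as f64 / 1000.0) * (memory_mb as f64 / 1024.0);
    let duration_cost_usd = gb_seconds * RATE_PER_GB_SECOND;
    Ok(Estimate {
        gb_seconds,
        duration_cost_usd,
        request_cost_usd: RATE_PER_REQUEST,
        cost_usd: duration_cost_usd + RATE_PER_REQUEST,
    })
}

fn main() {
    // 1.5 s at 512 MB is 1.5 * 0.5 = 0.75 GB-seconds.
    match estimate(1_500, 512) {
        Ok(e) => println!(
            "gb_seconds={} duration={} request={} total={}",
            e.gb_seconds, e.duration_cost_usd, e.request_cost_usd, e.cost_usd
        ),
        Err(msg) => eprintln!("error: {msg}"),
    }
}
```

Under these assumed rates, the 1,500 ms, 512 MB example costs roughly 0.0000127 USD, and almost all of it is the duration charge.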

Configuration & limitations

Opt-in and task-scoped. Nothing here runs unless you call a wrapper with a &mut TrackedTask.
Config::compute_billing_overrides is a HashMap<String, String>; the pricing engine recognizes {"cloud_run": "instance"} to switch Cloud Run to instance-based pricing.
Config::k8s_node_aware is reserved for Kubernetes node-aware accounting and currently has no effect.
No browser adapter exists in the Rust SDK (unlike Python/TypeScript).
Always-on runtime auto-capture (EC2, Fargate, GCE, Kubernetes pods) is not wired; the Lambda and GPU wraps are the capture paths available today.
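
As a small illustration of the compute_billing_overrides entry above, the map itself can be built with the standard library. How the map is attached to Config (struct literal, builder, or otherwise) is not shown on this page, so that step is deliberately left out, and the helper name is hypothetical.

```rust
use std::collections::HashMap;

/// Builds the override map that switches Cloud Run to instance-based pricing.
/// Hypothetical helper; only the key and value come from the documentation.
fn cloud_run_instance_overrides() -> HashMap<String, String> {
    let mut overrides = HashMap::new();
    overrides.insert("cloud_run".to_string(), "instance".to_string());
    overrides
}

fn main() {
    let overrides = cloud_run_instance_overrides();
    println!("{overrides:?}");
}
```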

Next steps

Configuration — Config fields including compute_billing_overrides and k8s_node_aware.
API Reference — event types and the full public API.

Compute & GPU