How does Lava handle multiple agent or LLM calls?
Metering Parallel and Multi-Step Agent Calls
Modern AI applications often make multiple LLM calls for a single user action — running parallel agents, doing A/B model comparisons, or orchestrating multi-step workflows. Here's how Lava handles this.
Multiple Meters for Different Call Types
In Lava, you create meters for each type of usage you want to track. If your application makes multiple types of calls, you can:
- Meter all calls — Create a meter for each call type, and all usage is tracked and billed
- Meter only billable calls — Create a meter only for the calls you want to charge customers for, and let non-billable calls run without a meter
- Meter all, bill selectively — Create meters for everything (for visibility) but only attach revenue-generating meters to the customer's plan
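The three options above can be sketched with a toy in-memory model. The `Meter` and `Plan` objects here are hypothetical stand-ins to show the "meter all, bill selectively" pattern, not Lava's actual SDK:

```python
# Toy sketch of "meter all, bill selectively" -- Meter and Plan are
# hypothetical stand-ins, not Lava's SDK objects.
from dataclasses import dataclass, field

@dataclass
class Meter:
    name: str
    events: int = 0

    def record(self) -> None:
        self.events += 1

@dataclass
class Plan:
    # Only meters attached here generate charges for the customer.
    billable_meters: list = field(default_factory=list)

primary = Meter("primary-llm-call")
shadow = Meter("shadow-llm-call")       # tracked for visibility only
plan = Plan(billable_meters=[primary])  # customer billed on `primary` only

primary.record()
shadow.record()

billable = sum(m.events for m in plan.billable_meters)
tracked = primary.events + shadow.events
print(billable, tracked)  # 1 billable event, 2 tracked events
```

The key design point: tracking and billing are decoupled, so every call can be recorded without every call generating a charge.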
Example: A/B Model Testing
Say you run two LLM calls in parallel to compare results — one to GPT-4o and one to Claude. You only serve one result to the customer.
- Both calls go through the gateway, so Lava tracks usage on both
- You only attach the "primary" meter to the customer's plan, so they're only billed for the call you serve
- The second call still shows up in your analytics, so you have full visibility into your actual costs
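A rough sketch of this pattern, with a hypothetical `call_via_gateway` wrapper standing in for calls routed through the gateway (the function name and logging are illustrative assumptions):

```python
# Hypothetical sketch of A/B testing: both parallel calls flow through a
# gateway wrapper that records usage; only the served model is billed.
# `call_via_gateway` and the meter setup are illustrative, not Lava's API.
from concurrent.futures import ThreadPoolExecutor

usage_log = []  # what the gateway tracks (every call, billable or not)

def call_via_gateway(model: str, prompt: str) -> str:
    usage_log.append(model)          # gateway meters every call
    return f"{model} answer to: {prompt}"

with ThreadPoolExecutor() as pool:
    a = pool.submit(call_via_gateway, "gpt-4o", "summarize this doc")
    b = pool.submit(call_via_gateway, "claude", "summarize this doc")
    served = a.result()              # serve only the primary result

billed_models = {"gpt-4o"}           # only the primary meter is on the plan
billed = [m for m in usage_log if m in billed_models]
print(len(usage_log), len(billed))   # 2 tracked, 1 billed
```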
Example: Multi-Step Agent Workflows
An AI agent might make 50 LLM calls to complete a single investigation or task. You have two options:
- Bill per underlying call — Each LLM call is metered, and the customer pays based on actual token consumption. This is transparent but can feel unpredictable to the customer.
- Bill per action — Create a single meter for "investigation completed" with a fixed price. The underlying calls are still tracked for your cost analysis, but the customer sees one clean charge per action.
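The "bill per action" option can be sketched as follows. The prices, call counts, and function name are all illustrative assumptions; the point is that internal cost tracking and the customer-facing charge are independent:

```python
# Sketch of "bill per action": underlying LLM calls are still tracked for
# internal cost analysis, while the customer sees one fixed charge per
# completed investigation. All prices and counts are made up (in cents).
ACTION_PRICE_CENTS = 500           # fixed price: $5.00 per investigation

def run_investigation(n_calls: int, cost_per_call_cents: int):
    """Simulate an agent workflow of n_calls underlying LLM calls."""
    internal_cost = n_calls * cost_per_call_cents  # tracked per call
    customer_charge = ACTION_PRICE_CENTS           # one clean charge
    return internal_cost, customer_charge

cost, charge = run_investigation(n_calls=50, cost_per_call_cents=4)
print(cost, charge)  # 200 cents internal cost, 500 cents charged
```

With this model, your margin per action is simply the fixed price minus the tracked cost of the underlying calls, which you can monitor as agent behavior changes.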
Real-Time Balance Checking
Importantly, Lava checks the wallet balance before each call through the gateway. If a multi-step agent workflow is running and the customer's credits run out mid-workflow, Lava will block the next call. This prevents you from accumulating costs for a customer who can't pay.
You can handle this in your application logic: check the gateway's response, and if it indicates insufficient credits, prompt the customer to top up or gracefully wind down the workflow.
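A minimal sketch of that handling, assuming a hypothetical gateway wrapper that raises when the wallet is empty (the error shape and wallet model are assumptions, not Lava's actual response format):

```python
# Sketch of handling mid-workflow credit exhaustion. The gateway check and
# InsufficientCredits error are illustrative, not Lava's actual API.
class InsufficientCredits(Exception):
    pass

wallet = {"credits": 3}

def call_via_gateway(step: str) -> str:
    if wallet["credits"] <= 0:       # balance checked before each call
        raise InsufficientCredits(step)
    wallet["credits"] -= 1
    return f"completed {step}"

completed = []
try:
    for step in ["plan", "search", "summarize", "report", "notify"]:
        completed.append(call_via_gateway(step))
except InsufficientCredits:
    # Gracefully wind down: keep partial results, prompt a top-up.
    pass

print(len(completed))  # 3 steps completed before credits ran out
```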
Cost Tracking Across Models
The Lava dashboard breaks down usage by model, so you can see exactly which models are driving costs across all your customers. This is especially valuable when your agents use multiple models — you get visibility into the real cost structure of your product.
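The rollup the dashboard provides amounts to grouping usage records by model. A small sketch with made-up illustrative data (costs in cents):

```python
# Per-model cost breakdown, as the dashboard surfaces it; the usage
# records below are made-up illustrative data (costs in cents).
from collections import defaultdict

usage = [
    {"model": "gpt-4o", "cost_cents": 12},
    {"model": "claude", "cost_cents": 9},
    {"model": "gpt-4o", "cost_cents": 5},
]

by_model = defaultdict(int)
for record in usage:
    by_model[record["model"]] += record["cost_cents"]

print(dict(by_model))  # {'gpt-4o': 17, 'claude': 9}
```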