How does Lava handle multiple agent or LLM calls?

Metering Parallel and Multi-Step Agent Calls

Modern AI applications often make multiple LLM calls for a single user action — running parallel agents, doing A/B model comparisons, or orchestrating multi-step workflows. Here's how Lava handles this.

Multiple Meters for Different Call Types

In Lava, you create meters for each type of usage you want to track. If your application makes multiple types of calls, you can:

  • Meter all calls — Create a meter for each call type, and all usage is tracked and billed
  • Meter only billable calls — Create a meter only for the calls you want to charge customers for, and let non-billable calls run without a meter
  • Meter all, bill selectively — Create meters for everything (for visibility) but only attach revenue-generating meters to the customer's plan
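The three strategies above can be sketched as a simple data model. This is an illustrative sketch, not Lava's actual API — the meter names and the `billable` flag are assumptions standing in for "attached to the customer's plan":

```python
from dataclasses import dataclass

@dataclass
class Meter:
    name: str       # the usage category this meter tracks
    billable: bool  # whether it is attached to the customer's plan

# "Meter all, bill selectively": every call type gets a meter for
# visibility, but only revenue-generating meters are attached to the plan.
meters = [
    Meter("chat_completion", billable=True),     # served to the customer
    Meter("shadow_comparison", billable=False),  # A/B comparison call
    Meter("embedding_lookup", billable=False),   # internal retrieval step
]

billable = [m.name for m in meters if m.billable]
print(billable)  # only these call types appear on the customer's invoice
```

The non-billable meters still exist, so every call type shows up in your usage analytics even though only one drives a charge.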

Example: A/B Model Testing

Say you run two LLM calls in parallel to compare results — one to GPT-4o and one to Claude. You only serve one result to the customer.

  • Both calls go through the gateway, so Lava tracks usage on both
  • You only attach the "primary" meter to the customer's plan, so they're only billed for the call you serve
  • The second call still shows up in your analytics, so you have full visibility into your actual costs
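The A/B pattern above might look like this in application code. The model calls here are local stubs — in a real app each would be a request routed through the gateway, where usage is recorded for both models no matter which result you serve:

```python
import asyncio

# Stub standing in for a gateway-routed model call; the gateway would
# record token usage for every call made through it.
async def call_model(model: str, prompt: str) -> str:
    await asyncio.sleep(0)  # pretend network latency
    return f"{model} answer to: {prompt}"

async def ab_compare(prompt: str) -> str:
    primary, shadow = await asyncio.gather(
        call_model("gpt-4o", prompt),  # metered on the billable meter
        call_model("claude", prompt),  # tracked for analytics only
    )
    # Only the primary response is served; the shadow result is kept
    # for offline quality and cost comparison.
    _log_for_comparison = shadow
    return primary

result = asyncio.run(ab_compare("summarize this ticket"))
print(result)
```

Because billing is decided by which meter is attached to the plan, not by which calls you make, the comparison call costs you analytics visibility only — never a customer charge.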

Example: Multi-Step Agent Workflows

An AI agent might make 50 LLM calls to complete a single investigation or task. You have two options:

  • Bill per underlying call — Each LLM call is metered. The customer pays based on actual token consumption. Good for transparency, but charges can feel unpredictable to the customer.
  • Bill per action — Create a single meter for "investigation completed" with a fixed price. The underlying calls are still tracked for your cost analysis, but the customer sees one clean charge per action.
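Here is a minimal cost roll-up for the "bill per action" option. The price and helper are hypothetical — the point is that the underlying calls stay visible to you while the customer sees one line item:

```python
# Illustrative fixed price per completed investigation, in dollars.
FIXED_PRICE_PER_INVESTIGATION = 2.00

def summarize_investigation(call_costs: list[float]) -> dict:
    """Roll 50-odd underlying LLM calls into one customer-facing charge."""
    return {
        "underlying_calls": len(call_costs),           # for your analytics
        "your_cost": round(sum(call_costs), 4),        # internal visibility
        "customer_charge": FIXED_PRICE_PER_INVESTIGATION,  # one clean charge
    }

# e.g. an agent run of 50 calls averaging about $0.01 each
summary = summarize_investigation([0.01] * 50)
print(summary)
```

Your margin on each action is simply `customer_charge - your_cost`, and because the underlying calls are still metered, you can watch that margin per model in the dashboard.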

Real-Time Balance Checking

Importantly, Lava checks the wallet balance before each call through the gateway. If a multi-step agent workflow is running and the customer's credits run out mid-workflow, Lava will block the next call. This prevents you from accumulating costs for a customer who can't pay.

You can handle this in your application logic — check the response, and if credits are insufficient, prompt the customer to top up or gracefully wind down the workflow.
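A graceful wind-down might look like the sketch below. The gateway interaction is simulated with a local balance check so the example runs offline, and the error type is an assumption — your application would key off whatever rejection the gateway actually returns:

```python
class InsufficientCreditsError(Exception):
    """Raised when the gateway blocks a call because credits ran out."""

def run_step(step: int, balance: dict) -> str:
    # Stand-in for a gateway call: the real balance check happens
    # server-side before each call is forwarded to the model.
    if balance["credits"] <= 0:
        raise InsufficientCreditsError(f"blocked at step {step}")
    balance["credits"] -= 1
    return f"step {step} ok"

def run_workflow(steps: int, balance: dict) -> list[str]:
    completed = []
    for step in range(1, steps + 1):
        try:
            completed.append(run_step(step, balance))
        except InsufficientCreditsError:
            # Wind down gracefully: keep partial results and surface a
            # top-up prompt instead of crashing mid-investigation.
            completed.append("workflow paused: please top up credits")
            break
    return completed

results = run_workflow(5, {"credits": 3})
print(results)
```

The key design choice is catching the blocked call at the workflow level: partial results survive, the customer gets a clear top-up prompt, and you never run steps you can't bill for.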

Cost Tracking Across Models

The Lava dashboard breaks down usage by model, so you can see exactly which models are driving costs across all your customers. This is especially valuable when your agents use multiple models — you get visibility into the real cost structure of your product.
