# Get started with Vantra
Add full observability to any Python AI agent in under 5 minutes. No changes to your agent logic required.
## Quickstart
### 1. Install the SDK

```bash
pip install vantra
```

### 2. Get your API key

Go to Settings → API Keys and create a key.
### 3. Add 3 lines to your agent
```python
import vantra

vantra.init(
    api_key="van_live_...",
    project="my-agent",
)

@vantra.trace
def run_agent(message: str):
    # your existing code, completely untouched
    return agent.run(message)
```

### 4. Run your agent and open the dashboard

Every trace appears in your Traces dashboard within seconds.
## Installation

Requires Python 3.8+. Install via pip:

```bash
pip install vantra
```

Or with a specific version:

```bash
pip install vantra==0.1.0
```

## `vantra.init()`
Call once at the start of your application, before any traced functions run.
```python
vantra.init(
    api_key="van_live_...",  # required
    project="my-agent",      # optional: groups traces in the dashboard
)
```

| Parameter | Type | Required | Description |
|---|---|---|---|
| `api_key` | `str` | Yes | Your Vantra API key from Settings |
| `project` | `str` | No | Project name shown in the dashboard |
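Since auto-patching takes effect when `vantra.init()` runs, initialize before you construct LLM clients or call traced functions. A minimal entrypoint sketch (the `my_agent` module name is hypothetical):

```python
# main.py: initialize Vantra first, then load the rest of the app
import vantra

vantra.init(api_key="van_live_...", project="my-agent")

from my_agent import run_agent  # hypothetical module containing @vantra.trace functions

if __name__ == "__main__":
    print(run_agent("Hello!"))
```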
## `@vantra.trace`
Decorator that wraps a function as a root trace. Every call creates a new trace in the dashboard with timing, status, and any nested spans.
```python
@vantra.trace
def run_agent(message: str) -> str:
    result = call_llm(message)
    return result

# Also works on async functions
@vantra.trace
async def run_agent_async(message: str) -> str:
    result = await call_llm_async(message)
    return result
```

## `vantra.span()`
Context manager for creating child spans inside a trace. Use this to instrument individual steps — tool calls, retrievals, chains.
```python
@vantra.trace
def run_agent(message: str) -> str:
    with vantra.span("search_knowledge", kind="tool") as span:
        results = search(message)
        span.set_output({"results": results})

    with vantra.span("generate_response", kind="llm") as span:
        response = llm.chat(message, context=results)
        span.set_output({"response": response})

    return response
```

| Parameter | Type | Description |
|---|---|---|
| `name` | `str` | Span name shown in the waterfall |
| `kind` | `str` | One of `"llm"`, `"tool"`, `"retrieval"`, `"chain"`, `"agent"` |
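The `kind` value controls how the span is labeled in the waterfall. For instance, a retrieval step and a chain step can be tagged with the matching kinds. A minimal sketch, where `vector_store.search` and `summarize_chain` are placeholders for your own logic:

```python
@vantra.trace
def answer(question: str) -> str:
    with vantra.span("fetch_documents", kind="retrieval") as span:
        docs = vector_store.search(question)       # placeholder retrieval call
        span.set_output({"count": len(docs)})

    with vantra.span("summarize", kind="chain") as span:
        summary = summarize_chain(question, docs)  # placeholder chain step
        span.set_output({"summary": summary})

    return summary
```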
## OpenAI auto-patch
Vantra automatically patches the OpenAI client after vantra.init() is called. Every chat.completions.create call is captured as an LLM span — including tokens, cost, model, and latency.
```python
import vantra
import openai

vantra.init(api_key="van_live_...", project="my-agent")

client = openai.OpenAI()

@vantra.trace
def ask(question: str) -> str:
    # This call is automatically captured; no extra code needed
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": question}]
    )
    return response.choices[0].message.content
```

## Anthropic auto-patch
Same as OpenAI — Anthropic's messages.create is automatically captured.
```python
import vantra
import anthropic

vantra.init(api_key="van_live_...", project="my-agent")

client = anthropic.Anthropic()

@vantra.trace
def ask(question: str) -> str:
    response = client.messages.create(
        model="claude-opus-4-5",
        max_tokens=1024,
        messages=[{"role": "user", "content": question}]
    )
    return response.content[0].text
```

## Full example
A complete support agent with nested spans, tool calls, and automatic LLM capture:
```python
from __future__ import annotations  # allows list[str] annotations on Python 3.8

import vantra
import openai

vantra.init(api_key="van_live_...", project="support-agent")

client = openai.OpenAI()

def search_knowledge_base(query: str) -> list[str]:
    # your retrieval logic
    return ["relevant doc 1", "relevant doc 2"]

def send_email(to: str, body: str) -> bool:
    # your email logic
    return True

@vantra.trace
def handle_support_ticket(ticket: dict) -> str:
    # Step 1: classify
    with vantra.span("classify", kind="llm"):
        classification = client.chat.completions.create(
            model="gpt-4o-mini",
            messages=[
                {"role": "system", "content": "Classify this support ticket."},
                {"role": "user", "content": ticket["message"]},
            ]
        )
        category = classification.choices[0].message.content

    # Step 2: search
    with vantra.span("search_kb", kind="tool") as span:
        docs = search_knowledge_base(ticket["message"])
        span.set_output({"docs": docs, "count": len(docs)})

    # Step 3: respond
    with vantra.span("generate_response", kind="llm"):
        response = client.chat.completions.create(
            model="gpt-4o",
            messages=[
                {"role": "system", "content": f"You are a support agent. Context: {docs}"},
                {"role": "user", "content": ticket["message"]},
            ]
        )
        reply = response.choices[0].message.content

    # Step 4: send
    with vantra.span("send_email", kind="tool"):
        send_email(ticket["email"], reply)

    return reply

if __name__ == "__main__":
    handle_support_ticket({
        "message": "My payment isn't going through",
        "email": "user@example.com",
    })
```

## FAQ
### Does Vantra add latency to my agent?

No. All trace data is sent from a background thread through an in-memory queue. Your agent function returns immediately; Vantra never blocks the main thread.
### What happens if the Vantra API is down?

Spans are queued in memory and retried. If they still fail after retries, they are silently dropped; your agent keeps running either way.
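The two answers above describe a standard non-blocking exporter pattern: the traced function only enqueues, and a daemon thread handles delivery, retries, and the final drop. A rough stdlib sketch of that pattern, not Vantra's actual internals (`send_to_api` is a hypothetical stub):

```python
import queue
import threading
import time

span_queue: queue.Queue = queue.Queue()

def record_span(span: dict) -> None:
    # Hot path: enqueue and return immediately; the caller is never blocked.
    span_queue.put(span)

def send_to_api(span: dict) -> None:
    # Hypothetical transport; a real exporter would POST the span to the backend.
    ...

def _export_worker() -> None:
    while True:
        span = span_queue.get()
        for attempt in range(3):           # bounded retries
            try:
                send_to_api(span)
                break
            except ConnectionError:
                time.sleep(2 ** attempt)   # exponential backoff between attempts
        # if every attempt failed, the span is silently dropped here

threading.Thread(target=_export_worker, daemon=True).start()
```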
### Does it work with LangChain or LlamaIndex?

Yes. Use `@vantra.trace` on your top-level chain or agent function, and `vantra.span()` around individual steps. The LLM calls inside are auto-patched.
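For example, with a typical LangChain (LCEL) pipeline it might look like the sketch below; the prompt and chain are illustrative, and the underlying OpenAI calls are captured by the auto-patch:

```python
import vantra
from langchain_core.prompts import ChatPromptTemplate
from langchain_openai import ChatOpenAI

vantra.init(api_key="van_live_...", project="my-agent")

prompt = ChatPromptTemplate.from_template("Answer concisely: {question}")
chain = prompt | ChatOpenAI(model="gpt-4o")  # LLM calls inside are auto-patched

@vantra.trace
def run_chain(question: str) -> str:
    with vantra.span("answer_chain", kind="chain") as span:
        result = chain.invoke({"question": question})
        span.set_output({"answer": result.content})
    return result.content
```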
### Is my prompt data sent to Vantra?

Yes. Inputs and outputs are captured so you can inspect them in the trace waterfall. Payloads over 2 KB are truncated. Contact us if you need input/output capture disabled.
### How do I get my API key?

Go to Settings → API Keys → Create key. Keys start with `van_live_`.