import traces

traces May 11, 2026 2 min read

to train from your agent’s behavior, you first pull its traces out of your observability platform. an adapter connects to your provider, fetches the traces, and normalizes them into a consistent format, so the rest of the pipeline works the same regardless of where your traces came from.

quickstart

connect to your provider and fetch a project’s traces:

from benchmax.traces.braintrust.adapter import BraintrustTraceAdapter

adapter = BraintrustTraceAdapter(api_key="bt_...")
adapter.connect()  # validates credentials

projects = adapter.list_projects()
traces, cursor = adapter.fetch_traces(project_id=projects[0].id, limit=500)

braintrust is the only built-in provider today. to connect another source, implement the TraceAdapter protocol and register it in benchmax.traces.registry.

fetching all traces

fetch_traces is paginated. loop with the returned cursor to pull an entire project:

all_traces = []
cursor = None
while True:
    batch, cursor = adapter.fetch_traces(project_id=projects[0].id, limit=100, cursor=cursor)
    all_traces.extend(batch)
    if cursor is None:
        break

supported message formats

the adapter normalizes each message in a trace into the standard form below, auto-detecting the common provider shapes:

  • openai: tool calls in the standard nested form (tool_calls[].function.{name, arguments}), the flat form (tool_calls[].{name, arguments}), and the legacy function call
  • anthropic / openclaw: structured content blocks (text and toolCall)
  • role and field aliases: toolResulttool, toolCallIdtool_call_id, toolNamename

NormalizedTrace format

all adapters return NormalizedTrace objects with a consistent structure:

fieldtypedescription
idstrunique trace identifier
messageslist[TraceMessage]the conversation (system, user, assistant, tool messages)
scoresdict[str, float]provider-reported scores (e.g. task success, accuracy)
metadatadictprovider-specific metadata (task ID, model, timestamp)
errorslist[str]any extraction errors encountered

each TraceMessage has role, content, and optionally tool_calls (list of ToolCall with name, arguments, id).

next steps

once you have traces, process them into training examples.