OpenInference: Standard LLM Span Kinds & Attributes
Defines 10 span kinds (LLM, AGENT, TOOL, etc.) and 60+ reserved attributes for inputs, outputs, tokens, and costs, standardizing OpenTelemetry tracing of LLM apps, chains, retrievers, and agents.
Classify Operations with 10 Required Span Kinds
Every OpenInference span requires openinference.span.kind to categorize the operation, helping backends assemble traces correctly. Use these exact values:
- LLM: Traces calls to models like OpenAI chat completions or Llama text generation.
- EMBEDDING: Captures embedding generation, e.g., OpenAI ada for retrieval.
- CHAIN: Marks entry points or links between steps, like passing retriever context to LLM.
- RETRIEVER: Logs data fetches from vector stores or databases.
- RERANKER: Tracks reranking of retrieved documents by relevance score, returning the top K, e.g., via cross-encoders.
- TOOL: Records external calls like calculators or weather APIs invoked by LLMs.
- AGENT: Wraps LLM-guided tool reasoning blocks.
- GUARDRAIL: Monitors jailbreak protection, modifying/rejecting unsafe LLM outputs.
- EVALUATOR: Measures LLM output quality like relevance or correctness.
- PROMPT: Tracks prompt template rendering with variables.
Set this attribute on all spans to enable visualization of execution graphs via graph.node.id, graph.node.name, and graph.node.parent_id.
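As a minimal, dependency-free sketch of this requirement, the helper below attaches the required span-kind attribute to a span's attribute dict and rejects values outside the 10 kinds (the helper name is illustrative, not part of the spec):

```python
# The 10 OpenInference span kinds; the required attribute key is
# "openinference.span.kind".
SPAN_KINDS = frozenset({
    "LLM", "EMBEDDING", "CHAIN", "RETRIEVER", "RERANKER",
    "TOOL", "AGENT", "GUARDRAIL", "EVALUATOR", "PROMPT",
})

def classify_span(attributes: dict, kind: str) -> dict:
    """Set openinference.span.kind, rejecting unknown values."""
    if kind not in SPAN_KINDS:
        raise ValueError(f"unknown span kind: {kind!r}")
    attributes["openinference.span.kind"] = kind
    return attributes

# Graph attributes (graph.node.id, etc.) can be set alongside the kind.
attrs = classify_span({"graph.node.id": "retriever-1"}, "RETRIEVER")
print(attrs["openinference.span.kind"])  # RETRIEVER
```

In a real instrumentation these key/value pairs would be passed to your tracing SDK's set-attribute call rather than collected in a plain dict.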
Track Inputs, Outputs, and Documents Uniformly
Populate spans with these reserved attributes for consistency across SDKs:
- Documents: document.content (string), document.id (string/int), document.metadata (JSON), document.score (float, e.g., 0.98).
- Embeddings: embedding.model_name (e.g., "BERT-base"), embedding.text (the input), embedding.vector (float list), embedding.embeddings (list of objects), embedding.invocation_parameters (JSON, excluding the input).
- Inputs/Outputs: input.value (string), input.mime_type (e.g., "text/plain"), output.value, output.mime_type.
- Messages: llm.input_messages and llm.output_messages as flattened lists, e.g., llm.input_messages.0.message.role="user", llm.input_messages.0.message.content="hello". Multimodal content is supported via message.contents.0.message_content.type="image" and message_content.image.url.
- Prompts/Completions (legacy): llm.prompts.0.prompt.text, llm.choices.0.completion.text.
- Retriever/Reranker: retrieval.documents (list), reranker.query, reranker.top_k (int, e.g., 3), reranker.input_documents / reranker.output_documents.
- Tools: llm.tools.0.tool.json_schema (full schema), tool.name, tool.description. For calls: message.tool_calls.0.tool_call.id="call_62136355", tool_call.function.name="get_weather", tool_call.function.arguments (JSON).
- Sessions/Users: session.id, user.id (UUIDs).
Use metadata (JSON) for extra context and tag.tags (a string list) for categorization.
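The document conventions above can be sketched with a small helper that flattens retrieved documents into the reserved retrieval.documents.* keys (the helper name is an assumption for illustration):

```python
def document_attributes(docs: list[dict]) -> dict:
    """Flatten retrieved documents into retrieval.documents.{i}.document.* keys."""
    attrs = {}
    for i, doc in enumerate(docs):
        base = f"retrieval.documents.{i}.document"
        attrs[f"{base}.id"] = doc["id"]
        attrs[f"{base}.content"] = doc["content"]
        attrs[f"{base}.score"] = doc["score"]
    return attrs

attrs = document_attributes([
    {"id": "doc-1", "content": "Paris is the capital of France.", "score": 0.98},
])
print(attrs["retrieval.documents.0.document.score"])  # 0.98
```

The same base-plus-index pattern applies to reranker.input_documents and reranker.output_documents.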
Monitor Tokens, Costs, and Model Details
Capture usage precisely for optimization:
- Tokens: llm.token_count.prompt (int, e.g., 10), llm.token_count.completion (15), llm.token_count.total (20). Granular details: prompt_details.cache_read (OpenAI cached_tokens, e.g., 5), prompt_details.cache_write (Anthropic cache misses), prompt_details.audio, completion_details.audio, completion_details.reasoning.
- Costs (USD floats): llm.cost.prompt (0.0021), llm.cost.completion (0.0045), llm.cost.total (0.0066). Granular details: prompt_details.cache_read (0.0003), completion_details.reasoning (0.0024).
Identify models: llm.model_name (e.g., "gpt-3.5-turbo"), llm.system (well-known: "openai", "anthropic", "cohere", etc.), llm.provider (e.g., "azure", "groq"). For embeddings, use embedding.model_name only—no llm.system/provider.
Exceptions: exception.type, exception.message, exception.stacktrace, exception.escaped (bool).
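The token and cost attributes above can be assembled together; the sketch below derives USD costs from per-token rates (the helper name and the rate values are made up for illustration, not spec-defined):

```python
def usage_attributes(prompt_tokens: int, completion_tokens: int,
                     prompt_rate: float, completion_rate: float) -> dict:
    """Build llm.token_count.* and llm.cost.* attributes from raw usage."""
    prompt_cost = prompt_tokens * prompt_rate
    completion_cost = completion_tokens * completion_rate
    return {
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
        "llm.cost.prompt": prompt_cost,
        "llm.cost.completion": completion_cost,
        "llm.cost.total": prompt_cost + completion_cost,
    }

# Hypothetical rates chosen so the figures match the examples above.
attrs = usage_attributes(10, 15, 0.00021, 0.0003)
print(attrs["llm.token_count.total"])  # 25
```

Provider responses usually report the token counts directly (e.g., an OpenAI usage object), so in practice only the cost derivation is computed client-side.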
Flatten Nested Lists for OpenTelemetry
Convert lists/objects to flat keys with zero-based indexing: {base}.{index}.{nested.path}.
Examples:
- Messages: llm.input_messages.0.message.role="user".
- Multimodal: llm.input_messages.0.message.contents.1.message_content.image.url.
- Tools: llm.tools.0.tool.json_schema="{...}", llm.output_messages.0.message.tool_calls.0.tool_call.function.arguments="{...}".
Code snippets (assuming each message is a dict of simple fields like role and content):
Python:
    for i, obj in enumerate(messages):
        for key, value in obj.items():
            span.set_attribute(f"llm.input_messages.{i}.message.{key}", value)
JS/TS:
    const messages = [...];
    for (const [i, obj] of messages.entries()) {
      for (const [key, value] of Object.entries(obj)) {
        span.setAttribute(`llm.input_messages.${i}.message.${key}`, value);
      }
    }
Flatten until simple types (str, int, float, bool, lists thereof).
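For arbitrarily nested payloads, the one-level loops above generalize to a recursive flattener that stops at simple types, as the rule requires (a dependency-free sketch; the function name is illustrative):

```python
def flatten(value, prefix: str = "") -> dict:
    """Expand nested dicts/lists into flat {base}.{index}.{nested.path} keys,
    stopping at simple types (str, int, float, bool, or lists thereof)."""
    attrs = {}
    if isinstance(value, dict):
        for key, v in value.items():
            attrs.update(flatten(v, f"{prefix}.{key}" if prefix else key))
    elif isinstance(value, list) and not all(
        isinstance(v, (str, int, float, bool)) for v in value
    ):
        # Lists of objects get zero-based indices; lists of simple
        # types are kept as-is (handled by the else branch).
        for i, v in enumerate(value):
            attrs.update(flatten(v, f"{prefix}.{i}"))
    else:
        attrs[prefix] = value
    return attrs

attrs = flatten({
    "llm": {"input_messages": [{"message": {"role": "user", "content": "hello"}}]},
})
print(attrs["llm.input_messages.0.message.role"])  # user
```

The resulting flat dict can be passed key-by-key to the span's set-attribute call, matching the examples above.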