BaseChatClient interface. You pick a client, pass it into ReActAgent, and switch providers later by changing the import and the model name; the agent code does not move.
Supported providers
| Class | Provider | API key env var |
|---|---|---|
OpenAIChatClient | OpenAI, and any OpenAI-compatible server (Ollama, vLLM, …) | OPENAI_API_KEY |
AnthropicChatClient | Anthropic | ANTHROPIC_API_KEY |
GeminiChatClient | Google (Gemini Developer API or Vertex AI) | GEMINI_API_KEY |
OpenRouterChatClient | OpenRouter (multi-provider routing) | OPENROUTER_API_KEY |
api_key. They all also accept arbitrary **kwargs that are forwarded to the underlying provider SDK (timeout, max_retries, default_headers, and so on).
Creating a client
- OpenAI
- Anthropic
- Gemini
- OpenRouter
Local models
OpenAIChatClient works with any OpenAI-compatible server. Point base_url at your local service:
Prompt caching
AnthropicChatClient supports Anthropic’s prompt caching. Set cache_policy on the agent; see Prompt caching on the Agents page for the full table of options and TTLs. On providers that do not implement prompt caching (OpenAI, Gemini, OpenRouter), cache_policy is a no-op.
Reasoning
Models with extended thinking (Opus 4.6, Sonnet 4.6, and others) are controlled by thereasoning parameter on the agent. See Reasoning on the Agents page for ReasoningConfig.auto(), effort=, budget_tokens=, and ReasoningConfig.disabled().
Message and completion types
The two types every client reads and writes.ReActAgent handles them for you, so most of the time you only need to construct them when you write a custom agent or call a client by hand.
ChatMessage
The unified message format that every client reads and writes. Use the factory methods for each role:
user_message and assistant_message accept an optional base64_image for vision inputs.
ChatCompletion
The return value of client.create() and client.parse(). The fields a caller usually reads:
| Field | Type | What it is |
|---|---|---|
content | str | None | Text response |
tool_calls | list[ToolCall] | None | Tool calls the model requested |
reasoning | str | None | Readable chain of thought (when the model emits one) |
reasoning_details | list[dict] | None | Provider-specific reasoning blocks, passed back on follow-up calls so the model can continue its thinking |
finish_reason | str | "stop", "tool_calls", or "length" |
usage | dict | Token counts |
parsed | Any | None | Parsed Pydantic object (populated by parse()) |
id / model | str / str | Response ID and model identifier |
completion.to_message() to turn a completion into a ChatMessage you can append to conversation history.
Calling a client directly
Every client implements two async methods.ReActAgent calls these for you; you only reach for them when building a custom agent or running a one-off completion.
| Method | What it does |
|---|---|
create(model, messages, tools=None, reasoning=..., **kwargs) | Standard chat completion. Returns ChatCompletion. |
parse(model, messages, response_format, tools=None, reasoning=..., **kwargs) | Structured output. The completion’s parsed field holds an instance of response_format. |

