# Tools & rich messages
Tools are server-side helpers your agent can invoke mid-turn. The visitor types something, the LLM decides "I need to call a tool for this", the tool runs on the server, and its result either feeds back into the LLM's final answer or surfaces directly in the widget as a structured block (e.g. a "Connect me with a human" button).
Tools are gated by the agent's vertical capabilities. Each tool declares which capability it requires; the registry only exposes a tool to agents whose capability list contains that slug. Admins can further narrow the list with `vertical_overrides.enabled_tools`.
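The gating rule is a simple two-step filter. Below is an illustrative TypeScript model of it, not the server's PHP registry; the `ToolDef` and `AgentConfig` shapes and the `forAgent` signature are assumptions for the sketch:

```typescript
// Illustrative model of the capability gate. The real registry is a PHP
// service; the names and shapes here are assumptions, not the actual code.
interface ToolDef {
  name: string;
  capability: string; // slug this tool requires
}

interface AgentConfig {
  capabilities: string[];  // the agent's vertical capability slugs
  enabledTools?: string[]; // vertical_overrides.enabled_tools (optional allow-list)
}

function forAgent(allTools: ToolDef[], agent: AgentConfig): ToolDef[] {
  return allTools.filter((tool) => {
    // Gate 1: the agent's capability list must contain the tool's slug.
    if (!agent.capabilities.includes(tool.capability)) return false;
    // Gate 2: an admin allow-list, when present, narrows the set further.
    return agent.enabledTools === undefined || agent.enabledTools.includes(tool.name);
  });
}
```

So a `help_center` agent with `ticket_escalation` in its capability list sees `escalate_to_human`, while an agent without that capability (or one whose admin override excludes the tool) gets an empty tool list.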
## The shipping tool
| Tool | Required capability | Verticals | What it does |
|---|---|---|---|
| `escalate_to_human` | `ticket_escalation` | `help_center` | Surfaces a "Connect me with a human" button (block type `escalation_button`) and a result the LLM can incorporate. Clicking it triggers the existing lead-capture flow so an operator can claim the conversation. |
More tools land as later phases ship: `lookup_product`, `order_status`, `find_in_docs`, `book_demo`, and so on. Each tool is a single PHP class implementing `App\Services\Tools\Contracts\Tool`.
## How the hot path resolves tool calls
For every visitor turn on a tool-enabled agent, `MessageStreamController` runs a small tool-resolution loop before the streaming final answer:

- Build the OpenAI-style `tools` array from the registry's `forAgent($agent)` result.
- Call `llm->chatWithTools(messages, tools)` non-streaming. The model either returns `tool_calls` (it wants to invoke one or more tools) or `content` (it's ready to answer).
- If `tool_calls`: emit a `tool_call` SSE event for each invocation, run the tool's `execute()`, append the tool result to the message history as a `{role: 'tool'}` message, and loop.
- Once the model returns `content` (or after 3 hops, whichever comes first), fall through to the existing `streamChat` path. The visitor still gets token-by-token streaming for the final answer, so TTFT is preserved.
- Any `block` payloads the tools produced (e.g. `escalation_button`) are emitted as `block` SSE events for the widget to render inline.
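The control flow of that loop can be sketched as follows. This is a TypeScript model of the server-side PHP loop, not the real controller: the LLM client, tool runner, and event emitter are stand-in signatures, and only the hop limit of 3 and the `{role: 'tool'}` message shape come from the description above.

```typescript
// Sketch of the tool-resolution loop; all collaborators are stand-ins.
type Message =
  | { role: 'user' | 'assistant' | 'system'; content: string }
  | { role: 'tool'; tool_call_id: string; content: string };

type ToolCall = { id: string; name: string; arguments: string };
type LlmReply = { toolCalls?: ToolCall[]; content?: string };

const MAX_HOPS = 3; // safety net; well-described tools converge in 1 hop

async function resolveTools(
  messages: Message[],
  tools: object[],
  chatWithTools: (m: Message[], t: object[]) => Promise<LlmReply>,
  runTool: (call: ToolCall) => Promise<string>,
  emit: (event: string, data: unknown) => void,
): Promise<Message[]> {
  for (let hop = 0; hop < MAX_HOPS; hop++) {
    const reply = await chatWithTools(messages, tools);
    if (!reply.toolCalls?.length) break; // model returned content: ready to answer
    for (const call of reply.toolCalls) {
      emit('tool_call', { name: call.name });  // tell the widget a tool is running
      const result = await runTool(call);      // the tool's execute(), server-side
      messages.push({ role: 'tool', tool_call_id: call.id, content: result });
    }
  }
  return messages; // fall through to the streaming path for the final answer
}
```

A model without tool support simply never returns `toolCalls`, so the loop exits on the first hop, which is the graceful-degradation behavior described for Workers AI below.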
## Provider compatibility
| Provider | Tool calling | Notes |
|---|---|---|
| OpenAI (gpt-4o-mini, gpt-4o) | Native | Full OpenAI tools array support via the SDK. |
| OpenRouter | Model-dependent | Tool-capable models (Claude 3.5, Llama 3.3 70B Hermes, etc.) work via the same OpenAI-compatible surface. |
| Cloudflare Workers AI | Model-dependent | Llama 3.3 70B Hermes and a handful of other models support function calling. Models without tool support gracefully degrade — they'll ignore the tools array and return content directly, so the loop simply exits. |
## Widget rendering of blocks

The widget receives `block` SSE events during a turn and attaches each block to the in-flight assistant message. The renderer registry in `resources/widget/src/ui/blocks.tsx` maps block `type` → Preact component. Unknown block types are silently dropped (forward compatibility with newer servers).
The widget's `canRender(capability, agent)` helper now returns `true` when:

- the bundle ships a renderer for that capability, AND
- the agent's server-resolved `capabilities` array opted in.

Both are required: the widget never enables a capability the server didn't authorize, and never tries to render a block whose renderer isn't in the bundle.
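A minimal sketch of that two-gate check, with the renderer registry reduced to a set of capability slugs and the agent shape simplified (both are assumptions; the real helper lives alongside `blocks.tsx`):

```typescript
// Sketch of the widget's two-gate capability check. RENDERABLE stands in
// for the bundle's renderer registry; the agent shape is simplified.
const RENDERABLE = new Set<string>(['ticket_escalation']); // capabilities with a shipped renderer

interface ResolvedAgent {
  capabilities: string[]; // server-resolved opt-ins for this agent
}

function canRender(capability: string, agent: ResolvedAgent): boolean {
  // Gate 1: the bundle must ship a renderer for this capability.
  // Gate 2: the server must have authorized it for this agent.
  return RENDERABLE.has(capability) && agent.capabilities.includes(capability);
}
```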
## Adding a new tool
- Implement `App\Services\Tools\Contracts\Tool` in `app/Services/Tools/Tools/YourTool.php`. Pick a unique `name()`, write a clear `description()` (the LLM uses it to decide when to invoke), declare the `capability()` slug it requires, and define the `schema()` JSON.
- Register the tool in `ToolRegistry::__construct`.
- If your tool's `execute()` returns a `block` payload, ship a renderer for it in `ui/blocks.tsx` and add the relevant capability slug to the `RENDERABLE` set in `capabilities.ts`.
- Add a Pest unit test for the tool and a feature test for the end-to-end flow using `FakeOpenAi::pushToolCall`.
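For the `schema()` step, the OpenAI tools format expects a function name, a description, and a JSON Schema for the parameters. A hypothetical example for the planned `order_status` tool (field values are illustrative, not the shipped schema):

```json
{
  "name": "order_status",
  "description": "Look up the shipping status of an order by its order number.",
  "parameters": {
    "type": "object",
    "properties": {
      "order_number": {
        "type": "string",
        "description": "The visitor's order number, e.g. from their confirmation email."
      }
    },
    "required": ["order_number"]
  }
}
```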
## Latency considerations
The tool loop adds one non-streaming round-trip per hop before the streaming final answer kicks in. For a typical "needs one tool" turn that's roughly +200–500 ms of latency before the visitor sees the first token. The 99% case (no tools used) is unchanged because the registry returns an empty tool list for agents whose capabilities don't match any registered tool.
To keep latency manageable, write tool descriptions tightly so the LLM only invokes a tool when it really needs one. The hop limit (3) is a safety net — well-written tool descriptions should converge in 1 hop.