Building a Streaming LLM Agent with the Laravel AI SDK

A raw language model is a closed book. It can write, reason, and explain, but it
only knows what it absorbed during training, and it can only ever produce text.
Ask it the current time, today's exchange rate, an exact arithmetic result, or
anything that changed last week, and it has two choices: admit it doesn't know,
or guess convincingly. Neither is what you want in a product.

The fix is to stop treating the model as an oracle and start treating it as the
reasoning core of a larger system — one that can take actions, see what comes
back, and reason again. That system is what we call an agent, and this guide
walks through building a real one: a streaming chat assistant that calls tools,
remembers conversations, and renders its answer token by token in the browser.

Agents, and the Laravel AI SDK

The intro called that system an agent. Here is the idea precisely. An LLM agent
is a language model wrapped in a loop that lets it act on the world instead of
answering in a single shot. The model is no longer just predicting the next token
of a reply — at each step it decides whether it has enough information to answer,
or whether it needs to reach for a tool.

The dominant pattern for this is ReAct — short for Reasoning +
Acting. Rather than answering all at once, the model works through a cycle:

Reason — think about what the question actually requires.
Act — if it needs outside information, call a tool (a function you provide) instead of guessing.
Observe — the tool runs and returns a result, which is fed back to the model as a new observation.
Repeat — the model reasons about that observation and either calls another tool or produces its final answer.

A worked example — "What time is it in Tokyo right now?":

User:    What time is it in Tokyo right now?

Reason:  I don't actually know the current time. I should call a clock tool.
Act:     current_datetime(timezone: "Asia/Tokyo")
Observe: Saturday, June 6, 2026 at 11:42 PM (Asia/Tokyo)
Reason:  I now have the real local time. I can answer.
Answer:  It's currently 11:42 PM on Saturday, June 6, 2026 in Tokyo.

The key move is that the model never fakes the time. It recognises the gap in
its knowledge, reaches for a tool, and reasons over the real result. That loop —
reason, act, observe, repeat — is the entire idea behind an agent.
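
To make the cycle concrete, here is a minimal sketch of that loop in plain PHP. It is an illustration only, not how the SDK implements it: the scripted $model closure stands in for a real LLM, and runAgent(), $tools, and $maxSteps are hypothetical names chosen for this example.

```php
<?php

declare(strict_types=1);

/**
 * Drive a reason/act/observe loop until the model answers or the step cap is hit.
 * Hypothetical sketch: $model receives the transcript and returns its next move.
 */
function runAgent(callable $model, array $tools, string $question, int $maxSteps = 8): string
{
    $transcript = ["User: {$question}"];

    for ($step = 0; $step < $maxSteps; $step++) {
        // Reason: the model looks at everything so far and picks its next move.
        $decision = $model($transcript);

        if ($decision['type'] === 'answer') {
            return $decision['text'];
        }

        // Act: run the requested tool, or report that it does not exist.
        $name = $decision['tool'];
        $observation = isset($tools[$name])
            ? $tools[$name]($decision['args'] ?? [])
            : "Unknown tool \"{$name}\".";

        // Observe: feed the result back so the next step can reason over it.
        $transcript[] = "Act: {$name}";
        $transcript[] = "Observe: {$observation}";
    }

    return 'Stopped: step limit reached before a final answer.';
}

// A scripted stand-in for the model: call the clock once, then answer.
$model = function (array $transcript): array {
    $last = end($transcript);

    if (str_starts_with($last, 'Observe: ')) {
        return ['type' => 'answer', 'text' => 'It is '.substr($last, 9).'.'];
    }

    return ['type' => 'tool', 'tool' => 'current_datetime', 'args' => ['timezone' => 'Asia/Tokyo']];
};

// A canned tool so the example is deterministic.
$tools = [
    'current_datetime' => fn (array $args): string => '11:42 PM ('.$args['timezone'].')',
];

$answer = runAgent($model, $tools, 'What time is it in Tokyo right now?');
```

The step cap matters as much as the loop itself: a model that keeps asking for tools gets cut off instead of running forever.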

Building one by hand, though, means solving the same set of problems every time:
speaking each provider's particular HTTP dialect, describing tools in a format the
model understands, parsing tool calls back out of the response, driving the
reason/act/observe loop, and threading conversation history through all of it. The
Laravel AI SDK (laravel/ai) is the first-party package that solves those
problems once, behind an API that feels like the rest of Laravel.

The shift it asks you to make is this: you stop writing procedural code that
calls an API, and you start describing an agent as a class. A system prompt, a
model, and a list of tools become declarations on a PHP class; the SDK reads that
class and does the orchestration. It speaks to every major provider — Anthropic,
OpenAI, Gemini, Groq, Ollama — through the same interface, so switching models is
a one-line change rather than a rewrite.

Until recently, that kind of tooling lived almost entirely in Python — LangChain,
LlamaIndex, the provider-native agent frameworks. For a Laravel team, adopting
them means standing up a second service in a second language and talking to it
over HTTP, which quietly puts your AI logic on the wrong side of a boundary: away
from your Eloquent models, your auth gates, your queue, and your test suite.

This is the Laravel AI SDK's real advantage, and it isn't about cleverer
abstractions: the agent lives inside your application. A tool
is just PHP, so it can query Eloquent, respect a Gate, or dispatch a job directly.
Persistence is migrations and Eloquent models. Streaming rides Livewire. Tests use
the same Pest fakes as everything else. There's no glue service and no
serialization boundary to cross — the model's reasoning and your domain logic run
in the same process, with the same tools you already know.

Each thing an agent needs maps onto a concept the SDK gives you:

The agent itself is a class you prompt. It carries the instructions, the model choice, and the tools, and exposes methods like prompt() and stream().
A tool is a capability you grant that agent — a class with a name, a description, a parameter schema, and a handle() method that does the work.
Streaming lets you consume the reply as it forms, as a sequence of typed events (text fragments, tool calls) rather than one final blob.
Conversation memory persists each turn and replays prior history, so a multi-turn chat just works without you managing state by hand.

The single most important thing the SDK does, though, is run the agent loop for
you. You hand it tools and a step limit; it asks the model what it wants to do,
executes the chosen tool, feeds the result back, and repeats until the model has
a final answer. The whole ReAct cycle happens inside the SDK — you never write the
loop yourself.

It's built on Prism

The SDK doesn't reinvent provider communication from scratch. Underneath, it's
built on Prism, and the two relate the way
Eloquent relates to the Query Builder: Prism is the lower-level engine that
normalises providers and raw LLM calls, and the Laravel AI SDK is the
higher-level, opinionated framework on top — the layer that adds agents, tools,
memory, structured output, streaming, and testing helpers.

That layering is concrete, not just a metaphor: laravel/ai declares
prism-php/prism as a Composer dependency, so a call to $agent->stream()
ultimately drives Prism, which drives the provider's HTTP API:

Your app  →  Laravel AI SDK  →  Prism  →  Anthropic / OpenAI / …
            (agents, tools,     (provider
             memory, streaming)  normalisation)

The practical guidance follows directly from that: build on the Laravel AI SDK.
It's the layer meant for application code, and it's where everything in this guide
lives. Knowing Prism is underneath is useful — if you ever need something the SDK
hasn't surfaced yet, you can drop down to it directly, exactly as you'd
occasionally reach past Eloquent for the Query Builder.

Configuration

Stack: PHP 8.5 · Laravel 13 · Livewire 4 · Laravel AI SDK

Enough theory — let's wire it up, starting with the one piece of setup the SDK
can't infer: where to send requests, and with what key. Both live in your
environment. We'll use Anthropic throughout, but any supported provider works the
same way:

# .env
AI_PROVIDER=anthropic
ANTHROPIC_API_KEY=sk-ant-...

config/ai.php reads those values. The default provider falls back to
anthropic, and the Anthropic driver is wired to the API key:

// config/ai.php
'default' => env('AI_PROVIDER', 'anthropic'),

'providers' => [
    // ...
    'anthropic' => [
        'driver' => 'anthropic',
        'key' => env('ANTHROPIC_API_KEY'),
        'url' => env('ANTHROPIC_URL', 'https://api.anthropic.com/v1'),
    ],
    // ...
],

You rarely reference the string 'anthropic' in code, though. The SDK ships a
type-safe Lab enum (Laravel\Ai\Enums\Lab) that you attach to an agent — which
is exactly what we'll do next.

Building the agent

An agent is one small class. Read it once, then we'll break it down.

// app/Ai/Agents/ChatAgent.php
namespace App\Ai\Agents;

use App\Ai\Tools\Calculator;
use App\Ai\Tools\CurrentDateTime;
use App\Ai\Tools\WikipediaLookup;
use Laravel\Ai\Attributes\MaxSteps;
use Laravel\Ai\Attributes\Model;
use Laravel\Ai\Attributes\Provider;
use Laravel\Ai\Attributes\Temperature;
use Laravel\Ai\Concerns\RemembersConversations;
use Laravel\Ai\Contracts\Agent;
use Laravel\Ai\Contracts\Conversational;
use Laravel\Ai\Contracts\HasTools;
use Laravel\Ai\Enums\Lab;
use Laravel\Ai\Promptable;
use Laravel\Ai\Providers\Tools\WebSearch;
use Stringable;

#[Provider(Lab::Anthropic)]
#[Model('claude-sonnet-4-6')]
#[Temperature(0.7)]
#[MaxSteps(8)]
class ChatAgent implements Agent, Conversational, HasTools
{
    use Promptable, RemembersConversations;

    public function instructions(): Stringable|string
    {
        return <<<'PROMPT'
        You are a friendly, knowledgeable assistant that answers using the ReAct pattern:
        Reason about the question, Act by calling a tool when it helps, observe the
        result, then continue until you can give a clear final answer.

        Guidelines:
        - Think step by step, but keep your final answer concise and well formatted (Markdown).
        - Use the `calculator` tool for any arithmetic instead of computing in your head.
        - Use the `current_datetime` tool whenever the user asks about the current date or time.
        - Use the `wikipedia_lookup` tool for factual background on a specific topic, person, or place.
        - Use web search for recent events or anything that may have changed after your training.
        - If a tool fails or returns nothing useful, say so honestly rather than guessing.
        PROMPT;
    }

    public function tools(): iterable
    {
        return [
            new Calculator,
            new CurrentDateTime,
            new WikipediaLookup,
            (new WebSearch)->max(5),
        ];
    }
}

The attributes

PHP attributes configure the agent declaratively:

#[Provider(Lab::Anthropic)] — which provider to talk to, using the type-safe Lab enum rather than a magic string.
#[Model('claude-sonnet-4-6')] — the specific model.
#[Temperature(0.7)] — sampling randomness; 0.7 is a balanced default for a conversational assistant.
#[MaxSteps(8)] — the ReAct loop bound. Each "step" is one reason→act→observe cycle. With 8, the agent may call tools and reason up to eight times before it must produce a final answer. This is your safety valve against a model that loops forever calling tools.

The contracts and traits

class ChatAgent implements Agent, Conversational, HasTools
{
    use Promptable, RemembersConversations;

implements Agent — marks the class as an agent.
implements HasTools — declares that this agent exposes tools (via tools()).
implements Conversational — declares that this agent participates in persisted, multi-turn conversations.
use Promptable — adds the prompt() and stream() methods you call to run the agent.
use RemembersConversations — automatically persists each user/assistant turn and replays prior history so the model has context. Conversation memory, for free.

The system prompt

instructions() returns the system prompt, and it's doing two jobs at once: it
tells the model to follow the ReAct pattern, and it gives concrete guidance on
which tool to prefer for which kind of question. Good tool descriptions plus
clear prompt guidance are what make the model reach for the right tool at the
right time.

The tools

tools() returns the list the model is allowed to call. Three are custom classes
(Calculator, CurrentDateTime, WikipediaLookup); the fourth, WebSearch, is
a built-in provider tool shipped by the SDK — capped here to five results
with ->max(5). Let's write one.

Writing tools

A tool is any class implementing Laravel\Ai\Contracts\Tool. The contract is
four methods: name(), description(), schema(), and handle(). The model
reads the name, description, and schema to decide whether and how to call the
tool; handle() does the actual work and returns an observation string.

A minimal tool: the current date and time

The clock is the cleanest example of an "observation" tool — the model simply
cannot know the real current time, so it has to ask.

// app/Ai/Tools/CurrentDateTime.php
namespace App\Ai\Tools;

use Carbon\CarbonImmutable;
use Illuminate\Contracts\JsonSchema\JsonSchema;
use Laravel\Ai\Contracts\Tool;
use Laravel\Ai\Tools\Request;
use Stringable;

class CurrentDateTime implements Tool
{
    public function name(): string
    {
        return 'current_datetime';
    }

    public function description(): Stringable|string
    {
        return 'Get the current date and time. Optionally pass an IANA timezone '
            .'(e.g. "Asia/Tokyo", "America/New_York") to get the local time there.';
    }

    public function schema(JsonSchema $schema): array
    {
        return [
            'timezone' => $schema->string()
                ->description('An IANA timezone identifier such as "Europe/Paris". Defaults to UTC.'),
        ];
    }

    public function handle(Request $request): Stringable|string
    {
        $timezone = $request->string('timezone', 'UTC')->toString();

        if (! in_array($timezone, timezone_identifiers_list(), true)) {
            return "Unknown timezone \"{$timezone}\". Please use an IANA identifier like \"Asia/Tokyo\".";
        }

        $now = CarbonImmutable::now($timezone);

        return $now->format('l, F j, Y \a\t g:i A').' ('.$timezone.')';
    }
}

Worth noticing:

name() is the identifier the model uses when it decides to call the tool. Keep it short and snake_case.
description() is sales copy aimed at the model. The more clearly you describe when to use the tool and what each parameter means, the more reliably the model calls it correctly.
schema() declares the parameters using a fluent JSON-schema builder. Here, timezone is an optional string (no ->required()), with its own description so the model knows to pass an IANA identifier.
handle(Request $request) receives the model's arguments as a Request object. $request->string('timezone', 'UTC') reads the argument with a default. The returned string is what the model "observes." Note that we validate the timezone and return a helpful message instead of throwing — the model can read that message and recover.
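
The validate-and-explain behaviour is easy to check outside the SDK. The sketch below pulls the same logic into a plain function using PHP's built-in date classes; describeTime() is a hypothetical name, and passing the instant in as an argument (rather than reading the clock) is an assumption made purely so the result is reproducible.

```php
<?php

declare(strict_types=1);

/**
 * Format an instant for a given IANA timezone, or explain why that is not possible.
 * Hypothetical helper for illustration; the tool above reads the real clock instead.
 */
function describeTime(string $timezone, DateTimeImmutable $now): string
{
    // Validate first: an unknown identifier would make DateTimeZone throw.
    if (! in_array($timezone, timezone_identifiers_list(), true)) {
        return "Unknown timezone \"{$timezone}\". Please use an IANA identifier like \"Asia/Tokyo\".";
    }

    $local = $now->setTimezone(new DateTimeZone($timezone));

    return $local->format('l, F j, Y \a\t g:i A').' ('.$timezone.')';
}

// A fixed instant, so the output does not depend on when this runs.
$instant = new DateTimeImmutable('2026-06-06 14:42:00', new DateTimeZone('UTC'));

echo describeTime('Asia/Tokyo', $instant), PHP_EOL;   // Saturday, June 6, 2026 at 11:42 PM (Asia/Tokyo)
echo describeTime('Mars/Olympus', $instant), PHP_EOL; // the "Unknown timezone" message
```

Either way the caller gets a string back, which is what lets the model read the problem and try again with a valid identifier.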

A tool that calls an external API

Tools can do real I/O. This one reaches out to Wikipedia's REST API and returns a
summary — and, importantly, it handles every failure path gracefully so the model
always gets a usable observation.

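// WikipediaLookup.php (excerpt: extra imports, then two methods of the class)
use Illuminate\Support\Facades\Http;
use Throwable;

// jsonString() is a private helper on this class that is not shown here.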
public function schema(JsonSchema $schema): array
{
    return [
        'topic' => $schema->string()
            ->description('The Wikipedia article title to summarize, e.g. "Great Barrier Reef".')
            ->required(),
    ];
}

public function handle(Request $request): Stringable|string
{
    $topic = $request->string('topic')->trim()->toString();

    if ($topic === '') {
        return 'No topic was provided to look up.';
    }

    try {
        $response = Http::acceptJson()
            ->withHeaders(['User-Agent' => 'LaravelAiSdkDemo/1.0 (tutorial)'])
            ->timeout(15)
            ->get('https://en.wikipedia.org/api/rest_v1/page/summary/'.rawurlencode($topic));
    } catch (Throwable $e) {
        return "Wikipedia lookup for \"{$topic}\" failed: {$e->getMessage()}";
    }

    if ($response->status() === 404) {
        return "No Wikipedia article was found for \"{$topic}\".";
    }

    if ($response->failed()) {
        return "Wikipedia lookup for \"{$topic}\" failed with status {$response->status()}.";
    }

    $extract = $this->jsonString($response, 'extract');

    if ($extract === '') {
        return "No summary is available for \"{$topic}\".";
    }

    $title = $this->jsonString($response, 'title', $topic);
    $url = $this->jsonString($response, 'content_urls.desktop.page');

    return trim("**{$title}**\n{$extract}".($url !== '' ? "\nSource: {$url}" : ''));
}

The pattern here is the lesson: a tool should never throw raw exceptions at
the model. A timeout, a 404, an empty body — each becomes a plain-English string
the model can reason about ("the lookup failed, I'll tell the user honestly"),
exactly as the system prompt instructs. Note that topic is ->required(), and
the tool returns lightly formatted Markdown including a source link.
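
One way to keep those failure paths testable is to separate the branching from the HTTP call. The sketch below is an assumption about how that could look, not the demo's actual code: summaryObservation() is a hypothetical function, and this jsonString() is only a guess at what the unseen helper does (read a string at a dot-separated path, with a default).

```php
<?php

declare(strict_types=1);

/** Hypothetical helper: read a string at a dot-separated path, or fall back. */
function jsonString(array $data, string $path, string $default = ''): string
{
    $value = $data;

    foreach (explode('.', $path) as $key) {
        if (! is_array($value) || ! array_key_exists($key, $value)) {
            return $default;
        }

        $value = $value[$key];
    }

    return is_string($value) ? $value : $default;
}

/** Turn an HTTP status and a decoded JSON body into a plain-English observation. */
function summaryObservation(string $topic, int $status, array $body): string
{
    if ($status === 404) {
        return "No Wikipedia article was found for \"{$topic}\".";
    }

    if ($status >= 400) {
        return "Wikipedia lookup for \"{$topic}\" failed with status {$status}.";
    }

    $extract = jsonString($body, 'extract');

    if ($extract === '') {
        return "No summary is available for \"{$topic}\".";
    }

    $title = jsonString($body, 'title', $topic);
    $url = jsonString($body, 'content_urls.desktop.page');

    return trim("**{$title}**\n{$extract}".($url !== '' ? "\nSource: {$url}" : ''));
}

echo summaryObservation('Atlantis (fictional test page)', 404, []), PHP_EOL;
```

Because every branch returns a string, each one can be covered by a unit test that needs neither the network nor a model.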

A tool that must be safe: the calculator

A calculator lets the model do exact arithmetic. The interesting part is what it
doesn't do — it never calls eval(). Instead it tokenizes the expression and
walks a tiny recursive-descent grammar that only understands numbers, the
operators + - * / % ^, parentheses, and unary minus. Anything else is rejected.

// Calculator.php (excerpt; evaluate() is the private parser, not shown here)
use Throwable;

public function schema(JsonSchema $schema): array
{
    return [
        'expression' => $schema->string()
            ->description('The arithmetic expression to evaluate, e.g. "3 * (4 + 5)".')
            ->required(),
    ];
}

public function handle(Request $request): Stringable|string
{
    $expression = $request->string('expression')->toString();

    try {
        $result = $this->evaluate($expression);
    } catch (Throwable $e) {
        return "Could not evaluate \"{$expression}\": {$e->getMessage()}";
    }

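    // Render whole numbers as plain integers. Check the range first, because
    // casting an out-of-range or non-finite float to int is not reliable.
    $formatted = abs($result) < PHP_INT_MAX && $result == floor($result)
        ? (string) (int) $result
        : (string) $result;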

    return "{$expression} = {$formatted}";
}

The takeaway: model-supplied input is untrusted. A naive eval($expression)
would be a remote code execution hole, because the model (or a user steering it)
controls that string. A private evaluate() method that parses safely is the
right call. The same principle applies any time a tool touches your filesystem,
your shell, or your database — treat the arguments as adversarial.
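
The evaluate() method itself isn't shown above, so here is a minimal sketch of what a tokenizer plus recursive-descent parser for that grammar could look like. It is an assumption, not the demo's implementation: ExpressionParser and its method names are hypothetical, and the precedence choices (unary minus binds looser than ^, and ^ is right-associative) are one reasonable convention among several.

```php
<?php

declare(strict_types=1);

// Hypothetical sketch of a safe arithmetic evaluator; no eval() anywhere.
final class ExpressionParser
{
    private array $tokens;
    private int $pos = 0;

    public function __construct(string $expression)
    {
        preg_match_all('/\d+(?:\.\d+)?|[-+*\/%^()]/', $expression, $matches);
        // Reject anything the tokenizer skipped (letters, quotes, stray dots).
        if (implode('', $matches[0]) !== preg_replace('/\s+/', '', $expression)) {
            throw new InvalidArgumentException('unsupported character in expression');
        }
        $this->tokens = $matches[0];
    }

    public function parse(): float
    {
        $value = $this->expr();
        if ($this->peek() !== null) {
            throw new InvalidArgumentException('unexpected token "'.$this->peek().'"');
        }
        return $value;
    }

    private function peek(): ?string
    {
        return $this->tokens[$this->pos] ?? null;
    }

    // expr := term (('+' | '-') term)*
    private function expr(): float
    {
        $value = $this->term();
        while (in_array($this->peek(), ['+', '-'], true)) {
            $op = $this->tokens[$this->pos++];
            $right = $this->term();
            $value = $op === '+' ? $value + $right : $value - $right;
        }
        return $value;
    }

    // term := unary (('*' | '/' | '%') unary)*
    private function term(): float
    {
        $value = $this->unary();
        while (in_array($this->peek(), ['*', '/', '%'], true)) {
            $op = $this->tokens[$this->pos++];
            $right = $this->unary();
            if ($op !== '*' && $right == 0.0) {
                throw new InvalidArgumentException('division by zero');
            }
            $value = match ($op) {
                '*' => $value * $right,
                '/' => $value / $right,
                '%' => fmod($value, $right),
            };
        }
        return $value;
    }

    // unary := '-' unary | primary ('^' unary)?   ("^" is right-associative)
    private function unary(): float
    {
        if ($this->peek() === '-') {
            $this->pos++;
            return -$this->unary();
        }
        $base = $this->primary();
        if ($this->peek() === '^') {
            $this->pos++;
            return $base ** $this->unary();
        }
        return $base;
    }

    // primary := NUMBER | '(' expr ')'
    private function primary(): float
    {
        $token = $this->peek();
        $this->pos++;
        if ($token === '(') {
            $value = $this->expr();
            if ($this->peek() !== ')') {
                throw new InvalidArgumentException('missing closing parenthesis');
            }
            $this->pos++;
            return $value;
        }
        if ($token === null || ! is_numeric($token)) {
            throw new InvalidArgumentException('expected a number');
        }
        return (float) $token;
    }
}

$result = (new ExpressionParser('3 * (4 + 5)'))->parse(); // 27.0
```

An input such as system("ls") never reaches the parser: the tokenizer check throws first, and the tool's catch block turns that into a plain observation string.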

Built-in provider tools

Not every tool is yours to write. (new WebSearch)->max(5) from earlier is a
provider tool (Laravel\Ai\Providers\Tools\WebSearch) — the provider runs the
search natively. You compose it alongside your custom tools in the same list, and
the model treats them all the same way.

Streaming the answer to the browser

Tools and the agent loop are the brain; streaming is how the user actually
experiences it. Rather than blocking until the full answer is ready, you iterate
over the response and push fragments to the browser as they arrive.

Here's the heart of it — a component method that takes a prompt, runs the agent,
and streams the reply:

use App\Ai\Agents\ChatAgent;
use Illuminate\Support\Str;
use Laravel\Ai\Streaming\Events\TextDelta;
use Laravel\Ai\Streaming\Events\ToolCall;

public function send(string $prompt): void
{
    $prompt = trim($prompt);

    if ($prompt === '') {
        return;
    }

    // The ReAct loop streams over a long-lived request (tool calls, web
    // search, multi-step reasoning); lift the request time limit so the
    // response isn't truncated mid-stream by PHP's max_execution_time.
    set_time_limit(0);

    $agent = $this->conversationId === null
        ? ChatAgent::make()->forUser($this->user())
        : ChatAgent::make()->continue($this->conversationId, as: $this->user());

    $response = $agent->stream($prompt);

    $answer = '';

    foreach ($response as $event) {
        if ($event instanceof ToolCall) {
            $this->stream(to: 'status', content: 'Using '.$event->toolCall->name.'…', replace: true);
        } elseif ($event instanceof TextDelta) {
            // Render the accumulated answer as Markdown on each delta so the
            // live stream shows formatted text (not raw Markdown), matching
            // how the persisted message is rendered once the turn completes.
            $answer .= $event->delta;

            $this->stream(
                to: 'answer',
                content: Str::markdown($answer, ['html_input' => 'escape', 'allow_unsafe_links' => false]),
                replace: true,
            );
        }
    }

    // After the stream finishes, the SDK has persisted the conversation.
    $this->conversationId = $response->conversationId;

    unset($this->messages);

    $this->dispatch('conversation-updated', id: $this->conversationId);
}

Step by step:

set_time_limit(0) — a ReAct turn can involve several tool calls and a web search, so it may outlast PHP's default max_execution_time. Lifting the limit prevents the stream from being cut off mid-answer.
Starting vs continuing a conversation — ChatAgent::make() builds the agent. ->forUser($user) starts a new conversation owned by that user; ->continue($conversationId, as: $user) resumes an existing one so the model sees prior history. The component just tracks a conversationId.
$agent->stream($prompt) — runs the agent and returns an iterable stream instead of a single blob. The Promptable trait provides this.
The event loop — iterating the response yields typed events:
- ToolCall — the model decided to act. Surface a small "Using calculator…" status so the user can see the agent reaching for a tool; $event->toolCall->name is the tool's name().
- TextDelta — a fragment of the final answer. Accumulate fragments into $answer, render the running total as Markdown, and push it to the answer target with replace: true (replace, not append, because we re-render the whole accumulated Markdown each time).
After the loop — because of RemembersConversations, the SDK has already persisted both the user prompt and the assistant reply. Read the (possibly brand-new) conversationId back off the response and refresh the UI.
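
The accumulate-and-replace pattern can be tried without the SDK or Livewire. In this sketch, ToolCallEvent, TextDeltaEvent, and fakeStream() are hypothetical stand-ins for the SDK's typed events and stream, and $frames simply records what each target would show after every update.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-ins for the SDK's typed stream events.
final class ToolCallEvent
{
    public function __construct(public readonly string $name) {}
}

final class TextDeltaEvent
{
    public function __construct(public readonly string $delta) {}
}

/** A fake stream: one tool call, then the answer in three fragments. */
function fakeStream(): Generator
{
    yield new ToolCallEvent('calculator');
    yield new TextDeltaEvent('3 * (4 + 5)');
    yield new TextDeltaEvent(' = ');
    yield new TextDeltaEvent('**27**');
}

$answer = '';
$frames = []; // [target, content] pairs, one per update pushed to the browser

foreach (fakeStream() as $event) {
    if ($event instanceof ToolCallEvent) {
        $frames[] = ['status', 'Using '.$event->name.'…'];
    } elseif ($event instanceof TextDeltaEvent) {
        // Send the whole accumulated answer each time, so the target is
        // replaced rather than appended to.
        $answer .= $event->delta;
        $frames[] = ['answer', $answer];
    }
}

foreach ($frames as [$target, $content]) {
    echo "[{$target}] {$content}", PHP_EOL;
}
```

Sending the full text on every delta costs a little bandwidth, but it means half-finished Markdown (an unclosed ** for example) is re-rendered correctly as soon as the closing fragment arrives.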

Where the stream lands in the markup

Streaming to a named target (to: 'answer', to: 'status') maps onto regions in
the view via wire:stream:

<div x-show="streaming" x-cloak class="flex flex-col gap-2">
    <div x-ref="status" wire:stream="status" class="text-xs font-medium text-accent"></div>
    <div class="max-w-[90%] text-[15px]">
        <span x-show="!hasAnswer" class="inline-flex gap-1 align-middle">
            {{-- animated "typing" dots while we wait for the first token --}}
        </span>
        <div x-ref="answer" wire:stream="answer" class="reply" :class="hasAnswer && 'stream-caret'"></div>
    </div>
</div>

The wire:stream="answer" element receives each streamed update directly in the
browser — no full round-trip per token. A small amount of client-side JavaScript
handles the niceties: showing the user's question optimistically the instant they
hit send, disabling the composer while streaming, auto-scrolling as the reply
grows, and resetting on a conversation switch.

The result: the user sees a "Using calculator…" pill, then the answer typing
itself out live, then a clean persisted message — all from a single method.

Persisting conversations

Notice there was no persistence code in send() — the SDK handled it. Three
pieces make that work.

1. The user owns conversations. The User model uses the SDK's
HasConversations trait, which is what makes ->forUser($user) and
->continue(..., as: $user) work:

use Laravel\Ai\Concerns\HasConversations;

class User extends Authenticatable
{
    use HasConversations, HasFactory, Notifiable;
    // ...
}

2. The agent remembers. The RemembersConversations trait on the agent is
what actually writes each turn and replays history.

3. The schema. A migration extending Laravel\Ai\Migrations\AiMigration
creates two tables — one for conversations, one for messages:

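use Illuminate\Database\Schema\Blueprint;
use Illuminate\Support\Facades\Schema;
use Laravel\Ai\Migrations\AiMigration;

return new class extends AiMigration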
{
    public function up(): void
    {
        $conversationsTable = config('ai.conversations.tables.conversations', 'agent_conversations');
        $messagesTable = config('ai.conversations.tables.messages', 'agent_conversation_messages');

        Schema::create($conversationsTable, function (Blueprint $table) {
            $table->string('id', 36)->primary();
            $table->foreignId('user_id')->nullable();
            $table->string('title');
            $table->timestamps();

            $table->index(['user_id', 'updated_at']);
        });

        Schema::create($messagesTable, function (Blueprint $table) {
            $table->string('id', 36)->primary();
            $table->string('conversation_id', 36)->index();
            $table->foreignId('user_id')->nullable();
            $table->string('agent');
            $table->string('role', 25);
            $table->text('content');
            $table->text('attachments');
            $table->text('tool_calls');
            $table->text('tool_results');
            $table->text('usage');
            $table->text('meta');
            $table->timestamps();

            $table->index(['conversation_id', 'user_id', 'updated_at'], 'conversation_index');
            $table->index(['user_id']);
        });
    }
    // ...
};

Each message row stores the role (user / assistant / tool), the
content, and — useful for a UI — the tool_calls the assistant made. That's how
you can render a "calculator" pill next to a persisted answer: read the saved
tool_calls for that message. You query these through the SDK's
Laravel\Ai\Models\Conversation and Laravel\Ai\Models\ConversationMessage
Eloquent models.

Testing

A feature with this many moving parts — streaming, tool calls, persistence —
needs tests, and the obvious approach is the wrong one: hitting a real model in
tests would be slow, costly, and non-deterministic. Instead, the SDK provides a
fake() helper to swap the agent for a scripted response, plus assertions to
verify it was prompted correctly.

use App\Ai\Agents\ChatAgent;
use Laravel\Ai\Models\ConversationMessage;
use Livewire\Livewire;

it('streams an answer and persists the conversation', function () {
    ChatAgent::fake(['Hello from the agent!']);

    $component = Livewire::test('chat.main', ['userId' => $this->user->id])
        ->call('send', 'Hi there')
        ->assertSee('Hi there')
        ->assertSee('Hello from the agent!')
        ->assertDispatched('conversation-updated');

    expect($component->get('conversationId'))->not->toBeNull();

    ChatAgent::assertPrompted('Hi there');

    expect(ConversationMessage::where('role', 'user')->count())->toBe(1)
        ->and(ConversationMessage::where('role', 'assistant')->count())->toBe(1);
});

What this verifies, end to end:

ChatAgent::fake(['Hello from the agent!']) makes the agent return a canned reply instead of calling the provider.
The flow renders both the user's prompt and the agent's reply, and dispatches the conversation-updated event.
ChatAgent::assertPrompted('Hi there') confirms the agent received the exact prompt.
Both the user and assistant messages were persisted — proving the RemembersConversations wiring works.

There's a matching assertNotPrompted() for the "ignore blank prompts" case, and
tools are best covered by fast unit tests that exercise their handle() logic and
error paths directly — no model needed. Run the suite with:

php artisan test --compact

Extending it: add your own tool

Once the agent is in place, growing its capabilities is a tight, three-step loop:

Create the tool. php artisan make:tool MyTool scaffolds a class. Implement name(), description(), schema(), and handle(), and write the description and schema text carefully, since that's how the model decides when to call it. Return plain strings, including for errors.
Register it in the agent's tools() method by adding new MyTool to the array.
Optionally mention it in the agent's instructions() so the model knows when to prefer it, and test it with a unit test plus a faked-agent feature test.

That's the entire workflow. The SDK handles the reasoning, the tool calls, the
streaming, and the persistence; you provide a well-described agent and a handful
of safe, honest tools — and the ReAct pattern does the rest.

Source code

Everything in this guide comes together in a complete, runnable application — a
minimal ChatGPT-style assistant with a streaming UI, the four tools shown above,
conversation history, and the full test suite. You can clone it, add your API
key, and have a working agent in a few minutes:

Tom Shaw / laravel-aisdk-demo

| Concern | File |
| --- | --- |
| The agent | `app/Ai/Agents/ChatAgent.php` |
| Tools | `app/Ai/Tools/*.php` |
| Chat UI + streaming | `resources/views/components/chat/⚡main.blade.php` |
| Conversation owner | `app/Models/User.php` |
| Persistence schema | `database/migrations/..._create_agent_conversations_table.php` |
| Config | `config/ai.php` |
| Tests | `tests/Feature/ChatTest.php`, `tests/Unit/*` |

Clone it, read the agent class first, then follow a single prompt through
send() and watch the reason/act/observe loop play out token by token.
