Teju's Blog

Full stack engineer and AI architect. Notes from the work.


Building a ReAct agent in Go, with a React UI to watch it work

Most agent tutorials are written in Python. Most agent demos are CLI tools that scroll text past you and leave you to read the log. I wanted something different. A ReAct-pattern agent where Go runs the reasoning loop and a React UI shows the work as it happens, animated, so you can actually watch the thing think.

Yes, two kinds of React in the same sentence. ReAct (the AI pattern, Yao et al., 2022) for the backend, and React.js for the frontend. I am going to write ReAct and React for the rest of the post and pretend that is not confusing.

The pattern, in one paragraph

ReAct interleaves Reasoning and Acting. The model thinks out loud, picks a tool, the tool runs, the model sees the result, and it thinks again. Repeat until it has an answer or runs out of step budget. That is the whole pattern. If you have ever watched a senior engineer debug something, it looks identical: form a hypothesis, run a check, look at the result, update the hypothesis.

In a sequence:

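[Sequence diagram. Participants: User, Agent (Go), LLM, Tool. The User sends a task to the Agent. In a loop (until done or budget hit), the Agent sends messages + tool schemas to the LLM and gets back thinking + tool calls. The Agent runs each call(args) against a Tool in parallel (concurrent calls), collects each result, and does messages += tool results. When the LLM stops asking for tools, the Agent returns the final answer to the User.]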

Why Go for the loop, why React for the trace

Three honest reasons for Go.

First, concurrency. Tool calls are independent. When the model asks for three things at once (search this, read that, hit that API), Go runs them with a sync.WaitGroup and a goroutine per call. The whole step ends when the slowest tool returns. Python can do this too with asyncio, but you pay for it in extra lines and in awkward exception handling.

Second, deployment. One static binary on a small box behind a reverse proxy. No virtualenv, no Dockerfile longer than five lines. Boring is the point. If the agent has to run somewhere with patchy internet for half the day, you can scp the binary and forget about it.

Third, streaming. The standard library’s http.ResponseWriter is fine for Server-Sent Events once you remember to Flush. No framework. The handler is about thirty lines.

React for the trace is more of a taste call. The trace is a sequence of cards appearing in order, sliding others down, sometimes expanding to show a tool’s output. That is exactly what React with motion (formerly framer-motion) is built for. You can do the same in Svelte or vanilla JS. I am not the right person to argue that fight.

The architecture:

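[Architecture diagram. The React UI (with motion.dev) sends POST /run to the Go HTTP server and receives events back over SSE. The server drives the agent loop, which talks to the LLM provider and to three tools: web search, file read, and shell exec.]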

The agent loop in Go

I am skipping the LLM client wrapper because it is not the interesting bit. Assume llm.Chat(ctx, messages, tools) returns a response with optional tool calls. Here is the loop:

go
type Agent struct {
    llm   LLM
    tools map[string]Tool
    max   int          // step budget; saved me at least one $40 bill
    out   chan<- Event // streams to the HTTP handler
}

func (a *Agent) Run(ctx context.Context, task string) error {
    msgs := []Message{{Role: "user", Content: task}}
    schemas := toolSchemas(a.tools)

    for step := 0; step < a.max; step++ {
        resp, err := a.llm.Chat(ctx, msgs, schemas)
        if err != nil {
            return fmt.Errorf("llm: %w", err)
        }
        if resp.Thinking != "" {
            a.out <- Event{Type: "thinking", Content: resp.Thinking}
        }
        if len(resp.ToolCalls) == 0 {
            a.out <- Event{Type: "answer", Content: resp.Content}
            return nil
        }
        msgs = append(msgs, resp.Message)
        results := a.runTools(ctx, resp.ToolCalls)
        msgs = append(msgs, toolResultsMessage(results))
    }
    return errors.New("step budget exceeded")
}

Three things worth pointing out.

The step budget. Given enough rope, a model will call tools in a loop forever. A budget of about twenty is enough for most real tasks and cheap enough that you can ignore the cost. If you hit it, log the trace and look at it. Do not raise the limit.

The out channel. The agent has no idea it is being watched. It writes events to a channel and the HTTP handler reads them and writes SSE frames. The agent stays testable without HTTP, which matters a lot once you start writing evals.

The message order. The assistant message goes back into the conversation, followed by a single user-role message containing every tool result. Some providers want one message per tool result; some want a list. Read the docs once, then forget about it.
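
To make the testability point concrete, here is a rough sketch of driving the loop with a scripted fake LLM and no HTTP anywhere. It is a cut-down stand-in, not the real Agent: the types are simplified, tool execution is replaced by a hard-coded result, and scriptedLLM, run, and collect are names invented for this example.

```go
package main

import (
	"context"
	"errors"
	"fmt"
)

// Minimal stand-ins for the post's types; the field names are assumptions.
type Message struct{ Role, Content string }
type ToolCall struct{ ID, Name, Args string }
type Response struct {
	Message   Message
	Content   string
	ToolCalls []ToolCall
}
type Event struct{ Type, Content string }

type LLM interface {
	Chat(ctx context.Context, msgs []Message) (Response, error)
}

// scriptedLLM replays canned responses, one per call to Chat.
type scriptedLLM struct {
	steps []Response
	calls int
}

func (s *scriptedLLM) Chat(ctx context.Context, msgs []Message) (Response, error) {
	if s.calls >= len(s.steps) {
		return Response{}, errors.New("script exhausted")
	}
	r := s.steps[s.calls]
	s.calls++
	return r, nil
}

// run mirrors the control flow of Agent.Run, with a single hard-coded
// tool result instead of real tool execution.
func run(ctx context.Context, llm LLM, max int, out chan<- Event, task string) error {
	msgs := []Message{{Role: "user", Content: task}}
	for step := 0; step < max; step++ {
		resp, err := llm.Chat(ctx, msgs)
		if err != nil {
			return fmt.Errorf("llm: %w", err)
		}
		if len(resp.ToolCalls) == 0 {
			out <- Event{Type: "answer", Content: resp.Content}
			return nil
		}
		for _, c := range resp.ToolCalls {
			out <- Event{Type: "tool_call", Content: c.Name}
		}
		msgs = append(msgs, resp.Message, Message{Role: "user", Content: "tool result: 42"})
	}
	return errors.New("step budget exceeded")
}

// collect runs the loop against a script and returns every emitted event.
func collect(script []Response, max int) ([]Event, error) {
	out := make(chan Event, 16) // buffered so the loop never blocks in a small test
	err := run(context.Background(), &scriptedLLM{steps: script}, max, out, "test task")
	close(out)
	var evs []Event
	for ev := range out {
		evs = append(evs, ev)
	}
	return evs, err
}

func main() {
	script := []Response{
		{ToolCalls: []ToolCall{{ID: "1", Name: "lookup"}}},
		{Content: "the answer is 42"},
	}
	evs, err := collect(script, 5)
	fmt.Println(evs, err)
}
```

The same harness covers the step budget: feed it a script that only ever asks for tools, set max below the script length, and assert on the error.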

Running tools concurrently

This is the bit where Go pays for itself:

go
func (a *Agent) runTools(ctx context.Context, calls []ToolCall) []ToolResult {
    results := make([]ToolResult, len(calls))
    var wg sync.WaitGroup
    for i, call := range calls {
        wg.Add(1)
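        go func(i int, call ToolCall) {
            defer wg.Done()
            a.out <- Event{Type: "tool_call", ID: call.ID, Name: call.Name, Args: call.Args}
            // Models sometimes name a tool that does not exist. Report that back
            // as a tool error rather than calling Execute on a missing map entry,
            // which panics if Tool is an interface.
            tool, ok := a.tools[call.Name]
            if !ok {
                err := fmt.Errorf("unknown tool %q", call.Name)
                a.out <- Event{Type: "tool_result", ID: call.ID, Err: err.Error()}
                results[i] = ToolResult{ID: call.ID, Err: err}
                return
            }
            out, err := tool.Execute(ctx, call.Args)
            a.out <- Event{Type: "tool_result", ID: call.ID, Output: out, Err: errString(err)}
            // Each goroutine writes only its own slot, so no mutex is needed.
            results[i] = ToolResult{ID: call.ID, Output: out, Err: err}
        }(i, call)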
    }
    wg.Wait()
    return results
}

Yes, you should add a per-tool timeout. Yes, you should cancel siblings when the request context is cancelled. Both are about five more lines and I left them out to keep this readable. The shape is the part that matters.

Streaming events to the browser

The HTTP handler runs the agent in a goroutine and pumps events from the channel out as SSE:

go
func (s *Server) handleRun(w http.ResponseWriter, r *http.Request) {
    w.Header().Set("Content-Type", "text/event-stream")
    w.Header().Set("Cache-Control", "no-cache")
    flusher, ok := w.(http.Flusher)
    if !ok {
        http.Error(w, "streaming not supported", http.StatusInternalServerError)
        return
    }

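    // The browser's EventSource can only issue GET requests, so read the task
    // from the query string first and fall back to a JSON body for POST clients.
    var body struct {
        Task string `json:"task"`
    }
    body.Task = r.URL.Query().Get("task")
    if body.Task == "" {
        if err := json.NewDecoder(r.Body).Decode(&body); err != nil {
            http.Error(w, err.Error(), http.StatusBadRequest)
            return
        }
    }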

    events := make(chan Event, 32)
    go func() {
        defer close(events)
        if err := s.newAgent(events).Run(r.Context(), body.Task); err != nil {
            events <- Event{Type: "error", Content: err.Error()}
        }
    }()

    for ev := range events {
        b, _ := json.Marshal(ev)
        fmt.Fprintf(w, "event: %s\ndata: %s\n\n", ev.Type, b)
        flusher.Flush()
    }
    fmt.Fprint(w, "event: done\ndata: {}\n\n")
    flusher.Flush()
}

Notice that the agent gets a fresh events channel per request. Sharing channels across requests is one of those things that works fine until two people use your app at once.

The React UI

The UI shows events as they arrive and animates the transitions so the user can see the agent thinking. Motion (motion.dev) handles the animation. The rest is a small hook and a list.

The hook subscribes to SSE:

tsx
import { useEffect, useState } from 'react';

export type AgentEvent =
  | { id: string; type: 'thinking'; content: string }
  | { id: string; type: 'tool_call'; name: string; args: unknown }
  | { id: string; type: 'tool_result'; output: unknown; err?: string }
  | { id: string; type: 'answer'; content: string }
  | { id: string; type: 'error'; content: string };

export function useAgentStream(task: string) {
  const [events, setEvents] = useState<AgentEvent[]>([]);
  const [status, setStatus] = useState<'idle' | 'running' | 'done' | 'error'>('idle');

  useEffect(() => {
    if (!task) return;
    setEvents([]);
    setStatus('running');

    const src = new EventSource(`/api/agent/run?task=${encodeURIComponent(task)}`);
    const push = (e: MessageEvent) => {
      const data = JSON.parse(e.data);
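      // Spread first: if the server sends its own id (a tool_call and its
      // tool_result share the call ID), it must not overwrite the unique React key.
      setEvents(prev => [...prev, { ...data, id: crypto.randomUUID() }]);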
    };

    for (const t of ['thinking', 'tool_call', 'tool_result', 'answer', 'error'] as const) {
      src.addEventListener(t, push);
    }
    src.addEventListener('done', () => { setStatus('done'); src.close(); });
    src.onerror = () => { setStatus('error'); src.close(); };
    return () => src.close();
  }, [task]);

  return { events, status };
}

And the view:

tsx
import { AnimatePresence, motion } from 'motion/react';

export function Trace({ task }: { task: string }) {
  const { events, status } = useAgentStream(task);
  return (
    <div className="trace">
      <AnimatePresence initial={false}>
        {events.map(ev => (
          <motion.div
            key={ev.id}
            layout
            initial={{ opacity: 0, y: 12 }}
            animate={{ opacity: 1, y: 0 }}
            exit={{ opacity: 0, y: -8 }}
            transition={{ type: 'spring', stiffness: 380, damping: 30 }}
            className={`step step-${ev.type}`}
          >
            <Step event={ev} />
          </motion.div>
        ))}
      </AnimatePresence>
      {status === 'running' && <Pulse />}
    </div>
  );
}

The single most important prop here is layout. When a new step pushes the others up, motion animates the slide instead of cutting. That is what makes the page feel alive. Without it you have a chat log; with it you have something that looks like the agent is doing the work in front of you.

The Step component switches on ev.type and renders accordingly. Small icon on the left, thinking content in italics, tool calls as collapsible JSON, the final answer in a card with a soft glow. None of that is interesting code; if you want a starting point, look at how Vercel’s AI SDK renders tool calls and steal from there.

A real run

The question I have been testing with: “What was the weather in Bangalore last week, and is that unusual for May?”

The agent’s trace, edited for length:

think:   I need current weather data plus a historical average. Two parallel calls.
call:    weather_history(city="Bangalore", start="2026-05-15", end="2026-05-22")
call:    weather_normals(city="Bangalore", month="May")
result:  { mean: 25.4C, max: 32.1C, rainfall: 18mm }
result:  { mean_normal: 27.2C, rainfall_normal: 110mm }
think:   Cooler than normal, much drier. Let me confirm with one more source.
call:    web_search("Bangalore May 2026 monsoon delay")
result:  [news links about weak pre-monsoon showers]
think:   Consistent picture.
answer:  Last week ran about 1.8C below the May normal, with 18mm of rain
         against an average of 110mm. Locals would call it unusual: pre-monsoon
         showers have been weak, which several news outlets noted this week.

Two things to notice. The first reasoning step asks for both data calls in the same response, which means they run in parallel in Go. The second think step is one line and decides to grab one more source. That is the whole point of ReAct: the model gets to course-correct mid-task instead of committing to a plan it cannot revise.

What I would do differently

A short list, because the hour is late.

  • Use structured tool calls, not JSON-in-text. Every modern provider has a tool-use API. Hand-rolled JSON parsing breaks on the day the model decides to add a trailing comma. Ask me how I know.
  • A wall-clock budget on top of the step budget. Twenty steps at two seconds each is forty seconds. Twenty steps that each wait on an upstream API with a P99 of ten seconds can take over three minutes, which is enough to time out the entire request.
  • Persist runs. Log the full event stream to a small SQLite table. You will want to replay a flaky run, and the model never produces the same trace twice.
  • Hide the filler thoughts. “Let me think about this” is not useful for a human reader. Show only thinking steps that contain a verb or a number. Cheap heuristic, big quality win.
  • Cancel everything on disconnect. When the user closes the tab, propagate the cancellation through r.Context() so in-flight tool calls stop. Otherwise you keep paying for them.
