AI Agents & Orchestration
State machines have predefined transitions. Agents decide autonomously. You'll implement multiple agent patterns (ReAct, plan-and-execute, tool-use), add safety guardrails against infinite loops and cost runaway, and build a research agent that gathers, summarizes, and compares information.
Use this at work tomorrow
Build a Slack bot agent that answers questions by searching your docs, Jira, and codebase.
Learning Objectives
1. Implement the ReAct loop: reason → act → observe → repeat
2. Build plan-and-execute agents that decompose complex tasks
3. Add safety guardrails: max iterations, cost budgets, timeout limits
4. Handle agent failure modes: infinite loops, hallucinated actions, cost explosions
5. Ship a research agent that automates real information gathering
Ship It: Research automation agent
By the end of this day, you'll build and deploy a research automation agent. This isn't a toy — it's a real project for your portfolio.
I can build an AI agent with a ReAct loop, choose the right agent pattern for a task, and implement guardrails to prevent runaway costs and infinite loops.
What makes an AI agent different from a hardcoded workflow?
Agents = Autonomous Loops
State machines transition between predefined states that YOU define. Agents are autonomous loops — the LLM observes the current state, reasons about what to do, takes an action (tool call), observes the result, and repeats until the task is done. It's a dynamic state machine where the LLM is the transition function.
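The loop above can be sketched in a few lines of TypeScript. Here `decide()` is a stub standing in for the real LLM call, and `searchWeb` is a hypothetical tool — the point is the shape of the loop, not the model:

```typescript
// Minimal agent loop. decide() is a stub standing in for the LLM:
// given the observations so far, it returns the next action.
type Action =
  | { tool: string; input: string }
  | { done: true; answer: string };

function decide(observations: string[]): Action {
  // A real agent would call the model here. This stub searches once,
  // then finishes with whatever it observed.
  if (observations.length === 0) {
    return { tool: "searchWeb", input: "react hooks" };
  }
  return { done: true, answer: observations.join("; ") };
}

// Hypothetical tool registry
const tools: Record<string, (input: string) => string> = {
  searchWeb: (q) => `results for "${q}"`,
};

function runAgent(maxSteps = 10): string {
  const observations: string[] = [];
  for (let step = 0; step < maxSteps; step++) {
    const action = decide(observations);                 // reason
    if ("done" in action) return action.answer;          // stop condition
    observations.push(tools[action.tool](action.input)); // act + observe
  }
  throw new Error("maxSteps exceeded");                  // guardrail, not an exit path
}

console.log(runAgent()); // results for "react hooks"
```

Note that the transition logic lives entirely inside `decide()` — swap the stub for a model call and the same loop becomes a real agent.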
In an agent loop, who decides the next action?
You're building a research bot that needs to search, read pages, and write a report. Which pattern?
Agent Patterns: ReAct, Plan-and-Execute, Tool-Use
ReAct (Reason + Act) is the simplest: think → act → observe → repeat. Plan-and-Execute decomposes the task into a plan first, then executes the steps. Tool-Use agents drop the explicit reasoning structure entirely — just give the LLM tools and let it call them iteratively. Each pattern suits a different complexity level: ReAct for simple tasks, Plan-and-Execute for complex multi-step work.
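The plan-then-execute split can be sketched like this. `plan()` is hardcoded where a real planner would call the model, and all tool names are illustrative:

```typescript
// Plan-and-Execute sketch: build the full plan up front, then run it.
type Step = { tool: "searchWeb" | "readPage" | "writeReport"; input: string };

function plan(task: string): Step[] {
  // A real planner would be an LLM call; hardcoded for illustration.
  return [
    { tool: "searchWeb", input: task },
    { tool: "readPage", input: "top result" },
    { tool: "writeReport", input: "notes" },
  ];
}

// Hypothetical tool implementations
const tools = {
  searchWeb: (q: string) => `links for ${q}`,
  readPage: (u: string) => `content of ${u}`,
  writeReport: (n: string) => `report from ${n}`,
};

function planAndExecute(task: string): string[] {
  // Every step is known before execution starts — unlike ReAct,
  // where each step is decided after observing the last result.
  return plan(task).map((step) => tools[step.tool](step.input));
}

console.log(planAndExecute("agent safety"));
```

The trade-off: the plan is inspectable (and cheap to audit) before anything runs, but it can't react to surprises mid-flight without a re-planning step.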
Which agent pattern creates a full plan before executing anything?
Your agent runs 15 steps and costs $2.50 for a simple question. What guardrail failed?
Safety Guardrails: Preventing Agent Disasters
Agents can go catastrophically wrong: infinite loops (keeps calling tools forever), cost explosions ($50 in API calls for one query), hallucinated actions (calling tools that don't exist with wrong parameters), and unsafe actions (deleting data it shouldn't). Always implement: max iteration limits, cost budgets per request, action allowlists, and human-in-the-loop for destructive actions.
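Those guardrails can be sketched as a small checker run before every step. The tool names, budget, and thresholds here are illustrative assumptions:

```typescript
// Guardrail sketch: iteration cap, cost budget, and tool allowlist,
// all enforced before a step is allowed to run.
const ALLOWED_TOOLS = new Set(["searchWeb", "readPage"]);

class Guardrails {
  private steps = 0;
  private spentUsd = 0;

  constructor(private maxSteps = 10, private budgetUsd = 0.5) {}

  checkStep(tool: string, estCostUsd: number): void {
    if (++this.steps > this.maxSteps) throw new Error("max iterations exceeded");
    if (!ALLOWED_TOOLS.has(tool)) throw new Error(`tool not allowed: ${tool}`);
    this.spentUsd += estCostUsd;
    if (this.spentUsd > this.budgetUsd) throw new Error("cost budget exceeded");
  }
}

const g = new Guardrails(3, 0.1); // 3 steps max, $0.10 budget
g.checkStep("searchWeb", 0.04);   // ok
g.checkStep("searchWeb", 0.04);   // ok
try {
  g.checkStep("searchWeb", 0.04); // third call pushes spend past $0.10
} catch (e) {
  console.log((e as Error).message); // cost budget exceeded
}
```

Human-in-the-loop for destructive actions follows the same shape: instead of throwing, the checker pauses and asks for approval before letting the step through.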
An agent calls searchWeb('react hooks') three times in a row. What guardrail catches this?
Agent Failure Modes: What Goes Wrong
The most common agent failures: (1) Infinite loops — agent keeps trying the same failed approach. Fix: max iterations + detect repeated actions. (2) Cost runaway — agent calls expensive tools many times. Fix: token/cost budget per conversation. (3) Hallucinated tools — agent tries to call tools that don't exist. Fix: strict tool schema validation. (4) Context overflow — agent accumulates too many tool results. Fix: summarize tool results before adding to context.
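The repeated-action detection from fix (1) can be sketched as a check over the call history — the threshold of 3 identical calls in a row is an assumption:

```typescript
// Detect a stuck agent: the same tool call repeated N times in a row.
function isLooping(history: string[], threshold = 3): boolean {
  if (history.length < threshold) return false;
  const recent = history.slice(-threshold);
  return recent.every((call) => call === recent[0]);
}

console.log(isLooping([
  "searchWeb('react hooks')",
  "searchWeb('react hooks')",
  "searchWeb('react hooks')",
])); // true — abort, or re-prompt with a nudge to vary the approach
```

Serializing each call as `tool(args)` before comparing means "same tool, different arguments" doesn't trip the check — only genuinely identical retries do.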
The Full Evolution
Watch one function evolve through every concept you just learned.
Production Gotchas
Always set maxSteps (Vercel AI SDK) or a max-iterations cap; 10 steps is a good default. Log every agent step — you'll need the trace to debug failures. Agents are expensive: a 5-step agent call uses at least 5x the tokens of a single call, because each step re-sends the growing context. Use cheaper models (GPT-4o-mini) for simple agent tasks and expensive models (GPT-4o) only when reasoning quality matters. Consider multi-agent architectures: a 'planner' agent decides what to do, and 'worker' agents execute individual tools.
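The step-logging advice can be sketched as a plain in-memory trace. In the Vercel AI SDK you would populate this from a per-step callback such as onStepFinish; the entry shape here is illustrative:

```typescript
// In-memory step trace: enough detail to replay a failed run.
type StepTrace = { step: number; tool: string; input: string; result: string };

const trace: StepTrace[] = [];

function logStep(entry: StepTrace): void {
  trace.push(entry);
  console.log(`[step ${entry.step}] ${entry.tool}(${entry.input}) -> ${entry.result}`);
}

// Illustrative entries — a real agent would emit one per loop iteration
logStep({ step: 1, tool: "searchWeb", input: "react hooks", result: "5 links" });
logStep({ step: 2, tool: "readPage", input: "link #1", result: "2k tokens of text" });
```

In production you'd persist this trace (not just keep it in memory) so a failed run can be inspected after the fact.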
Code Comparison
State Machine vs Agent Loop
Predetermined flow vs autonomous decision-making
// Traditional state machine
// YOU define all states and transitions
type State =
  | "INIT" | "SEARCH" | "ANALYZE"
  | "SUMMARIZE" | "DONE";

let state: State = "INIT";
let data: unknown;
let insight: unknown;

while (state !== "DONE") {
  switch (state) {
    case "INIT":
      await gatherRequirements();
      state = "SEARCH";
      break;
    case "SEARCH":
      data = await searchDB(query);
      state = "ANALYZE";
      break;
    case "ANALYZE":
      insight = analyze(data);
      state = "SUMMARIZE";
      break;
    case "SUMMARIZE":
      await createReport(insight);
      state = "DONE";
      break;
  }
}
// Fixed path: INIT→SEARCH→ANALYZE→SUMMARIZE→DONE
// Can't adapt if search returns nothing

// AI Agent loop — LLM decides each step
import { generateText } from "ai";
import { openai } from "@ai-sdk/openai";

const { text } = await generateText({
  model: openai("gpt-4o"),
  tools: {
    searchWeb: searchTool,
    analyzeData: analyzeTool,
    writeReport: reportTool,
    askUser: clarifyTool,
  },
  maxSteps: 10, // Safety limit
  system: `You are a research agent.
Given a research question:
1. Search for relevant information
2. Analyze what you find
3. If results are insufficient,
   search with different terms
4. Write a summary report
Always cite your sources.`,
  prompt: userQuestion,
});
// Dynamic: might search 3 times,
// ask user for clarification,
// skip analysis if data is clear
// Adapts to what it finds!

KEY DIFFERENCES
- State machines: YOU define all states and transitions upfront
- Agents: the LLM decides the next action based on what it observes
- Both loop until completion — agents just have dynamic transitions
- maxSteps prevents infinite loops — always set this in production
Bridge Map: State machines / workflows → Agent loops + orchestration
Hands-On Challenges
Build, experiment, and get AI-powered feedback on your code.
Research Automation Agent
Build and deploy an autonomous research agent that takes a topic, searches the web for information, reads and extracts content from multiple sources, and produces a structured research summary with citations. This is real AI agent engineering.
Acceptance Criteria
- Accept a research topic or question as input
- Implement 3+ agent tools: web search, page reading, note-taking
- Use the ReAct loop pattern (reason → act → observe → repeat) with maxSteps
- Add safety guardrails: max iterations, cost budget, timeout limits
- Show the agent's reasoning trace in real-time (which tools it called and why)
- Produce a structured research report with sourced findings
- Deploy to a public URL (Vercel, Netlify, etc.)
Build Roadmap
Create a new Next.js app with TypeScript and Tailwind CSS. Plan the agent architecture: tools, loop control, and output format.
npx create-next-app@latest research-agent --typescript --tailwind --app
Create separate files for tools, agent logic, and the UI.
Deploy Tip
Push to GitHub and import into Vercel. Set OPENAI_API_KEY and any search API keys in Vercel environment variables. Add rate limiting on the endpoint — agent calls can be expensive.