Agentic Loops & the Claude Agent SDK: Exam Domain 1 Deep Dive

Q: How should an agentic loop decide when to terminate?

Check stop_reason on each response: continue executing tools while it is 'tool_use' and terminate when it is 'end_turn'. Never terminate based on parsing the model's natural language, the presence of text content, or an iteration cap - caps are a safety circuit breaker, not a termination condition.

Q: Why must tool results be appended to conversation history?

The model reasons about its next action from accumulated context. Each loop iteration it re-reads everything it has tried and learned; dropping tool results removes exactly the information it requested and causes it to repeat work.

Q: When should I use prompt chaining versus dynamic decomposition?

Prompt chaining (a fixed sequential pipeline) suits predictable multi-aspect workflows like large code reviews. Dynamic adaptive decomposition suits open-ended investigation, where subtasks should be generated from what each step discovers.

← Back to Claude Certified Architect Course

Lesson 1 of the Claude Certified Architect – Foundations course. Domain 1 (Agentic Architecture & Orchestration) is the heaviest domain on the exam at 27% of scored content — roughly 16 of your 60 questions. This lesson covers the first half of the domain: the agentic loop itself. Lesson 2 covers orchestration, hooks, and sessions.

Back to Course Overview

The Agentic Loop Lifecycle

An agentic loop is the control flow that lets Claude autonomously execute multi-step tasks. The lifecycle the exam expects you to know cold:

Send a request to Claude with the conversation history and available tools
Inspect stop_reason on the response
If stop_reason is "tool_use" → execute the requested tool(s), append the results to conversation history, and loop back to step 1
If stop_reason is "end_turn" → the model has finished; terminate the loop and present the final response

python

messages = [{"role": "user", "content": task}]

while True:
    response = client.messages.create(
        model="claude-sonnet-5",
        max_tokens=4096,
        tools=tools,
        messages=messages,
    )
    messages.append({"role": "assistant", "content": response.content})

    if response.stop_reason == "tool_use":
        tool_results = execute_tools(response.content)   # run each tool_use block
        messages.append({"role": "user", "content": tool_results})
        continue                                          # loop: model reasons over results
    elif response.stop_reason == "end_turn":
        break                                             # task complete

Two properties of this loop matter architecturally:

Tool results are appended to conversation history. That is how the model reasons about its next action — each iteration, Claude sees everything it has tried and everything it learned. If you drop tool results from history, the model loses the information it requested.
Decision-making is model-driven. Claude reasons about which tool to call next based on accumulated context. This is fundamentally different from a pre-configured decision tree or hard-coded tool sequence — and the exam repeatedly tests whether you know when model-driven flexibility is the right choice versus when deterministic enforcement is required (see Lesson 2 on hooks).

Loop Termination: The Anti-Patterns the Exam Loves

Exam distractors in this area are drawn from real mistakes. All three of these are wrong ways to decide the loop is finished:

Anti-pattern	Why it fails
Parsing natural-language signals ("looks like Claude said 'done'")	Brittle; the model's phrasing isn't a contract. `stop_reason` is.
Arbitrary iteration caps as the primary stop (e.g., "stop after 10 turns")	Caps are a safety net, not a termination condition. Tasks legitimately vary in length.
Checking for assistant text content ("if the response has text, we're done")	Responses can contain text and tool calls. Text presence signals nothing.

Exam tip: When a question asks "how should the loop decide to continue or terminate?", the correct answer virtually always centers on stop_reason == "tool_use" vs "end_turn". Distractors dress up the anti-patterns above in plausible language.

An iteration cap is fine as a circuit breaker against runaway loops — the error is making it your primary mechanism.

Model-Driven vs Pre-Configured Decisions

A recurring tradeoff on the exam:

Model-driven — Claude chooses tools and ordering from context. Right default for high-ambiguity work (support requests, research, debugging) where the path can't be known in advance.
Pre-configured / programmatic — code enforces ordering or blocks actions. Required when compliance must be deterministic: identity verification before refunds, spend thresholds, regulated actions. Prompt instructions alone have a non-zero failure rate, which is unacceptable when errors carry financial or legal consequences.

The pattern to remember: prompts guide, hooks and gates guarantee. If the scenario mentions money, compliance, or "must never happen," the answer is programmatic enforcement, not a stronger prompt.

Task Decomposition Strategies

Domain 1 also tests how you split complex work:

Prompt chaining (fixed sequential pipeline). Break a predictable multi-aspect job into ordered focused passes. Canonical example: large code review → analyze each file individually for local issues, then one cross-file integration pass. Fixed pipelines suit workflows whose structure is known upfront.

Dynamic adaptive decomposition. Generate subtasks based on what each step discovers. Canonical example: "add comprehensive tests to a legacy codebase" → first map the structure, identify high-impact areas, produce a prioritized plan, and adapt it as dependencies surface. Use this when the workflow is open-ended investigation.

Choosing between them is a judgment call the exam frames in scenarios: predictable multi-aspect review → chaining; open-ended exploration → adaptive decomposition. Splitting reviews per-file also avoids attention dilution — a 14-file single-pass review produces inconsistent depth and contradictory findings, which is a Domain 4/5 crossover topic.

The Agent SDK Building Blocks

The Claude Agent SDK packages the loop plus production machinery. For the exam, know these primitives (details in Lesson 2):

AgentDefinition — a subagent's description, system prompt, and tool restrictions
Task tool — the mechanism a coordinator uses to spawn subagents; the coordinator's allowedTools must include "Task"
Hooks — e.g., PostToolUse for intercepting and transforming tool results, and tool-call interception for enforcing business rules
Sessions — --resume <session-name> for named resumption, fork_session for parallel exploration branches

Hands-On Exercise

Build a minimal agent with 3 tools (get_customer, lookup_order, process_refund):

Implement the loop with correct stop_reason handling.
Log every iteration: which tool was called, and why (ask the model to state its reasoning).
Deliberately break it: remove tool results from history and observe the model re-requesting the same tool — this cements why results must be appended.
Add an iteration cap of 20 as a circuit breaker and confirm it never triggers in normal operation.

Worked Exam Question

Your agent processes warranty claims. QA reports that some sessions end with the final response cut off mid-sentence, and logs show those responses have stop_reason "max_tokens". Your loop currently treats anything that is not "tool_use" as completion. What should you change?

A. Raise max_tokens and treat "max_tokens" as a distinct case — the response is truncated, not complete, so completion logic must check for "end_turn" specifically.
B. Ask the model to keep responses short so the limit is never reached.
C. Treat "max_tokens" as "end_turn" since the model had finished reasoning.
D. Retry the entire conversation from the start when truncation is detected.

Answer: A. The bug is the catch-all: "not tool_use" is not the same as "finished." A response that stopped for hitting max_tokens is incomplete output, and the loop should handle it explicitly (continue generation or raise the limit) rather than shipping a truncated answer. Option B treats a control-flow bug as a prompting problem, C ships truncated responses by design, and D discards a valid conversation over a recoverable condition. The general lesson mirrors the loop anti-patterns above: termination decisions belong to explicit stop_reason checks, never to assumptions.

Quick Reference: Loop Control Decisions

Signal	Meaning	Correct action
stop_reason "tool_use"	Model needs tools executed	Run tools, append results, continue loop
stop_reason "end_turn"	Model has finished the task	Terminate, present response
Iteration count hits cap	Possible runaway loop	Circuit-break with an error — investigate, don't treat as success
Response contains text	Nothing — text can coexist with tool calls	Never use as a signal
Model says "done" in prose	Nothing — phrasing is not a contract	Never parse for it

One more habit worth building before exam day: when a scenario describes a loop bug, first ask what is the termination condition actually checking? Nearly every Domain 1 loop question resolves to a mismatch between what the code checks and what stop_reason reports.

Key Takeaways for the Exam

Continue on "tool_use", terminate on "end_turn" — never on text parsing, text presence, or iteration caps.
Tool results must be appended to history so the model can reason over them.
Model-driven tool selection is the default; programmatic enforcement when compliance must be guaranteed.
Prompt chaining for predictable pipelines; dynamic decomposition for open-ended investigation.

Next: Lesson 2 — Multi-Agent Orchestration & Hooks

Frequently Asked Questions

How should an agentic loop decide when to terminate?

Check stop_reason on every response: continue executing tools while it returns "tool_use" and terminate only on "end_turn". The exam repeatedly tests the three anti-patterns — parsing the model's prose for completion phrases, treating any text content as completion, and using iteration caps as the primary stop. Caps belong in the design only as circuit breakers against runaway loops, and tripping one should be treated as an error to investigate, never as success.

Why must tool results be appended to conversation history?

Because the conversation history is the model's working memory. Claude decides its next action by re-reading everything it has tried and everything those attempts returned. Drop a tool result and the model will either re-request the same tool or reason from a gap. This is also why trimming verbose tool outputs (covered in Lesson 6) matters: results must be present, but they don't have to be bloated.

When should I use prompt chaining versus dynamic decomposition?

Prompt chaining — a fixed sequential pipeline — fits workflows whose structure is known before you start, like reviewing each file of a pull request and then running a cross-file integration pass. Dynamic decomposition fits open-ended work where the right subtasks only emerge from discovery, like adding tests to an unfamiliar legacy codebase. On the exam, look for the phrase "predictable" or a stated fixed sequence (chaining) versus "depends on what is found" (dynamic).