Annex — The agentic AI landscape
Duration: ~20 min Prerequisites: chapter 09 — Demo 3 and the LangChain & LangGraph annex
The previous annex zoomed into LangChain and LangGraph specifically. This one zooms out: what is an agent, what is the universal loop they all share, what other frameworks exist in the ecosystem, and how to pick the right one for a given problem. The chapter closes with a compact glossary that ties together every term you encountered in this course.
1. What “Agentic AI” really means
Section titled “1. What “Agentic AI” really means”The expression Agentic AI is shorthand for the natural evolution of conversational AI.
| Generation | Behaviour |
|---|---|
| Classical chatbot | You speak, it replies. That is all. |
| Bare LLM | Same idea, with much richer answers. |
| RAG-augmented LLM | Same idea, with access to your private documents. |
| Agent | It observes, decides, acts, then loops. |
1.1. The plain definition
Section titled “1.1. The plain definition”An AI agent is a system that can make decisions and execute actions to reach a goal, without being told step by step what to do.
Instead of saying “do X”, you tell it “reach goal Y”, and it picks the intermediate actions itself.
1.2. A concrete contrast
Section titled “1.2. A concrete contrast”Without an agent (bare LLM):
- You: “how many LangChain job postings are open in Paris?”
- LLM: “I cannot search in real time. I can give you a rough estimate based on what I knew at training time…”
With an agent (LLM + tools):
- You: “how many LangChain job postings are open in Paris?”
- Agent (reasoning): “I need a web search.”
- Agent: calls
tavily_search("LangChain job postings Paris"). - Agent (reading the result): “23 postings found across LinkedIn and Indeed.”
- Agent: produces the final answer with sources.
The agent in demo 3 does exactly this kind of decision-making over four file-system tools instead of a web-search tool.
2. The universal agent loop — Reason, Act, Observe
Section titled “2. The universal agent loop — Reason, Act, Observe”Every agent, regardless of framework, follows the same base loop.
flowchart TB Start([User question]) --> Loop subgraph Loop["Agent loop"] direction TB Reason{{"1. Reason"}} Reason -->|"decides to use a tool"| Act["2. Act (call the tool)"] Act --> Observe["3. Observe (read the result)"] Observe --> Reason end Reason -->|"has the final answer"| End([Answer])| Step | What happens |
|---|---|
| Reason | The LLM decides: “do I have enough information to answer, or do I need to act?” |
| Act | If acting: it calls one tool (web search, SQL, Python execution, file write…). |
| Observe | The tool’s output is fed back into the conversation. |
| Repeat | The LLM re-reads everything and decides again. |
| Stop | When it has the final answer, it exits the loop. |
This pattern has a name: ReAct — short for Reasoning + Acting. It comes from a 2022 paper by Yao et al. (Princeton + Google Research). The more advanced patterns (Reflection, Reflexion, Plan-and-Execute, Self-Ask) are variations on this same loop.
The 30-line agent loop in demo 3 is a direct, minimal implementation of ReAct. LangChain’s create_agent, LangGraph’s create_react_agent, AutoGen’s AssistantAgent — they all run the same loop under the hood, with more bookkeeping around it.
3. The wider framework ecosystem
Section titled “3. The wider framework ecosystem”LangChain is not alone. Here are the main actors you will meet in the wild.
| Framework | Style / specialty | Strengths | Limits |
|---|---|---|---|
| LangChain | Chain composition (LCEL) + agents | Massive ecosystem, multi-provider, hundreds of integrations | Can feel heavy for very simple cases |
| LangGraph | Stateful graph orchestration | Complex workflows, multi-agent, fine control, shared state | Steeper learning curve |
| LlamaIndex | Specialised in RAG and data indexing | Excellent on complex, hierarchical, multimodal RAG | Less agent-oriented than LangChain |
| Haystack (deepset) | Production-grade retrieval pipelines | Very solid in production, strong enterprise ecosystem | Less flexible for highly iterative agents |
| AutoGen (Microsoft) | Multi-agent conversation | Conversational patterns between agents (assistant ↔ critic ↔ user) | More experimental, less stable API |
| CrewAI | Role-based “team” of agents | Very easy to pick up, intuitive | Less fine control than LangGraph |
| Smolagents (Hugging Face) | Code-based minimalist agents | Lightweight, transparent, the agent’s output is Python code | Young, smaller ecosystem |
| OpenAI Agents SDK | Official OpenAI SDK for building agents | Native integration, very simple | Locked to OpenAI’s API |
| Anthropic Claude tool use | Claude API with tool calling | Excellent reasoning, ideal for reliable agents | Not a full framework, must be completed |
| DSPy (Stanford) | Automatic prompt optimisation | ”Compilation” approach to LLM programs, very powerful in research | Different paradigm, more academic |
3.1. The licensing question, once and for all
Section titled “3.1. The licensing question, once and for all”The most common confusion: “are these frameworks commercial?”
The frameworks themselves are all open source and free to use.
| Project | Licence | Owner |
|---|---|---|
| LangChain | MIT | LangChain Inc. |
| LangGraph | MIT | LangChain Inc. |
| LlamaIndex | MIT | LlamaIndex Inc. |
| Haystack | Apache 2.0 | deepset |
| AutoGen | MIT | Microsoft |
| CrewAI | MIT | CrewAI Inc. |
| Smolagents | Apache 2.0 | Hugging Face |
| DSPy | MIT | Stanford NLP |
What is commercial are the hosted services that sit next to them:
- LangSmith (tracing + evaluation by the LangChain team).
- LangGraph Cloud / Platform (managed deployment of LangGraph apps).
- LlamaCloud (managed indexing/parsing by LlamaIndex).
- OpenAI / Anthropic APIs (the LLM providers themselves).
You can build a perfectly serious production system using only the open-source frameworks against a local model like the ones we use in this course, with zero recurring cost.
4. How to pick the right framework
Section titled “4. How to pick the right framework”4.1. A decision tree
Section titled “4.1. A decision tree”flowchart TD Start([What is your need?]) --> Q1{Just chat with an LLM?} Q1 -->|"yes"| SDK["Official SDK (OpenAI, Anthropic, Ollama)"] Q1 -->|"no"| Q2{Mainly RAG on your documents?} Q2 -->|"mostly RAG"| LI["LlamaIndex or LangChain"] Q2 -->|"no, or not only that"| Q3{Linear workflow?} Q3 -->|"yes"| LC["LangChain LCEL"] Q3 -->|"no, loops or branches"| Q4{Several agents collaborating?} Q4 -->|"yes"| Q5{Preference?} Q5 -->|"fine control"| LG["LangGraph"] Q5 -->|"simplicity, roles"| Crew["CrewAI"] Q5 -->|"agent-to-agent chat"| AG["AutoGen"] Q4 -->|"no, one iterative agent"| LG2["LangGraph"]4.2. A rule of thumb to remember
Section titled “4.2. A rule of thumb to remember”| You want… | Use |
|---|---|
| One LLM call with a formatted prompt | LangChain (LCEL) |
| An agent with a list of tools | LangChain create_agent |
| A workflow with loops, branches, shared state | LangGraph |
| Deep RAG over complex documents | LlamaIndex |
| A team of agents with explicit roles | CrewAI |
| Conversations between agents (assistant ↔ critic) | AutoGen |
| The simplest possible SDK | OpenAI Agents SDK |
| Native code as the agent’s “thought” | Smolagents |
| Compile prompts automatically | DSPy |
4.3. Two anti-patterns to avoid
Section titled “4.3. Two anti-patterns to avoid”Anti-pattern 1 — start with the heaviest framework. Beginners often reach for LangGraph or AutoGen before they need them. The reverse order works better: start with one bare API call, add tools, hit a wall, then pick the framework that solves that specific wall.
Anti-pattern 2 — stack frameworks. Using LangChain + LlamaIndex + AutoGen + CrewAI in the same project usually means each one fights the other. Pick one primary framework and stay with it.
5. Concise glossary
Section titled “5. Concise glossary”This is the one-page reference you can come back to whenever a term feels fuzzy.
| Term | One-line definition |
|---|---|
| LLM | Large Language Model — model that predicts the next token. |
| Token | A fragment of a word (~4 characters in English); the basic unit of an LLM. |
| Prompt | The instruction sent to the LLM. |
| Prompt template | A prompt with variables ({name}, {language}). |
| System prompt | The first message of the conversation; sets the model’s persona and rules. |
| Context window | The maximum number of tokens the LLM can read at once. |
| Cutoff date | The date up to which the LLM saw data during training. |
| Hallucination | A response that sounds right but is wrong. |
| Function calling / Tool calling | The model’s ability to request the call of a structured function. |
| Tool | A function the model can call. |
| Agent | An LLM + tools + a reasoning loop. |
| Chain | A pipeline of steps (prompt → LLM → parser). |
| LCEL | LangChain Expression Language — composition with the ` |
| Runnable | Any LangChain block exposing .invoke(), .stream(), .batch(). |
| Memory | Conversational-memory system. |
| Buffer / summary / vector memory | The three common memory implementations. |
| Embedding | A numerical vector that represents a piece of text. |
| Vector store | A database that stores embeddings and searches by similarity. |
| Retriever | A component that returns the documents most relevant to a question. |
| RAG | Retrieval-Augmented Generation — retriever + LLM combination. |
| Chunking | Splitting documents into pieces before indexing them for RAG. |
| Streaming | Receiving the answer token by token as it is generated. |
| Tracing | Recording every step of an execution for debugging and audit. |
| LangSmith | The official tracing and evaluation platform of the LangChain ecosystem (commercial hosted service). |
| ReAct | Pattern Reasoning + Acting, the universal agent loop (Yao et al. 2022). |
| Reflection | Pattern where the agent critiques its own output. |
| Reflexion | Reflection + tool-use + revision with citations. |
| Plan-and-Execute | Pattern where one agent plans steps and another executes them. |
| Adaptive RAG | RAG that dynamically picks its source (vector store vs web search). |
| Self-RAG / Corrective RAG | RAG variants where the model evaluates its own retrieval quality. |
| Multi-agent | Several agents collaborating, possibly with explicit roles. |
| Quantization | Compressing model weights to a lower precision — see the quantization annex. |
| VRAM | Memory of the GPU, where the model’s weights live during inference — see the hardware annex. |
6. Where to go next
Section titled “6. Where to go next”You now have the full mental map of the LLM-application ecosystem.
- If you want to run a real agent end to end, the work is already done in demo 3 (single agent) and demo 4 (three collaborating agents).
- If you want to switch to a framework on top of demo 3, the LangChain replacement code is shown side by side with the hand-coded loop in the “Hand-coded loop vs LangChain” section of chapter 09.
- If you want to specialise a code model for your own language stack, the path is described in chapter 14 — fine-tuning a code model.