Tool calling: the bridge
Duration: 10 min
Prerequisites: chapter 02
Key idea
Tool calling is a simple protocol: we tell the model “here is a list of functions you can call”, and instead of generating text, it emits a JSON object describing the call to make. Your Python code intercepts that JSON, runs the real function, and hands the result back. Loop.
The protocol in 4 steps
sequenceDiagram
participant User as User
participant Code as Python loop
participant LLM as Ollama (LLM)
participant Tool as Python tool
User->>Code: user prompt
Code->>LLM: messages + tool list
LLM-->>Code: tool_calls = [{"name":"write_file"}]
Code->>Tool: write_file(path="Main.java", content="...")
Tool-->>Code: "File created or modified: Main.java"
Code->>LLM: append result to messages
LLM-->>Code: tool_calls = [{"name":"compile_java"}]
Code->>Tool: javac via subprocess
Tool-->>Code: "Compilation successful."
Code->>LLM: append, re-ask
LLM-->>Code: no tool_call, just text
Code-->>User: final text
As long as the model returns tool_calls, we execute them and hand control back. When it stops emitting them (just a plain text message), we exit the loop.
How a tool is described
The key point: you never write the tool description by hand. The ollama-python SDK automatically builds the JSON schema from the Python signature and the docstring. From ollama-demo-3-agent-java/agent_java.py:
    def write_file(path: str, content: str = "") -> str:
        """Create or overwrite a file in the workspace folder.

        Args:
            path: Relative path of the file to create, for example Main.java.
            content: Full content to write to the file.
        """
        # ... implementation ...

The SDK reads:
- the name: write_file;
- the typed arguments: path: str, content: str = "" (so content is optional);
- the description from the docstring;
- the description of each arg from the Args: section.
It then sends Ollama a JSON-schema that the model uses to format its answer. That’s why in ollama-demo-3-agent-java/agent_java.py you just see:
    tools = [list_files, read_file, write_file, compile_java]

    response = client.chat(
        model=MODEL_NAME,
        messages=messages,
        tools=tools,
        options={"num_ctx": 20480},
    )

Four ordinary Python functions, passed as-is. No framework, no implicit decorator.
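To make the signature-to-schema step less magical, here is a minimal sketch of how such a schema could be derived with the standard inspect module. It is not the ollama-python implementation: build_tool_schema, parse_arg_docs and JSON_TYPES are hypothetical names, the type mapping covers only a few scalar types, and the write_file body is a stub.

```python
import inspect
import re
from typing import Any, Callable

# Assumption: a deliberately tiny mapping from annotations to JSON-schema types.
JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def parse_arg_docs(docstring: str) -> dict[str, str]:
    """Collect 'name: description' pairs from a Google-style Args: section."""
    docs: dict[str, str] = {}
    in_args = False
    for raw in docstring.splitlines():
        line = raw.strip()
        if line.endswith(":") and " " not in line:
            in_args = line == "Args:"  # entering or leaving the Args section
            continue
        if in_args:
            match = re.match(r"(\w+):\s*(.+)", line)
            if match:
                docs[match.group(1)] = match.group(2)
    return docs


def build_tool_schema(fn: Callable[..., Any]) -> dict[str, Any]:
    """Hypothetical helper: derive a tool schema from signature and docstring."""
    doc = inspect.getdoc(fn) or ""
    arg_docs = parse_arg_docs(doc)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(fn).parameters.items():
        properties[name] = {
            "type": JSON_TYPES.get(param.annotation, "string"),
            "description": arg_docs.get(name, ""),
        }
        # A parameter without a default value must be supplied by the model.
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": fn.__name__,
            "description": doc.split("\n\n")[0].strip(),
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


def write_file(path: str, content: str = "") -> str:
    """Create or overwrite a file in the workspace folder.

    Args:
        path: Relative path of the file to create, for example Main.java.
        content: Full content to write to the file.
    """
    return ""  # stub: only the signature and docstring matter here


schema = build_tool_schema(write_file)
```

Because content has a default value, only path ends up in the required list, which is how an optional argument is expressed in the schema.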
Concrete example: the full chain Python signature → JSON schema → tool_calls → invocation
For the write_file function above, here is what happens on the wire when the user asks “Create Main.java that prints Hello world.”
1. What the SDK generates from the Python signature and the docstring

    {
      "type": "function",
      "function": {
        "name": "write_file",
        "description": "Create or overwrite a file in the workspace folder.",
        "parameters": {
          "type": "object",
          "properties": {
            "path": {"type": "string", "description": "Relative path of the file to create, for example Main.java."},
            "content": {"type": "string", "description": "Full content to write to the file."}
          },
          "required": ["path"]
        }
      }
    }

2. What the model returns in response.message.tool_calls

    {
      "role": "assistant",
      "content": "",
      "tool_calls": [
        {
          "function": {
            "name": "write_file",
            "arguments": {
              "path": "Main.java",
              "content": "public class Main {\n public static void main(String[] args) {\n System.out.println(\"Hello world\");\n }\n}\n"
            }
          }
        }
      ]
    }

3. What the Python loop does with it

    name = "write_file"
    args = {"path": "Main.java", "content": "public class Main { ... }"}
    result = write_file(**args)
    # -> the file Main.java is actually created on disk;
    # -> result is the string "OK: 96 bytes written to Main.java".

4. What gets appended to messages for the next turn

    {"role": "tool", "tool_name": "write_file", "content": "OK: 96 bytes written to Main.java"}

The model sees this confirmation on its next turn, knows the action succeeded, and can now decide to compile, read the file again, or stop. The whole “agent” effect comes from this four-step ping-pong, repeated until the model emits a plain text answer with no tool_calls.
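Steps 2 to 4 can be exercised without any model by feeding a hand-written assistant message through a small dispatcher. This is a sketch under assumptions: plain dicts stand in for the SDK's message objects, run_tool_calls and WORKSPACE are hypothetical names, and this write_file is a stand-in that merely imitates the "OK: N bytes written" reply shown above.

```python
import tempfile
from pathlib import Path
from typing import Any, Callable

WORKSPACE = Path(tempfile.mkdtemp())  # hypothetical throwaway workspace folder


def write_file(path: str, content: str = "") -> str:
    """Stand-in tool: write content under WORKSPACE and report the byte count."""
    data = content.encode("utf-8")
    (WORKSPACE / path).write_bytes(data)
    return f"OK: {len(data)} bytes written to {path}"


available_functions: dict[str, Callable[..., str]] = {"write_file": write_file}


def run_tool_calls(assistant_message: dict[str, Any]) -> list[dict[str, str]]:
    """Execute each call in an assistant message and build the tool replies."""
    replies = []
    for call in assistant_message.get("tool_calls") or []:
        name = call["function"]["name"]
        args = call["function"]["arguments"]
        fn = available_functions.get(name)
        result = fn(**args) if fn else f"Unknown tool: {name}"
        replies.append({"role": "tool", "tool_name": name, "content": str(result)})
    return replies


# Hand-written message in the shape of step 2 (shortened file content).
assistant_message = {
    "role": "assistant",
    "content": "",
    "tool_calls": [
        {
            "function": {
                "name": "write_file",
                "arguments": {"path": "Main.java", "content": "public class Main {}\n"},
            }
        }
    ],
}

replies = run_tool_calls(assistant_message)
print(replies[0]["content"])
```

The returned list is exactly what would be appended to messages before asking the model again.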
The agent loop in 30 lines
Here is the skeleton of the loop in ollama-demo-3-agent-java/agent_java.py (around line 486):
    for step in range(1, MAX_STEPS + 1):
        response = client.chat(
            model=MODEL_NAME,
            messages=messages,
            tools=tools,
            options={"num_ctx": 20480},
        )
        messages.append(response.message)

        calls = list(iter_tool_calls(response.message))
        if not calls:
            break  # The model is done, nothing left to do

        for name, args in calls:
            fn = available_functions.get(name)
            if fn is None:
                result = f"Unknown tool: {name}"
            else:
                try:
                    result = fn(**args)
                except Exception as error:
                    result = f"Error while running the tool: {error}"

            messages.append(
                {"role": "tool", "tool_name": name, "content": str(result)}
            )

Read it three times. The whole agent is there. Everything else (Streamlit UI, per-project isolation, collaborating agents) is decoration around this loop.
Each iteration produces:
- a question to the model with the full history;
- zero, one or several tool calls;
- the real execution of each tool;
- the append of the result to messages so the model sees it on the next turn.
MAX_STEPS = 10 prevents infinite loops: if the model isn’t done in 10 turns, we cut it off.
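Both exit conditions can be observed without Ollama by swapping the model for a scripted stand-in. The sketch below assumes dict-shaped messages; run_agent, scripted_chat, endless_chat and the noop tool are hypothetical names, and the "Stopped" message is an invented placeholder for whatever the real script reports when it hits the limit.

```python
from typing import Any, Callable

MAX_STEPS = 10

Message = dict[str, Any]


def run_agent(chat: Callable[[list[Message]], Message],
              tools: dict[str, Callable[..., Any]],
              prompt: str) -> tuple[str, int]:
    """Run the loop and return (final text, number of model calls)."""
    messages: list[Message] = [{"role": "user", "content": prompt}]
    for step in range(1, MAX_STEPS + 1):
        message = chat(messages)
        messages.append(message)
        calls = message.get("tool_calls") or []
        if not calls:
            return message.get("content", ""), step  # plain text: we are done
        for call in calls:
            name = call["function"]["name"]
            args = call["function"]["arguments"]
            fn = tools.get(name)
            if fn is None:
                result = f"Unknown tool: {name}"
            else:
                try:
                    result = fn(**args)
                except Exception as error:
                    result = f"Error while running the tool: {error}"
            messages.append({"role": "tool", "tool_name": name, "content": str(result)})
    return "Stopped: step limit reached.", MAX_STEPS  # placeholder message


def scripted_chat(script: list[Message]) -> Callable[[list[Message]], Message]:
    """Fake model that replays canned assistant messages in order."""
    replies = iter(script)
    return lambda messages: next(replies)


def endless_chat(messages: list[Message]) -> Message:
    """Fake model that never stops asking for a tool."""
    return {"role": "assistant", "content": "",
            "tool_calls": [{"function": {"name": "noop", "arguments": {}}}]}


tools = {"noop": lambda: "done"}
script = [
    {"role": "assistant", "content": "",
     "tool_calls": [{"function": {"name": "noop", "arguments": {}}}]},
    {"role": "assistant", "content": "All finished."},
]

finished = run_agent(scripted_chat(script), tools, "Do the thing.")
cut_off = run_agent(endless_chat, tools, "Loop forever.")
print(finished, cut_off)
```

The first run ends on the second model call because that reply carries no tool calls; the second run only ends because the step limit is reached.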
The classroom one-liner
The model thinks. The tools act. The compiler verifies. The human validates.
If you can point each line at the loop code, you know what an agent is:
- “the model thinks” → the call to client.chat(...);
- “the tools act” → the for name, args in calls: result = fn(**args) loop;
- “the compiler verifies” → the compile_java() tool that runs subprocess.run(["javac", ...]);
- “the human validates” → that’s you watching the output and deciding whether to rerun.
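For the “compiler verifies” line, a tool of this kind might look roughly like the sketch below. It is not the code from agent_java.py: the workspace parameter, the timeout and every message except "Compilation successful." are assumptions made for illustration.

```python
import subprocess
import tempfile
from pathlib import Path


def compile_java(workspace: str = ".") -> str:
    """Compile every .java file in workspace and return a message for the model."""
    sources = sorted(str(p) for p in Path(workspace).glob("*.java"))
    if not sources:
        return "No .java files found."
    try:
        completed = subprocess.run(
            ["javac", *sources],
            capture_output=True,  # keep compiler output so the model can read it
            text=True,
            timeout=60,
        )
    except FileNotFoundError:
        return "javac not found: is a JDK installed and on PATH?"
    except subprocess.TimeoutExpired:
        return "Compilation timed out."
    if completed.returncode == 0:
        return "Compilation successful."
    # Returning stderr lets the model see the errors and attempt a fix.
    return f"Compilation failed:\n{completed.stderr}"


# Quick check on an empty folder, which needs no JDK at all.
with tempfile.TemporaryDirectory() as empty_dir:
    empty_result = compile_java(empty_dir)
print(empty_result)
```

The tool always returns a string rather than raising, so a failed compilation becomes just another tool message that the model can react to on its next turn.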
What the model had to learn to do this
Tool calling is not a feature of the raw LLM. It’s a learned behaviour that comes from fine-tuning. Meta trained Llama 3.1 on millions of examples shaped roughly like this (the format shown is illustrative; the exact template varies from model to model):

    System: You have access to these tools: [...]
    User: Compile my code.
    Assistant: <tool_call>{"name":"compile_java"}</tool_call>

If you take a model that did not receive this fine-tuning (e.g. a raw Llama 2), it will not reliably generate these structured tool_calls. That’s the rabbit hole we go down in chapter 05b.
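To see why the structured format matters, here is a small sketch of what a runtime has to do with raw model text before any Python function can run. It assumes the illustrative <tool_call> tags from the example above; extract_tool_calls is a hypothetical helper, and real runtimes parse their own model-specific templates.

```python
import json
import re

# Assumption: calls are wrapped in the illustrative <tool_call> tags.
TOOL_CALL_PATTERN = re.compile(r"<tool_call>(.*?)</tool_call>", re.DOTALL)


def extract_tool_calls(raw_text: str) -> list[dict]:
    """Return the JSON payload of every well-formed <tool_call> block."""
    calls = []
    for payload in TOOL_CALL_PATTERN.findall(raw_text):
        try:
            parsed = json.loads(payload)
        except json.JSONDecodeError:
            continue  # malformed JSON: nothing we can safely execute
        if isinstance(parsed, dict) and "name" in parsed:
            calls.append(parsed)
    return calls


tuned_output = '<tool_call>{"name":"compile_java"}</tool_call>'
untuned_output = "Sure! You should run javac Main.java in a terminal."

print(extract_tool_calls(tuned_output))
print(extract_tool_calls(untuned_output))
```

A reply written as ordinary prose yields an empty list, so the loop has nothing to execute, however sensible the advice in the text may be.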
Key takeaways
- Tool calling = a protocol where the LLM outputs JSON and your code runs the real function.
- ollama-python builds the schema automatically from your Python signatures + docstrings. No boilerplate.
- The agent loop is ~30 lines: call model → run tools → call model again → … → stop when no more tool_calls.
- Not every model can do this: you need a model fine-tuned for tool calling (chapter 05).
- The model thinks, the tools act, the compiler verifies, the human validates.