Tool use basics: function calling, JSON schema, error handling

Last month, our system had an agent that managed user accounts. The agent had two tools: delete_user and delete_account. Different names, different logic. delete_user clears the session and resets the password. delete_account deletes all data and cannot be undone.

One day, the support team asked the agent to “remove the test user from the system”. The agent read “remove” as a request for full deletion and called delete_account. A valid tool, the wrong context. The data was gone.

The LLM did not hallucinate. The agent did not go out of control. The fault was in the schema: both tools had vague descriptions that never spelled out the most important difference between them. An LLM picks a tool based on its description. When the descriptions do not separate the tools clearly, the LLM guesses. And it guessed wrong.

The rest of this post covers the “tools” layer of the 4-component mental model from post 1: writing JSON schemas the LLM reads correctly, function calling with the Anthropic SDK, handling errors so the model can still correct itself, and idempotency so retries do no harm.

Start with the JSON schema for a tool

When you declare a tool for an LLM, you are writing a contract. The LLM reads that contract and decides whether to call the tool and with which args. If the contract is vague, the decision will often be wrong.

This differs from documentation written for humans. When a dev reads API docs, they bring context from experience, from the codebase, and from Slack to disambiguate. At call time the LLM has only what you wrote in the schema, plus whatever is already in the conversation. If the schema says “delete user from system” for both the session-clearing tool and the account-deleting tool, the LLM is effectively picking at random because both match.

The best habit is to write the description as if explaining to a new intern who knows nothing about the codebase: the real use case, explicit contraindications, and concrete side effects.

A tool schema has three parts you should always fill in: name, description, and input_schema.

tool = {
    "name": "send_email",
    "description": "Send a transactional email to a single recipient. Use this only for system-triggered emails (password reset, order confirmation). Do NOT use for bulk or marketing emails.",
    "input_schema": {
        "type": "object",
        "properties": {
            "to": {
                "type": "string",
                "description": "Recipient email address. Must be a valid email format."
            },
            "subject": {
                "type": "string",
                "description": "Email subject line. Keep under 60 characters."
            },
            "body": {
                "type": "string",
                "description": "Plain text body. Do not include HTML tags."
            },
            "template_id": {
                "type": "string",
                "description": "Optional. If provided, overrides subject and body with a pre-defined template. Valid values: 'password_reset', 'order_confirm', 'welcome'.",
                "enum": ["password_reset", "order_confirm", "welcome"]
            }
        },
        "required": ["to", "subject", "body"]
    }
}

A few things to note:

name should state the action clearly. send_email beats email. delete_user_session beats remove_user. When several tools are similar, the name is the first distinguishing signal the LLM uses.

description is the usage guide, not a restatement of the name. State when to use the tool, when not to, and what the side effects are. If the tool performs a destructive action (deleting, overwriting, sending something outside the system), say so directly in the description.

Property descriptions matter as much as types. The LLM cannot infer the valid values of template_id unless you state them. Use enum when the set of values is finite, and explain what each value means in the description.

required must be accurate. If template_id is truly optional, keep it out of required. The LLM will try to fill every required field; if it has no data, it hallucinates.
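The points above can be checked mechanically before a tool ever reaches the model. The following is a minimal sketch, not part of any SDK: lint_tool is a hypothetical helper that flags a missing description, a required field that is not declared in properties, and properties without a description.

```python
def lint_tool(tool: dict) -> list[str]:
    """Return human-readable problems found in one tool definition."""
    problems = []
    # The three top-level parts every tool definition should fill in
    for key in ("name", "description", "input_schema"):
        if not tool.get(key):
            problems.append(f"missing top-level field: {key}")
    schema = tool.get("input_schema") or {}
    properties = schema.get("properties", {})
    # Every required field must actually be declared
    for field in schema.get("required", []):
        if field not in properties:
            problems.append(f"required field not declared in properties: {field}")
    # Every property should explain itself to the LLM
    for prop_name, prop in properties.items():
        if not prop.get("description"):
            problems.append(f"property has no description: {prop_name}")
    return problems


good_tool = {
    "name": "send_email",
    "description": "Send a transactional email to a single recipient.",
    "input_schema": {
        "type": "object",
        "properties": {
            "to": {"type": "string", "description": "Recipient email address."},
        },
        "required": ["to"],
    },
}
bad_tool = {
    "name": "delete_user",
    "description": "",
    "input_schema": {
        "type": "object",
        "properties": {"user_id": {"type": "string"}},
        "required": ["user_id", "reason"],
    },
}

print(lint_tool(good_tool))
for problem in lint_tool(bad_tool):
    print(problem)
```

A check like this cannot judge whether a description is good, only whether it exists, so it complements a human review rather than replacing it.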

The type system in JSON schema

JSON schema supports these types: string, number, integer, boolean, array, object, null.

"input_schema": {
    "type": "object",
    "properties": {
        "user_ids": {
            "type": "array",
            "items": {"type": "string"},
            "description": "List of user IDs to process. Max 100 per call."
        },
        "dry_run": {
            "type": "boolean",
            "description": "If true, simulate the action without making changes. Default false."
        },
        "priority": {
            "type": "integer",
            "minimum": 1,
            "maximum": 10,
            "description": "Processing priority. 1 = lowest, 10 = highest."
        }
    },
    "required": ["user_ids"]
}

minimum/maximum on number/integer tell the LLM the valid range. items on array tells the LLM the type of each element. The more constraints the schema carries, the less the LLM has to guess.

A practical note: input_schema is a JSON Schema object, but advanced features such as $ref and if/then/else may not be handled correctly, and the JSON Schema draft the API validates against can change, so check the current API documentation. Keep schemas simple and flat where you can.

Another point that is easy to forget: do not let schemas grow too long. Every tool schema is injected into the LLM's context window. With 20 tools at 200 tokens each, you have spent 4000 tokens on tool declarations before the LLM has done anything. For agents with many tools, consider grouping tools by domain and injecting only the tool set that fits the current task instead of all of them every time.
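One way to act on the grouping idea is sketched below. Everything here is an assumption made for illustration: the TOOL_GROUPS layout, the select_tools helper, and the rule of thumb of roughly 4 characters per token used by estimate_tokens (real token counts depend on the tokenizer).

```python
import json

# Hypothetical tool groups keyed by domain; schemas are shortened for the example.
TOOL_GROUPS = {
    "users": [
        {
            "name": "get_user",
            "description": "Retrieve user information by user ID.",
            "input_schema": {
                "type": "object",
                "properties": {"user_id": {"type": "string"}},
                "required": ["user_id"],
            },
        },
    ],
    "email": [
        {
            "name": "send_email",
            "description": "Send a transactional email to a single recipient.",
            "input_schema": {
                "type": "object",
                "properties": {"to": {"type": "string"}, "body": {"type": "string"}},
                "required": ["to", "body"],
            },
        },
    ],
}


def estimate_tokens(tool: dict) -> int:
    """Rough size estimate, assuming about 4 characters per token."""
    return len(json.dumps(tool)) // 4


def select_tools(domains: list[str]) -> list[dict]:
    """Collect only the tools that belong to the requested domains."""
    selected = []
    for domain in domains:
        selected.extend(TOOL_GROUPS.get(domain, []))
    return selected


user_tools = select_tools(["users"])
all_tools = select_tools(["users", "email"])
print("users only:", sum(estimate_tokens(t) for t in user_tools), "tokens (estimate)")
print("all tools:", sum(estimate_tokens(t) for t in all_tools), "tokens (estimate)")
```

The list returned by select_tools can be passed as the tools argument of the API call, so a task about users never pays for the email schemas.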

Calling tools with the Anthropic SDK

The full flow, from declaring tools to handling the response, looks like this.

import anthropic
import json

client = anthropic.Anthropic()

# Tool definitions
TOOLS = [
    {
        "name": "get_user",
        "description": "Retrieve user information by user ID. Returns user details including name, email, and account status.",
        "input_schema": {
            "type": "object",
            "properties": {
                "user_id": {
                    "type": "string",
                    "description": "The unique user identifier (UUID format)."
                }
            },
            "required": ["user_id"]
        }
    },
    {
        "name": "deactivate_user_session",
        "description": "Deactivate a user's active session, forcing them to log in again. Does NOT delete the account or any data. Use when: user requests logout, security incident, expired session cleanup.",
        "input_schema": {
            "type": "object",
            "properties": {
                "user_id": {
                    "type": "string",
                    "description": "The unique user identifier."
                },
                "reason": {
                    "type": "string",
                    "description": "Reason for deactivation. Used for audit log.",
                    "enum": ["user_request", "security_incident", "admin_action", "expired"]
                }
            },
            "required": ["user_id", "reason"]
        }
    }
]

# Mock database
USERS = {
    "usr_001": {"name": "Alice", "email": "[email protected]", "status": "active"},
    "usr_002": {"name": "Bob", "email": "[email protected]", "status": "inactive"},
}

def execute_tool(name: str, args: dict) -> str:
    """Execute a tool and return result as string for LLM."""
    if name == "get_user":
        user = USERS.get(args["user_id"])
        if not user:
            return json.dumps({"error": f"User {args['user_id']} not found"})
        return json.dumps(user)

    if name == "deactivate_user_session":
        user_id = args["user_id"]
        if user_id not in USERS:
            return json.dumps({"error": f"User {user_id} not found"})
        return json.dumps({
            "success": True,
            "user_id": user_id,
            "action": "session_deactivated",
            "reason": args["reason"]
        })

    return json.dumps({"error": f"Unknown tool: {name}"})


def run_agent(user_input: str, max_iter: int = 5) -> str:
    messages = [{"role": "user", "content": user_input}]

    for iteration in range(max_iter):
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )

        # Append the assistant response to the history
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason == "end_turn":
            # Return the first text block of the final response
            for block in response.content:
                if hasattr(block, "text"):
                    return block.text
            return ""

        if response.stop_reason == "tool_use":
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result,
                    })
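            messages.append({"role": "user", "content": tool_results})
            continue

        # Any other stop reason (for example max_tokens): stop here rather than
        # calling the API again with a dangling assistant turn
        return f"Stopped early with stop_reason={response.stop_reason}"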

    return f"Max iterations ({max_iter}) reached without completing task."


# Test
print(run_agent("Get the info for user usr_001 and log them out because of a security incident"))

Key points in the code:

Each tool_use block has a block.id. When you send a tool_result back, it must carry the matching tool_use_id. If a response contains N tool calls, you must return exactly N corresponding tool results. If one is missing, the Anthropic API returns a validation error.

response.content is a list, not a single block. One response can contain both text and tool_use blocks. Loop over all of them and handle each type.

Claude Sonnet 4.6 sometimes requests several tools in a single response. For example, when you ask “Get the info for user X and check their orders”, the model may produce one response with two tool_use blocks: get_user and get_orders (get_orders is illustrative; it is not defined in TOOLS above). The code above handles this case correctly because it collects every tool_use block in the loop and sends all results back in one batch. Note that the loop still executes the tools one after another; what the batch saves is a model round trip per tool. When two tools do not depend on each other, you can also execute them concurrently to cut latency further.
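If the tools in one response are independent, the batch can also be executed concurrently. The sketch below is one possible approach using a thread pool from the standard library. ToolUse is a hypothetical stand-in for the SDK's tool_use block (only id, name, and input are used), and execute_tool is a dummy that sleeps to simulate I/O.

```python
import json
import time
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class ToolUse:
    """Hypothetical stand-in for an SDK tool_use block."""
    id: str
    name: str
    input: dict


def execute_tool(name: str, args: dict) -> str:
    """Dummy tool executor that simulates a slow I/O call."""
    time.sleep(0.05)
    return json.dumps({"tool": name, "args": args})


def run_tools_concurrently(blocks: list[ToolUse]) -> list[dict]:
    """Execute all tool calls in threads and keep each result tied to its id."""
    with ThreadPoolExecutor(max_workers=max(1, len(blocks))) as pool:
        # pool.map returns results in input order, so zip() below is safe
        outputs = list(pool.map(lambda b: execute_tool(b.name, b.input), blocks))
    return [
        {"type": "tool_result", "tool_use_id": block.id, "content": output}
        for block, output in zip(blocks, outputs)
    ]


blocks = [
    ToolUse("toolu_1", "get_user", {"user_id": "usr_001"}),
    ToolUse("toolu_2", "get_orders", {"user_id": "usr_001"}),
]
results = run_tools_concurrently(blocks)
for result in results:
    print(result["tool_use_id"], result["content"])
```

This only pays off for I/O-bound tools whose calls do not depend on each other, and the tool implementations must be thread-safe. For tools with side effects, sequential execution is the safer default.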

A quick comparison with the OpenAI SDK

The format differs but the concept is the same:

# OpenAI format
tools_openai = [
    {
        "type": "function",
        "function": {
            "name": "get_user",
            "description": "...",
            "parameters": {  # OpenAI uses "parameters", Anthropic uses "input_schema"
                "type": "object",
                "properties": {...},
                "required": [...]
            }
        }
    }
]

# OpenAI response parsing
if response.choices[0].finish_reason == "tool_calls":  # differs from "tool_use"
    for tool_call in response.choices[0].message.tool_calls:
        name = tool_call.function.name
        args = json.loads(tool_call.function.arguments)  # OpenAI returns a string that must be parsed
        # Anthropic: block.input is already a dict

This series uses the Anthropic SDK as the reference. When using OpenAI, the two differences that matter most in code are: parameters replaces input_schema, and arguments is a JSON string instead of a dict.
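Because the two formats carry the same information, one definition can be converted into the other. A minimal sketch, where to_openai_tool is a hypothetical helper name and the output shape follows the OpenAI format shown above:

```python
def to_openai_tool(anthropic_tool: dict) -> dict:
    """Wrap an Anthropic-style tool definition in the OpenAI function format."""
    return {
        "type": "function",
        "function": {
            "name": anthropic_tool["name"],
            "description": anthropic_tool.get("description", ""),
            # Same JSON schema, different key name
            "parameters": anthropic_tool["input_schema"],
        },
    }


anthropic_tool = {
    "name": "get_user",
    "description": "Retrieve user information by user ID.",
    "input_schema": {
        "type": "object",
        "properties": {"user_id": {"type": "string"}},
        "required": ["user_id"],
    },
}
openai_tool = to_openai_tool(anthropic_tool)
print(openai_tool["function"]["name"])
```

Keeping a single source of truth for tool definitions and converting at the boundary avoids the two copies drifting apart.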

Error handling: where tool use starts to get hard

This part is very easy to get wrong. When a tool fails, you have two options: return an error or raise an exception. The two options lead to different behavior in the agent loop.

A useful mental model: the tool execution layer is a mini-server that receives requests from the LLM and returns responses. Design tool responses the way you design a REST API: clear status, actionable error messages, no internal implementation details.

Return an error for the LLM to handle itself:

def execute_tool(name: str, args: dict) -> str:
    if name == "get_user":
        user_id = args.get("user_id", "")

        # Validate before calling the service
        if not user_id:
            return json.dumps({
                "error": "Missing required field: user_id",
                "hint": "Provide a valid UUID for user_id"
            })

        user = database.get_user(user_id)
        if not user:
            return json.dumps({
                "error": f"User not found: {user_id}",
                "hint": "Check if the user_id is correct, or use list_users to find available IDs"
            })

        return json.dumps(user)

When you return an error with a hint, the LLM can correct itself. For example: the LLM calls get_user with user_id = "alice" (a name instead of an ID), the tool returns an error with a hint, and the LLM follows the hint to find the right ID and calls again with user_id = "usr_001". This retry flow only works when the error message is descriptive enough.

Raise an exception to abort the agent:

def execute_tool(name: str, args: dict) -> str:
    if name == "delete_account":
        user_id = args["user_id"]

        # Error the LLM cannot recover from: wrong permissions
        if not current_user_has_permission("admin"):
            raise PermissionError(f"delete_account requires admin permission, current user is: {current_user()}")

        # Error the LLM cannot recover from: the user has active orders
        if database.has_active_orders(user_id):
            raise ValueError(f"Cannot delete account {user_id}: has {database.count_active_orders(user_id)} active orders")

        database.delete_account(user_id)
        return json.dumps({"success": True})

Raise when the error cannot be fixed by retrying, or when continuing would cause harm. The caller (the agent loop) should catch the exception and stop. The LLM must not continue in this case.

Format error messages so the LLM understands them:

def format_error(error_type: str, detail: str, hint: str = "") -> str:
    """Format an error response the LLM can parse and act on."""
    payload = {"error": error_type, "detail": detail}
    if hint:
        payload["hint"] = hint
    return json.dumps(payload)

# Used inside a tool handler (order_id comes from the tool args)
return format_error(
    error_type="NOT_FOUND",
    detail=f"Order {order_id} does not exist in the system",
    hint="Use search_orders with customer_email to find the correct order ID"
)

Three fields are enough for the LLM: error (a short type), detail (a specific description), hint (a suggested next step). Avoid returning stack traces or raw SQL errors; the LLM will try to “fix” the code instead of fixing its logic.

Separating recoverable from non-recoverable errors

Error type	Handling	Example
Invalid input	Return error + hint	Wrong user_id format
Resource does not exist	Return error + hint	User/Order not found
Permission denied	Raise exception	No admin permission
Data consistency violated	Raise exception	An active order cannot be deleted
External service timeout	Return error + hint	API rate limit, retry later
Unknown/unexpected error	Raise exception	Database connection lost
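The table can be turned into a single dispatch point. The sketch below is one way to do it, under stated assumptions: RecoverableToolError is a hypothetical exception class that tool handlers raise for the "return error + hint" rows, while any other exception is left to propagate so the agent loop can abort.

```python
import json


class RecoverableToolError(Exception):
    """Hypothetical error type for failures the LLM can fix by itself."""

    def __init__(self, error_type: str, detail: str, hint: str = ""):
        super().__init__(detail)
        self.error_type = error_type
        self.detail = detail
        self.hint = hint


def get_user(args: dict) -> dict:
    users = {"usr_001": {"name": "Alice"}}  # mock data for the example
    user_id = args.get("user_id", "")
    if user_id not in users:
        raise RecoverableToolError(
            "NOT_FOUND", f"User not found: {user_id}", "Use list_users to find valid IDs"
        )
    return users[user_id]


def delete_account(args: dict) -> dict:
    # Non-recoverable: retrying will not grant the missing permission
    raise PermissionError("delete_account requires admin permission")


HANDLERS = {"get_user": get_user, "delete_account": delete_account}


def execute_tool(name: str, args: dict) -> str:
    handler = HANDLERS.get(name)
    if handler is None:
        return json.dumps({"error": "UNKNOWN_TOOL", "detail": f"Unknown tool: {name}"})
    try:
        return json.dumps(handler(args))
    except RecoverableToolError as exc:
        # Recoverable rows of the table: hand the problem back to the LLM
        payload = {"error": exc.error_type, "detail": exc.detail}
        if exc.hint:
            payload["hint"] = exc.hint
        return json.dumps(payload)
    # Anything else propagates: the agent loop should catch it and stop


print(execute_tool("get_user", {"user_id": "alice"}))
```

The benefit of this shape is that each handler only decides which kind of failure it hit; the policy of what reaches the LLM lives in one place.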

Idempotency so retries do not corrupt data

When the LLM retries a tool call (because the first attempt failed or the LLM is unsure whether it succeeded), the tool must be idempotent so the retry does not perform the action twice.

A real-world problem: the agent sends an order confirmation email. The tool fails with a network timeout after 30 seconds. The LLM gets no confirmation and calls again. The email goes out twice. The user complains. Worse: if the tool is charge_credit_card, a non-idempotent retry means the customer is charged twice.

Any tool with an effect outside the process (sending email, writing to a DB, calling an external API, charging money) needs idempotency. Read-only tools (get_user, list_orders) do not; retrying them any number of times is fine.

Pattern 1: Idempotency key from the caller

TOOLS = [
    {
        "name": "send_notification",
        "description": "Send a notification to a user. Include idempotency_key to prevent duplicate sends on retry.",
        "input_schema": {
            "type": "object",
            "properties": {
                "user_id": {"type": "string"},
                "message": {"type": "string"},
                "idempotency_key": {
                    "type": "string",
                    "description": "Unique key for this operation. Use format: '{action}_{resource_id}_{timestamp_seconds}'. Example: 'notify_order_123_1716000000'. Same key = same operation, safe to retry. When retrying, reuse the exact key from the first attempt; do not generate a new timestamp."
                }
            },
            "required": ["user_id", "message", "idempotency_key"]
        }
    }
]

# Server-side dedup
SENT_NOTIFICATIONS = {}  # production: Redis or a DB

def execute_tool(name: str, args: dict) -> str:
    if name == "send_notification":
        key = args["idempotency_key"]

        if key in SENT_NOTIFICATIONS:
            # Already processed: report success without sending again
            return json.dumps({
                "success": True,
                "idempotent": True,
                "detail": f"Notification already sent (key: {key})"
            })

        # Actually send
        result = notification_service.send(args["user_id"], args["message"])
        SENT_NOTIFICATIONS[key] = result
        return json.dumps({"success": True, "idempotent": False})

Pattern 2: Check-then-act

For simple tools that do not need an idempotency key:

def execute_tool(name: str, args: dict) -> str:
    if name == "add_tag_to_user":
        user_id = args["user_id"]
        tag = args["tag"]

        # Check before acting
        existing_tags = database.get_user_tags(user_id)
        if tag in existing_tags:
            return json.dumps({
                "success": True,
                "detail": f"Tag '{tag}' already exists on user {user_id}",
                "action_taken": False
            })

        database.add_tag(user_id, tag)
        return json.dumps({
            "success": True,
            "detail": f"Tag '{tag}' added to user {user_id}",
            "action_taken": True
        })

action_taken in the response tells the LLM whether anything actually changed, so it does not misreport to the user.

Pattern 3: Dedup with a UUID from the agent

If the agent loop has state, generate a UUID once and pass it through every tool call in the same task:

import uuid

def run_agent(user_input: str, task_id: str = None) -> str:
    task_id = task_id or str(uuid.uuid4())
    system_prompt = f"You are processing task {task_id}. Use this task_id as prefix for any idempotency_key: '{task_id}_{{action}}_{{resource}}'."

    messages = [{"role": "user", "content": user_input}]
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1024,
        system=system_prompt,
        tools=TOOLS,
        messages=messages,
    )
    # ...
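A variation on this pattern, sketched here as an alternative rather than the approach above, is to derive the key inside the tool executor instead of asking the LLM to assemble it. The helper name derive_idempotency_key and the key layout are assumptions for illustration. The key is built from the task ID, the tool name, and a hash of the canonicalized args, so a retry with the same args always maps to the same key.

```python
import hashlib
import json
import uuid


def derive_idempotency_key(task_id: str, tool_name: str, args: dict) -> str:
    """Build a deterministic key from the task, the tool, and its arguments."""
    # sort_keys makes the JSON independent of the order the LLM emitted the args in
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(f"{tool_name}:{canonical}".encode("utf-8")).hexdigest()[:16]
    return f"{task_id}_{tool_name}_{digest}"


task_id = str(uuid.uuid4())
first = derive_idempotency_key(
    task_id, "send_notification", {"user_id": "usr_001", "message": "Order shipped"}
)
retry = derive_idempotency_key(
    task_id, "send_notification", {"message": "Order shipped", "user_id": "usr_001"}
)
other = derive_idempotency_key(
    task_id, "send_notification", {"user_id": "usr_001", "message": "Order delayed"}
)
print(first == retry, first == other)
```

The trade-off: two intentionally identical calls within one task would be treated as the same operation, so this fits tools where repeating the exact same action in a task is never desired.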

Server-side validation: do not trust args from the LLM

Even with a clear schema, the server must still validate. The LLM can send the wrong type, omit a field, or send an out-of-range value. Treat the LLM like any other HTTP client: do not trust input blindly.

from typing import Any

def validate_args(tool_name: str, args: dict) -> tuple[bool, str]:
    """Return (valid, error_message)."""
    if tool_name == "send_notification":
        if not isinstance(args.get("user_id"), str) or not args["user_id"]:
            return False, "user_id must be a non-empty string"
        if not isinstance(args.get("message"), str) or len(args["message"]) > 1000:
            return False, "message must be a string of at most 1000 characters"
        key = args.get("idempotency_key", "")
        # Check the type first so len() cannot fail on a non-string value
        if not isinstance(key, str) or not key or len(key) > 128:
            return False, "idempotency_key must be a string of 1-128 characters"
        return True, ""
    return True, ""

def execute_tool(name: str, args: dict) -> str:
    valid, error_msg = validate_args(name, args)
    if not valid:
        return format_error("VALIDATION_ERROR", error_msg, "Check the tool schema for correct argument format")
    # ... continue processing
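Hand-written checks per tool get repetitive as the tool list grows. Since the schema already states the types and ranges, the server can reuse it. The sketch below is a deliberately small validator covering only required, type, enum, minimum, and maximum on top-level fields; validate_against_schema is a hypothetical helper, and a dedicated library such as jsonschema would be the more complete choice in production.

```python
JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate_against_schema(schema: dict, args: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the args are valid."""
    errors = []
    properties = schema.get("properties", {})
    for field in schema.get("required", []):
        if field not in args:
            errors.append(f"missing required field: {field}")
    for name, value in args.items():
        spec = properties.get(name)
        if spec is None:
            errors.append(f"unexpected field: {name}")
            continue
        json_type = spec.get("type")
        expected = JSON_TYPES.get(json_type)
        # bool is a subclass of int in Python, so reject it for numeric types
        bool_as_number = isinstance(value, bool) and json_type in ("integer", "number")
        if expected and (not isinstance(value, expected) or bool_as_number):
            errors.append(f"{name} must be of type {json_type}")
            continue
        if "enum" in spec and value not in spec["enum"]:
            errors.append(f"{name} must be one of {spec['enum']}")
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            if "minimum" in spec and value < spec["minimum"]:
                errors.append(f"{name} must be >= {spec['minimum']}")
            if "maximum" in spec and value > spec["maximum"]:
                errors.append(f"{name} must be <= {spec['maximum']}")
    return errors


schema = {
    "type": "object",
    "properties": {
        "user_ids": {"type": "array", "items": {"type": "string"}},
        "dry_run": {"type": "boolean"},
        "priority": {"type": "integer", "minimum": 1, "maximum": 10},
    },
    "required": ["user_ids"],
}
print(validate_against_schema(schema, {"user_ids": ["u1"], "priority": 5}))
print(validate_against_schema(schema, {"priority": 11}))
```

The returned messages can go straight into the detail field of a VALIDATION_ERROR response, which gives the LLM something concrete to correct.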

A pitfall I always check for: tool names that are too generic

Looking back at the incident from the start of the post, the root cause was not only the tool names. The schema at the time looked like this:

# OLD SCHEMA (dangerous)
{
    "name": "delete_user",
    "description": "Delete a user from the system.",
    "input_schema": {
        "type": "object",
        "properties": {
            "user_id": {"type": "string"}
        },
        "required": ["user_id"]
    }
},
{
    "name": "delete_account",
    "description": "Delete an account from the system.",
    "input_schema": {
        "type": "object",
        "properties": {
            "user_id": {"type": "string"}
        },
        "required": ["user_id"]
    }
}

The two tools have the same structure, their descriptions differ by a single word (“user” vs “account”), and nothing signals severity. When the LLM receives “remove the user”, it is effectively picking at random between the two.

The right fix is to rewrite the descriptions clearly and add a warning for the destructive action:

# NEW SCHEMA (safer)
{
    "name": "deactivate_user_session",
    "description": "Deactivate a user's active login session. The user will be logged out but their account and all data remain intact. This is REVERSIBLE. Use for: logout requests, security incidents, session cleanup.",
    "input_schema": {
        "type": "object",
        "properties": {
            "user_id": {"type": "string", "description": "User identifier."},
            "reason": {
                "type": "string",
                "enum": ["user_request", "security_incident", "admin_action"],
                "description": "Reason for deactivation, used in audit log."
            }
        },
        "required": ["user_id", "reason"]
    }
},
{
    "name": "permanently_delete_account",
    "description": "PERMANENT and IRREVERSIBLE deletion of a user account and ALL associated data. Use ONLY when user explicitly requests account deletion and has confirmed understanding that data cannot be recovered. Do NOT use for logout or temporary access removal.",
    "input_schema": {
        "type": "object",
        "properties": {
            "user_id": {"type": "string"},
            "confirmation_token": {
                "type": "string",
                "description": "Token from user's deletion confirmation email. Required to prevent accidental deletion."
            }
        },
        "required": ["user_id", "confirmation_token"]
    }
}

Two changes matter: the tool names describe the action exactly (deactivate_user_session vs permanently_delete_account), and the descriptions use capitals (PERMANENT, IRREVERSIBLE, Do NOT use) so the LLM does not misjudge severity. Adding confirmation_token to required for the destructive tool is a good safety gate, provided the server verifies the token: without a valid token, the call is rejected even if the LLM tries to make it.
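The gate only holds if the token cannot be guessed or made up by the model. One possible server-side check is sketched below using an HMAC over the user ID. SECRET_KEY, issue_confirmation_token, and verify_confirmation_token are hypothetical names, and a real system would also make tokens expire and be single-use, which this sketch leaves out.

```python
import hashlib
import hmac

SECRET_KEY = b"demo-secret-change-me"  # assumption: loaded from secure config in practice


def issue_confirmation_token(user_id: str) -> str:
    """Create the token that would be sent in the deletion confirmation email."""
    return hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_confirmation_token(user_id: str, token: str) -> bool:
    """Check a token supplied in the tool args against the expected value."""
    expected = issue_confirmation_token(user_id)
    # compare_digest avoids leaking information through timing differences
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))


token = issue_confirmation_token("usr_001")
print(verify_confirmation_token("usr_001", token))
print(verify_confirmation_token("usr_001", "made-up-token"))
```

With a check like this inside permanently_delete_account, a hallucinated token fails verification and the tool can return an error instead of deleting anything.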

Post 11 (Tool design, schema, validation) goes deeper into tool design patterns.

The checklist I use before shipping a tool

Problem	Pattern	Notes
LLM calls the wrong tool	Action-specific name, description that separates severity	Use CAPS for destructive warnings
LLM passes an arg of the wrong type	Validate server-side, return error + hint	Do not trust LLM args
Tool fails, LLM does not know how to fix it	Return error with a `hint` field	State the next step
Tool fails, not recoverable	Raise exception, abort agent	Permission, data constraint
Retry causes a duplicate action	Idempotency key + server-side dedup	Prefix: `{task_id}_{action}_{resource}`
LLM guesses enum values	Declare `enum` in the schema	Describe what each value means
LLM hallucinates a required field	Put only truly mandatory fields in `required`	Leave optional fields out

Anthropic SDK	OpenAI SDK
`input_schema`	`parameters`
`block.input` (dict)	`tool_call.function.arguments` (JSON string)
`stop_reason == "tool_use"`	`finish_reason == "tool_calls"`
`tool_result` in a user message	`role: "tool"` as a separate message

Wrapping up: the tool schema is the real interface

Tools are where the agent touches the world. Writing a good schema is not a side skill; it is the core skill that decides whether the agent does the right thing. The LLM knows a tool only through the description you write. If the description is vague, the agent will guess. And guessing in production is dangerous.

Four points to remember: state in the description when to use the tool, when not to, and what the side effects are. Return errors with a hint so the LLM can correct itself. Raise an exception only when the failure truly cannot be recovered. And for every tool with an external effect, add idempotency.

Next up after tool use is “Control loop: ReAct, agentic loop, stopping conditions”, the part most likely to burn money: the loop decides when to continue, when to stop, and when to abort.