AI Study Roadmap

Tool Calling & Tavily Search#

This page explains how LLMs decide when and how to invoke external tools using structured function calling, with practical examples using the Tavily Search API to give agents access to real-time web information.

Learning Objectives#

  • Understand what tool (function) calling is in LLMs and how it works

  • Integrate external tools with agents effectively

  • Use the Tavily Search API for LLM-optimized web search

  • Build an agent with multiple tools and handle tool orchestration


1. What is Tool Calling?#

1.1 Concept#

Tool Calling (or Function Calling) is the ability of an LLM to:

  • Decide when to use tools: The model decides on its own whether a tool is needed, based on the user query

  • Invoke tools in a structured way: The model emits the tool name and its parameters in a standard format (arguments that follow a JSON schema)

  • Parse tool results: The model receives and processes the results returned by the tool

  • Continue reasoning: The model uses the new information to produce the final response

The model only requests the call; your application code is what actually executes the tool.

Basic Flow:

graph LR
    UQ["User Query"] --> LLM["LLM Analyzes"]
    LLM --> DEC["Decides to Use Tool"]
    DEC --> CALL["Calls Tool<br/>with Params"]
    CALL --> EXEC["Tool Executes"]
    EXEC --> RES["Returns Result"]
    RES --> PROC["LLM Processes"]
    PROC --> FINAL["Final Response"]
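
The flow above can be sketched without any LLM provider. This is a minimal illustration, not a real integration: `fake_model` is a scripted stand-in for the LLM, `get_weather` returns canned data, and the message shapes are simplified assumptions made for this example.

```python
import json

def get_weather(city: str) -> dict:
    """Stand-in tool: returns canned data instead of calling a real API."""
    return {"city": city, "forecast": "sunny"}

# Registry mapping tool names to Python callables
TOOLS = {"get_weather": get_weather}

def fake_model(messages: list[dict]) -> dict:
    """Scripted stand-in for an LLM: request a tool first, then answer."""
    tool_results = [m for m in messages if m["role"] == "tool"]
    if not tool_results:
        # "Decides to use tool": emit a structured call instead of text
        return {
            "role": "assistant",
            "content": None,
            "tool_call": {"name": "get_weather",
                          "arguments": json.dumps({"city": "Hanoi"})},
        }
    # "LLM processes": turn the tool result into a final response
    data = json.loads(tool_results[-1]["content"])
    return {"role": "assistant",
            "content": f"The forecast for {data['city']} is {data['forecast']}."}

def run_loop(user_query: str, max_steps: int = 5) -> str:
    messages = [{"role": "user", "content": user_query}]
    for _ in range(max_steps):
        reply = fake_model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:
            return reply["content"]  # no tool requested: final response
        # "Tool executes": the application runs the tool, not the model
        arguments = json.loads(call["arguments"])
        result = TOOLS[call["name"]](**arguments)
        messages.append({"role": "tool", "content": json.dumps(result)})
    raise RuntimeError("model did not produce a final answer")

answer = run_loop("What's the weather in Hanoi?")
print(answer)
```

The `max_steps` cap is a design choice worth keeping in real agents too, since a model can otherwise keep requesting tools indefinitely.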
    

1.2 Why Tool Calling?#

  • Extend LLM capabilities: Work around the knowledge cutoff and the limits of the training data

  • Access real-time data: Fetch live information (weather, stock prices, news)

  • Perform actions: Trigger side effects in external systems (send an email, create a ticket, update a database)

  • Integrate with APIs: Connect to external services and third-party APIs

1.3 Function Calling vs Tool Use#

| Function Calling | Tool Use |
|---|---|
| OpenAI terminology | LangChain/Anthropic terminology |
| JSON schema for functions | Tool interface with description |
| Returns a function call object | Returns a tool invocation |
| Commonly used for OpenAI models | Framework-agnostic approach |


2. OpenAI Tool Calling (Modern SDK)#

2.1 API overview#

Define tools with JSON schema. Note: OpenAI replaced the old functions= parameter with a unified tools= parameter; each tool is wrapped in a {"type": "function", "function": {...}} envelope:

tools = [
    {
        "type": "function",
        "function": {
            "name": "search_web",
            "description": "Search the web for current information about a topic",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query"
                    },
                    "num_results": {
                        "type": "integer",
                        "description": "Number of results to return",
                        "default": 5
                    }
                },
                "required": ["query"]
            }
        }
    }
]
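
Writing these schemas by hand gets repetitive, so frameworks usually derive them from function signatures. The sketch below shows the idea with the standard library only. `function_to_tool` is a hypothetical helper written for this example (not an OpenAI or LangChain API), and it only handles simple scalar annotations.

```python
import inspect

# Minimal mapping from Python annotations to JSON Schema type names
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}

def function_to_tool(func) -> dict:
    """Build an OpenAI-style tool definition from a function signature."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        # Unknown or missing annotations fall back to "string" in this sketch
        properties[name] = {"type": _JSON_TYPES.get(param.annotation, "string")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default value means the argument is required
        else:
            properties[name]["default"] = param.default
    return {
        "type": "function",
        "function": {
            "name": func.__name__,
            "description": inspect.getdoc(func) or "",
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }

def search_web(query: str, num_results: int = 5) -> list:
    """Search the web for current information about a topic"""
    return []

schema = function_to_tool(search_web)
print(schema["function"]["parameters"])
```

Per-parameter descriptions are omitted here; in practice they matter, because the model relies on them to fill in arguments correctly.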

2.2 Request#

from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment

# Keep the conversation in a list so the tool result can be appended to it later
messages = [
    {"role": "user", "content": "What's the weather in Hanoi?"}
]

response = client.chat.completions.create(
    model="gpt-5.2",
    messages=messages,
    tools=tools,
    tool_choice="auto",  # "auto", "none", "required", or {"type": "function", "function": {"name": "..."}}
)

2.3 Response#

# response.choices[0].message looks like:
{
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_abc123",
            "type": "function",
            "function": {
                "name": "search_web",
                "arguments": '{"query": "weather Hanoi today"}'
            }
        }
    ]
}

2.4 Executing the tool#

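import json

assistant_msg = response.choices[0].message
# The model may return several tool calls; this walkthrough handles the first one
tool_call = assistant_msg.tool_calls[0]
function_name = tool_call.function.name
# Arguments arrive as a JSON string and must be parsed before use
arguments = json.loads(tool_call.function.arguments)

if function_name == "search_web":
    # search_web is your own Python implementation of the tool declared in `tools`
    result = search_web(**arguments)
else:
    raise ValueError(f"Unknown tool requested: {function_name}")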
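
When a response contains several tool calls, or a call the application does not recognize, a small dispatcher keeps the handling uniform. The sketch below works on plain dicts shaped like the `tool_calls` entries shown in 2.3; `TOOL_REGISTRY`, `execute_tool_calls`, and the stand-in `search_web` are names assumed for this example.

```python
import json

def search_web(query: str, num_results: int = 5) -> list[str]:
    """Stand-in tool; a real implementation would call a search API."""
    return [f"result {i + 1} for {query!r}" for i in range(num_results)]

TOOL_REGISTRY = {"search_web": search_web}

def execute_tool_calls(tool_calls: list[dict]) -> list[dict]:
    """Run every requested tool and build one 'tool' message per call."""
    tool_messages = []
    for call in tool_calls:
        name = call["function"]["name"]
        func = TOOL_REGISTRY.get(name)
        if func is None:
            payload = {"error": f"unknown tool: {name}"}
        else:
            try:
                arguments = json.loads(call["function"]["arguments"])
                payload = {"result": func(**arguments)}
            except (json.JSONDecodeError, TypeError) as exc:
                # Malformed JSON, or arguments that do not match the signature
                payload = {"error": f"invalid arguments: {exc}"}
        # Errors are reported back to the model instead of crashing the loop
        tool_messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(payload),
        })
    return tool_messages

sample_calls = [
    {"id": "call_1", "type": "function",
     "function": {"name": "search_web",
                  "arguments": '{"query": "weather Hanoi today", "num_results": 2}'}},
    {"id": "call_2", "type": "function",
     "function": {"name": "send_email", "arguments": "{}"}},
    {"id": "call_3", "type": "function",
     "function": {"name": "search_web", "arguments": "{not json"}},
]
tool_messages = execute_tool_calls(sample_calls)
```

Returning the error as tool output gives the model a chance to correct its arguments on the next turn.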

2.5 Continuation — send the tool result back#

messages.append(assistant_msg)  # preserve the tool_call in history
messages.append({
    "role": "tool",
    "tool_call_id": tool_call.id,
    "content": json.dumps(result),
})

final_response = client.chat.completions.create(
    model="gpt-5.2",
    messages=messages,
)

Note: OpenAI’s newer Responses API (client.responses.create(...)) is now the recommended surface for agent-style workloads. The Chat Completions API shown above is still fully supported and is the simplest path for understanding the tool-calling loop.


3. LangChain Tools#

3.1 Tool Interface#

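import os

from langchain_core.tools import Tool
from tavily import TavilyClient

tavily_client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

def search_function(query: str) -> str:
    """Search implementation"""
    # Call the Tavily search API and return the results as text
    results = tavily_client.search(query)
    return str(results)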

search_tool = Tool(
    name="WebSearch",
    func=search_function,
    description="Useful for searching the web for current information. Input should be a search query string."
)

3.2 @tool Decorator#

from langchain.tools import tool

@tool
def calculator(expression: str) -> str:
    """Useful for performing mathematical calculations.
    Input should be a valid Python mathematical expression."""
    try:
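        # WARNING: eval() executes arbitrary Python code, and the expression
        # comes from the model. Use a restricted evaluator in production.
        result = eval(expression)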
        return f"Result: {result}"
    except Exception as e:
        return f"Error: {str(e)}"

@tool
def get_current_time(timezone: str = "UTC") -> str:
    """Get current time in specified timezone.
    Input should be timezone string like 'UTC', 'Asia/Ho_Chi_Minh'."""
    from datetime import datetime
    import pytz
    tz = pytz.timezone(timezone)
    return datetime.now(tz).strftime("%Y-%m-%d %H:%M:%S %Z")
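
Because the calculator tool above passes model-generated text to `eval`, a restricted evaluator is a safer default. The following is one possible sketch using the standard `ast` module; `safe_eval` is a name chosen for this example, and it deliberately supports only basic arithmetic (no names, calls, or exponentiation).

```python
import ast
import operator

# Whitelisted operators; anything else is rejected
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.USub: operator.neg, ast.UAdd: operator.pos}

def safe_eval(expression: str):
    """Evaluate a plain arithmetic expression without calling eval()."""
    def _eval(node):
        if isinstance(node, ast.Constant):
            # Accept numbers only (bool is a subclass of int, so exclude it)
            if isinstance(node.value, (int, float)) and not isinstance(node.value, bool):
                return node.value
        elif isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
            return _BINARY_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        elif isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")

    tree = ast.parse(expression, mode="eval")
    return _eval(tree.body)

print(safe_eval("2 + 3 * 4"))
```

Inside the tool, `eval(expression)` could then be swapped for `safe_eval(expression)`; the existing `except Exception` branch already turns the `ValueError` into an error string for the model.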

3.3 Built-in Tools#

# DuckDuckGo Search
from langchain_community.tools import DuckDuckGoSearchResults
search = DuckDuckGoSearchResults()

# Wikipedia
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper
wikipedia = WikipediaQueryRun(api_wrapper=WikipediaAPIWrapper())

# Python REPL (lives in langchain-experimental because it executes arbitrary code)
from langchain_experimental.tools import PythonREPLTool
python_repl = PythonREPLTool()

# File Management
from langchain_community.tools import ReadFileTool, WriteFileTool
read_file = ReadFileTool()
write_file = WriteFileTool()

4. Tavily Search API#

4.1 Introduction to Tavily#

Tavily is a search API built for AI applications:

  • AI-optimized results: Results are pre-formatted for LLM consumption

  • Designed for LLMs and RAG: Easy integration with AI workflows

  • Clean, relevant results: Filters out noise and returns concise, relevant content

  • Real-time web search: Queries the live web, so results reflect current information

4.2 Features#

  • Web search: Search entire web with optimized ranking

  • News search: Specialized for latest news

  • Answer mode: Returns direct answers instead of a list of links

  • Context optimization: Optimize context for RAG applications

4.3 Getting Started#

# 1. Sign up at tavily.com
# 2. Get API key from dashboard
# 3. Install SDK
pip install tavily-python

4.4 Basic Usage#

from tavily import TavilyClient

# Initialize client
client = TavilyClient(api_key="tvly-xxxxxxxxxxxxx")

# Basic search
response = client.search(
    query="LangGraph tutorial 2025"
)

print(response["results"])

4.5 Search Parameters#

response = client.search(
    query="climate change solutions",           # Required
    search_depth="advanced",                    # "basic" or "advanced"
    max_results=10,                             # Max number of results
    include_domains=["edu", "gov"],             # Filter domains
    exclude_domains=["example.com"],            # Block domains
    include_answer=True,                        # Get AI-generated answer
    include_raw_content=False,                  # Include full page content
    include_images=True                         # Include image URLs
)

4.6 Response Structure#

{
  "query": "climate change solutions",
  "follow_up_questions": null,
  "answer": "Several effective climate change solutions include...",
  "images": ["https://...", "https://..."],
  "results": [
    {
      "title": "10 Solutions for Climate Change",
      "url": "https://example.com/article",
      "content": "Clean summary of the page content...",
      "score": 0.98,
      "raw_content": null
    }
  ],
  "response_time": 1.23
}
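
Before handing a response like this to an LLM, it is common to drop low-scoring results and flatten the rest into a compact context string. The helper below is a sketch over the response shape shown above; `format_results`, the `min_score` threshold, and the second sample result are assumptions made for illustration.

```python
def format_results(response: dict, min_score: float = 0.5, max_chars: int = 300) -> str:
    """Turn a Tavily-style response dict into a numbered context block."""
    sections = []
    if response.get("answer"):
        sections.append(f"Answer: {response['answer']}")

    # Keep only results at or above the relevance threshold, best first
    kept = [r for r in response.get("results", []) if r.get("score", 0.0) >= min_score]
    kept.sort(key=lambda r: r.get("score", 0.0), reverse=True)

    for index, result in enumerate(kept, start=1):
        snippet = result["content"][:max_chars]  # cap the length to save tokens
        sections.append(f"[{index}] {result['title']} ({result['url']})\n{snippet}")
    return "\n\n".join(sections)

# Sample data in the documented shape; the low-scoring entry is made up
sample_response = {
    "query": "climate change solutions",
    "answer": None,
    "results": [
        {"title": "Unrelated page", "url": "https://example.com/other",
         "content": "Mostly off-topic text.", "score": 0.31},
        {"title": "10 Solutions for Climate Change", "url": "https://example.com/article",
         "content": "Clean summary of the page content...", "score": 0.98},
    ],
}
context = format_results(sample_response)
print(context)
```

Numbering the sources also makes it easy to ask the model to cite them as `[1]`, `[2]`, and so on.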

5. TavilySearch Tool#

5.1 LangChain Integration#

Tavily ships an official LangChain integration via the langchain-tavily package (the older langchain_community.tools.tavily_search path is deprecated):

pip install -U langchain-tavily

import os
from langchain_tavily import TavilySearch

# Set API key
os.environ["TAVILY_API_KEY"] = "tvly-xxxxxxxxxxxxx"

# Create the tool
search = TavilySearch(
    max_results=5,
    search_depth="advanced",
    include_answer=True,
    include_raw_content=False,
    include_domains=[],
    exclude_domains=[],
)

# Use the tool
result = search.invoke({"query": "latest AI developments"})

Drop the resulting search tool into any create_agent(..., tools=[search]) call to give the agent live web access.

6. Advanced Tool Patterns#

6.1 Tool Chaining#

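The output of one tool becomes the input to another:

@tool
def search_company(company_name: str) -> str:
    """Search for company information"""
    # TavilySearch returns a dict; convert it to text for the next step
    return str(search.invoke({"query": f"{company_name} official website"}))

def fetch_price(ticker: str) -> float:
    """Hypothetical placeholder: replace with a call to your market-data provider."""
    raise NotImplementedError

@tool
def get_stock_price(ticker: str) -> str:
    """Get stock price from ticker symbol"""
    # ticker extracted from search_company result
    price = fetch_price(ticker)
    return f"${price:.2f}"

# Chain: company name → search → extract ticker → get price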

7. Error Handling#

7.1 API Failures#

from tenacity import retry, stop_after_attempt, wait_exponential

@retry(
    stop=stop_after_attempt(3),
    wait=wait_exponential(multiplier=1, min=2, max=10)
)
def call_tavily_with_retry(query):
    try:
        return client.search(query)
    except Exception as e:
        print(f"API call failed: {e}")
        raise
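
To make the retry behaviour concrete, here is a dependency-free sketch of retrying with exponential backoff. It is not tenacity's implementation: `retry_with_backoff` and its parameters are names assumed for this example, and the `sleep` argument is injectable so the delays can be inspected without waiting.

```python
import time

def retry_with_backoff(func, attempts: int = 3, base_delay: float = 2.0,
                       max_delay: float = 10.0, sleep=time.sleep):
    """Call func(); on failure wait base_delay, 2x, 4x... (capped) and retry."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise  # out of attempts: surface the last error
            delay = min(base_delay * 2 ** (attempt - 1), max_delay)
            sleep(delay)

# Demo: a function that fails twice, then succeeds
calls = []
delays = []

def flaky_search():
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("temporary failure")
    return {"results": []}

value = retry_with_backoff(flaky_search, sleep=delays.append)
print(value, delays)
```

In production it is usually better to retry only on transient errors (timeouts, HTTP 429 or 5xx) rather than on every exception, since retrying an invalid request just wastes quota.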

7.2 Timeout Handling#

import asyncio

async def execute_with_timeout(tool_func, timeout=30):
    try:
        result = await asyncio.wait_for(
            tool_func(),
            timeout=timeout
        )
        return result
    except asyncio.TimeoutError:
        return "Tool execution timeout"
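
Note that `asyncio.wait_for` only works with awaitables, so a synchronous tool (such as a blocking HTTP call) has to be moved to a worker thread first. A minimal sketch, assuming Python 3.9+ for `asyncio.to_thread`; `slow_tool` is a stand-in for a blocking tool call.

```python
import asyncio
import time

def slow_tool(seconds: float) -> str:
    """Stand-in for a blocking tool call."""
    time.sleep(seconds)
    return "done"

async def run_sync_tool_with_timeout(func, *args, timeout: float = 30.0):
    try:
        # Run the blocking function in a worker thread so it can be awaited
        return await asyncio.wait_for(asyncio.to_thread(func, *args), timeout=timeout)
    except asyncio.TimeoutError:
        # The caller stops waiting, but the worker thread itself is not killed
        return "Tool execution timeout"

fast = asyncio.run(run_sync_tool_with_timeout(slow_tool, 0.01, timeout=1.0))
slow = asyncio.run(run_sync_tool_with_timeout(slow_tool, 0.3, timeout=0.05))
print(fast, "|", slow)
```

Because the thread keeps running after the timeout, tools with side effects should also enforce their own timeout at the network or client level.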

8. Optimization#

8.1 Caching Tool Results#

from functools import lru_cache
import hashlib

# Simple LRU cache
@lru_cache(maxsize=100)
def cached_search(query: str):
    return client.search(query)

# Alternative: cache with a time-to-live (TTL) so stale results expire
from datetime import datetime, timedelta

cache = {}
CACHE_TTL = timedelta(hours=1)

def search_with_cache(query):
    cache_key = hashlib.md5(query.encode()).hexdigest()

    if cache_key in cache:
        result, timestamp = cache[cache_key]
        if datetime.now() - timestamp < CACHE_TTL:
            return result

    result = client.search(query)
    cache[cache_key] = (result, datetime.now())
    return result

8.2 Rate Limiting#

from ratelimit import limits, sleep_and_retry

# Max 10 calls per minute
@sleep_and_retry
@limits(calls=10, period=60)
def rate_limited_search(query):
    return client.search(query)

8.3 Batching Requests#

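import asyncio
import os
from tavily import AsyncTavilyClient

async_client = AsyncTavilyClient(api_key=os.environ["TAVILY_API_KEY"])  # the sync client has no async methods

async def batch_search(queries: list[str]):
    """Execute multiple searches concurrently"""
    tasks = [async_client.search(q) for q in queries]
    results = await asyncio.gather(*tasks)
    return results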
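
Firing every query at once can trip provider rate limits, and by default a single failure makes `asyncio.gather` raise. The sketch below bounds concurrency with a semaphore and collects errors per query. `fake_search` is a stand-in for an async search call, and `max_concurrency=3` is an arbitrary value chosen for the example.

```python
import asyncio

async def fake_search(query: str) -> dict:
    """Stand-in for an async search call; fails on empty queries."""
    await asyncio.sleep(0.01)
    if not query:
        raise ValueError("empty query")
    return {"query": query, "results": []}

async def batch_search_bounded(queries: list[str], search=fake_search,
                               max_concurrency: int = 3) -> list:
    semaphore = asyncio.Semaphore(max_concurrency)

    async def search_one(query: str):
        async with semaphore:  # at most max_concurrency searches in flight
            return await search(query)

    # return_exceptions=True keeps one failed query from cancelling the batch;
    # failures come back as exception objects in the matching position
    return await asyncio.gather(*(search_one(q) for q in queries),
                                return_exceptions=True)

results = asyncio.run(batch_search_bounded(["langgraph", "", "tavily"]))
for item in results:
    print("error:" if isinstance(item, Exception) else "ok:", item)
```

Results keep the order of the input queries, so they can be zipped back together afterwards.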

9. Custom Tools#

9.1 Creating Custom Tool#

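import os

from langchain.tools import BaseTool
from pydantic import BaseModel, Field
from tavily import AsyncTavilyClient, TavilyClient

client = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
async_client = AsyncTavilyClient(api_key=os.environ["TAVILY_API_KEY"])

class SearchInput(BaseModel):
    query: str = Field(description="Search query")
    domain: str = Field(description="Specific domain to search, e.g. 'python.org'")

class DomainSearchTool(BaseTool):
    # Pydantic v2 requires type annotations on overridden fields
    name: str = "domain_search"
    description: str = "Search within a specific domain"
    args_schema: type[BaseModel] = SearchInput

    def _run(self, query: str, domain: str) -> str:
        """Synchronous implementation"""
        # include_domains restricts results to the given domain
        response = client.search(query, include_domains=[domain])
        return str(response["results"])

    async def _arun(self, query: str, domain: str) -> str:
        """Async implementation"""
        response = await async_client.search(query, include_domains=[domain])
        return str(response["results"])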

10. Security Considerations#

10.1 API Key Management#

# ❌ Never hardcode
client = TavilyClient(api_key="tvly-xxxxx")

# ✅ Use environment variables
import os
from dotenv import load_dotenv

load_dotenv()
client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))

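# ✅ Use secret management
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

secret_client = SecretClient(
    vault_url="https://<your-vault-name>.vault.azure.net",  # placeholder: your Key Vault URL
    credential=DefaultAzureCredential(),
)
api_key = secret_client.get_secret("tavily-api-key").value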

11. Best Practices#

11.1 Tool Descriptions#

# ❌ Vague description
@tool
def search(query: str) -> str:
    """Search the web"""
    pass

# ✅ Clear, detailed description
@tool
def search(query: str) -> str:
    """
    Search the web for current information using Tavily API.
    Best used for: recent news, current events, factual information.
    Input should be a clear, specific search query.
    Returns: Top 5 relevant web results with summaries.
    """
    pass

Structured Outputs & Modern Tool Patterns#

Structured Outputs (JSON Schema Enforcement)#

Modern LLMs now support constrained decoding — guaranteeing that outputs conform to a JSON schema:

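  • OpenAI: response_format={"type": "json_schema", "json_schema": {"name": "...", "strict": true, "schema": {...}}}. Note that strict sits inside the json_schema object. With strict enabled, constrained decoding restricts generation to outputs that match the schema; this guarantees the structure, not the factual accuracy of the values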

  • Anthropic Claude: Uses tool_use with high reliability; also supports JSON mode via system prompts

  • Self-hosted (vLLM): XGrammar backend supports JSON Schema, Pydantic, and regex constraints at inference time
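
For reference, this is roughly what a complete strict `response_format` payload for OpenAI looks like when written out by hand. The schema name `company_info` and its fields are illustrative assumptions. In strict mode every property is expected to appear in `required`, and `additionalProperties` must be `False`.

```python
# Hand-written equivalent of a strict structured-output request format
company_info_format = {
    "type": "json_schema",
    "json_schema": {
        "name": "company_info",  # illustrative schema name
        "strict": True,          # enables schema-constrained decoding
        "schema": {
            "type": "object",
            "properties": {
                "name": {"type": "string"},
                "industry": {"type": "string"},
                "revenue_usd": {"type": "number"},
                "employee_count": {"type": "integer"},
            },
            # Strict mode: list every property and forbid extras
            "required": ["name", "industry", "revenue_usd", "employee_count"],
            "additionalProperties": False,
        },
    },
}

schema = company_info_format["json_schema"]["schema"]
print(sorted(schema["properties"]) == sorted(schema["required"]))
```

The dict would be passed as `response_format=company_info_format` to `client.chat.completions.create(...)`.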

# OpenAI Structured Output with Pydantic
from pydantic import BaseModel
from openai import OpenAI

class CompanyInfo(BaseModel):
    name: str
    industry: str
    revenue_usd: float
    employee_count: int

client = OpenAI()
result = client.beta.chat.completions.parse(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me about Anthropic"}],
    response_format=CompanyInfo,
)
company = result.choices[0].message.parsed  # CompanyInfo instance

Decision guide: pure data extraction → structured outputs; agentic multi-action → function/tool calling.

Model Context Protocol (MCP)#

MCP is an open standard (initiated by Anthropic, now governed by the Linux Foundation) that provides a universal interface for connecting LLMs to external tools and data sources.

Key concepts:

  • MCP Server: Exposes tools and resources via a standardized protocol

  • MCP Client: The LLM host (Claude, VS Code, Cursor) that connects to servers

  • Transport: Communication layer (Streamable HTTP for remote, stdio for local)

graph LR
    LLM["LLM Client<br/>(Claude, Cursor, VS Code)"] <-->|"MCP Protocol"| S1["MCP Server<br/>GitHub"]
    LLM <-->|"MCP Protocol"| S2["MCP Server<br/>Database"]
    LLM <-->|"MCP Protocol"| S3["MCP Server<br/>Slack"]
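
Under the hood, MCP messages are JSON-RPC 2.0, and a client invokes a server tool with the `tools/call` method. The sketch below only builds and parses the messages (no transport); the tool name `search_issues`, its arguments, and the sample response text are hypothetical.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids

def make_tool_call_request(name: str, arguments: dict) -> str:
    """Serialize an MCP tools/call request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

def extract_text(response_json: str) -> str:
    """Pull the text blocks out of a tools/call response."""
    response = json.loads(response_json)
    if "error" in response:  # protocol-level failure
        raise RuntimeError(response["error"]["message"])
    result = response["result"]
    text = "\n".join(block["text"] for block in result["content"]
                     if block["type"] == "text")
    if result.get("isError"):  # the tool itself reported a failure
        raise RuntimeError(text)
    return text

request = make_tool_call_request("search_issues", {"query": "login bug"})

# Hypothetical server reply for the request above
sample_response = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {"content": [{"type": "text", "text": "Found 2 matching issues"}],
               "isError": False},
})
text = extract_text(sample_response)
print(text)
```

In practice an MCP SDK handles this framing, plus initialization and tool discovery, so application code rarely builds these messages directly.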
    

Why MCP matters:

  • 10,000+ public MCP servers as of early 2026

  • Adopted by OpenAI, Google, VS Code, Cursor, Windsurf, GitHub Copilot

  • Replaces one-off API integrations with a universal standard

  • Think of it as “USB-C for AI tools” — one protocol to connect everything

MCP is covered in detail in the Advanced tier.

Parallel Tool Execution#

Modern agents can invoke multiple tools simultaneously when the tools are independent:

# LangGraph parallel tool execution
from langgraph.prebuilt import ToolNode

tool_node = ToolNode([search_tool, calculator_tool, weather_tool])
# When the LLM returns multiple tool_calls in one response,
# ToolNode executes them in parallel automatically

This significantly reduces latency for multi-tool queries (e.g., “What’s the weather AND stock price?”).
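
The latency benefit is easy to see with a framework-free sketch. The code below runs independent tool calls in a thread pool; it illustrates the general idea only and is not how `ToolNode` is implemented. Both tools are stand-ins that sleep to simulate network latency, and the call dict shape is an assumption.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def get_weather(city: str) -> str:
    time.sleep(0.2)  # simulated network latency
    return f"Weather in {city}: sunny (demo data)"

def get_stock_price(ticker: str) -> str:
    time.sleep(0.2)  # simulated network latency
    return f"{ticker}: price unavailable in this demo"

TOOLS = {"get_weather": get_weather, "get_stock_price": get_stock_price}

def run_parallel(tool_calls: list[dict]) -> list[dict]:
    """Execute independent tool calls concurrently, preserving call order."""
    with ThreadPoolExecutor(max_workers=len(tool_calls)) as pool:
        futures = [pool.submit(TOOLS[call["name"]], **call["args"])
                   for call in tool_calls]
        return [{"tool_call_id": call["id"], "content": future.result()}
                for call, future in zip(tool_calls, futures)]

tool_calls = [
    {"id": "call_1", "name": "get_weather", "args": {"city": "Hanoi"}},
    {"id": "call_2", "name": "get_stock_price", "args": {"ticker": "FPT"}},
]

start = time.perf_counter()
outputs = run_parallel(tool_calls)
elapsed = time.perf_counter() - start
print(f"{elapsed:.2f}s for {len(outputs)} calls")
```

Run sequentially, the two calls would take about the sum of their latencies; in parallel the total is close to the slowest single call. Each output keeps its `tool_call_id` so results can be matched back to the originating call.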


12. PRACTICE#

  • Starting from the Research Agent in module 02, add another agent called Web Search (backed by the Tavily tool) to handle web search requests. The coordinator must advise the user to use web search for research requests.

    • The web search tool must be callable in parallel with the other tools.


Last updated on 16/04/2026 14:22 (UTC+7).

Copyright © 2025-2026 FSOFT.FHN.NGT AI Vanguard team.