Building Ahead: a multi-tool AI Travel Agent with OpenAI + SerpApi
Modern travel planning demands fresh, structured signals and predictable automation. Flight availability, hotel inventory, and local recommendations are time-sensitive and often encoded in vertical search outputs. AI agents are well suited to this problem: a reliable AI travel agent combines the reasoning capabilities of a language model with direct access to vertical search APIs.
This post describes a follow-up to the research agent pattern (covered in a prior AI agent blog post): a travel planning agent that plans its retrievals, verifies structured inputs (IATA codes), runs targeted vertical searches (Flights, Hotels, Local/Maps, Web), and synthesizes concise, cited itineraries.
Full code available here: https://github.com/serpapi/travel-planning-agent
Use cases
The travel agent’s goal is to help travelers discover new destinations faster and plan trips better. It is also very cheap to run compared to hiring a travel agency or a travel advisor.
- Itinerary assembly: curate flight options, hotel availability, possible travel experiences and return a brief, actionable plan with source links.
- Discovery workflows: surface destination ideas when only soft constraints are provided (season, tone, budget).
- Constraint-driven planning: honor explicit limits (max price, cabin class, passenger mix, all-inclusive) and produce ranked options.
- Audit and reproducibility: save a JSON trace of the model’s planned tool calls and returned snippets for debugging, compliance, or user review.
Each use case benefits from the same core pattern: explicit planning by the model, concurrent retrieval by the host, and a single synthesis step that ties everything together with citations.
How it works — high level
Three coordinated stages form the backbone of the agent:
Plan. The model emits a batch of structured tool calls that enumerate the data required — IATA lookups, candidate flight date windows, hotel queries for target neighborhoods, and local POI searches. The set of tool calls is produced before any external requests are executed so coverage is explicit and auditable.
Execute. The host executes the model’s tool calls. Independent calls are dispatched concurrently to reduce latency (thread pool or async execution). Each tool returns compact, structured snippets (title/snippet/price/link), which are appended to the conversation as `tool` messages and associated with the originating `tool_call_id`.
Synthesize. With the retrieved snippets in context the model composes a final answer: a concise itinerary, ranked flight/hotel options, local recommendations, and footnote-style citations that map directly to the returned URLs.
This plan → execute → synthesize loop preserves an auditable trace and minimizes token waste while ensuring the model reasons over current, structured data.
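The execute stage can be sketched with a thread pool that keeps every result keyed by its `tool_call_id`, so the later synthesis step can cite each snippet. This is a minimal sketch; the function names are illustrative, not the repository's actual API.

```python
from concurrent.futures import ThreadPoolExecutor

def execute_batch(tool_calls, execute_tool, max_workers=8):
    """Run independent tool calls concurrently, keeping each result paired
    with its originating tool_call_id so citations stay traceable."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {call["id"]: pool.submit(execute_tool, call["name"], call["args"])
                   for call in tool_calls}
    # Leaving the `with` block waits for all futures; collect results by id.
    return {call_id: fut.result() for call_id, fut in futures.items()}
```

Dependent calls (such as flight searches that need resolved IATA codes) are excluded from the batch and run after their prerequisites complete.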
Minimal setup
- Python 3.9+
- Environment variables: `OPENAI_API_KEY`, `SERPAPI_API_KEY`
- Libraries: `openai` (or the official OpenAI SDK used for function/tool calls) and `serpapi` for vertical search access
You can install necessary libraries using:
```
pip install -r requirements.txt
export OPENAI_API_KEY="..."
export SERPAPI_API_KEY="..."
```
You can obtain a SerpApi API key on the SerpApi website. The free plan offers 250 searches per month, so you can test the agent at no cost.
A CLI wrapper supports interactive chat and one-shot queries and can optionally persist the full JSON trace for auditing.
Key operational rules
The agent follows a small set of explicit operational rules encoded in the system prompt and enforced by the host:
- IATA verification: the flights tool requires 3-letter IATA codes. Before calling the flights endpoint the model must resolve and verify airport codes via a web lookup (e.g., "Warsaw IATA code → WAW"). This prevents invalid flight queries and reduces tool errors.
- Date disambiguation: ambiguous dates are interpreted as future travel. If a referenced month has already passed in the current year, the agent interprets it as next year. For vague phrasing such as “mid-May,” the agent probes a small date window (for example 13–17 May) and searches multiple candidate windows rather than a single day.
- Reasonable defaults: when the user omits details, the agent assumes sensible defaults and surfaces them in the response (example defaults: economy cabin, up to one stop, 2 adults, ±3 days flexibility). Tone in the query (e.g., “luxury”) adjusts defaults (premium cabins, higher star hotels).
- Batching and parallelism: the model should emit all needed tool calls in a single assistant message when multiple external queries are required. The host executes independent calls concurrently to reduce latency.
- Transparency and citations: final outputs include footnote-style citations that link back to the URLs returned by the vertical APIs. The JSON trace preserves `tool_call_id` → result associations for reproducibility.
Encoding these rules as part of the system prompt plus light host-side validation produces predictable, auditable behavior and fewer failed calls.
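The first two rules lend themselves to tiny host-side helpers. Below is a minimal sketch (helper names are illustrative, not from the repository) that validates IATA codes before a flights call and rolls months that have already passed into next year:

```python
import re
from datetime import date
from typing import Optional

IATA_RE = re.compile(r"[A-Z]{3}")

def ensure_iata(code: str) -> str:
    """Normalize and validate a 3-letter IATA code before any flights call."""
    code = code.strip().upper()
    if not IATA_RE.fullmatch(code):
        # Fail loudly so the model retries via a search_web lookup instead.
        raise ValueError(f"{code!r} is not an IATA code; resolve it with search_web first")
    return code

def resolve_year(month: int, today: Optional[date] = None) -> int:
    """Interpret a bare month as future travel: months already past roll to next year."""
    today = today or date.today()
    return today.year + 1 if month < today.month else today.year
```

Rejecting bad codes with an error (rather than silently guessing) gives the model a structured failure it can recover from in the next round.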
Tooling integration
The model is equipped with multiple tools. For each tool, a schema defines the structure the model must output to request results from that tool. Below we show the Google Flights tool integration, since it is one of the core tools for travel planning; the other tools are integrated in a similar fashion.
```python
# Representative tool schema (flights) and host-side mapping (conceptual)

# Model-facing function schema (what the LLM can call)
{
    "type": "function",
    "function": {
        "name": "search_flights",
        "description": "Find flights given IATA airport codes and dates (departure_date required).",
        "parameters": {
            "type": "object",
            "properties": {
                "departure": {"type": "string"},
                "destination": {"type": "string"},
                "departure_date": {"type": "string", "description": "YYYY-MM-DD"},
                "return_date": {"type": "string", "description": "YYYY-MM-DD (optional)"},
                "cabin": {"type": "string"},
                "max_price": {"type": "string"}
            },
            "required": ["departure", "destination", "departure_date"]
        }
    }
}

# Host-side behavior (conceptual):
# 1) Validate IATA codes (3 letters). If not present, run search_web to resolve.
# 2) Normalize dates and params.
# 3) Call the SerpApi Flights engine:
#    params = {"engine": "google_flights", "api_key": SERPAPI_API_KEY,
#              "departure_id": dep_code, "arrival_id": arr_code,
#              "outbound_date": departure_date, ...}
# 4) Normalize returned results to: {price, total_duration_min, legs, link}
# 5) Append the normalized JSON to the conversation as a `tool` message with the tool_call_id.
```
This integration pattern enforces a clear contract: the model requests flights using the declared schema, the host validates and materializes the request against SerpApi, and the model receives compact, normalized results suitable for synthesis and citation.
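As a concrete sketch of the parameter mapping and result normalization, both can be pure functions, which keeps them testable without network access. The helper names here are illustrative, and the response fields (`best_flights`, `other_flights`, `total_duration`, `search_metadata.google_flights_url`) follow the shape of SerpApi's Google Flights responses:

```python
import os

def build_flight_params(departure, destination, departure_date,
                        return_date=None, cabin=None, max_price=None):
    """Map the model's search_flights arguments onto SerpApi Google Flights params."""
    params = {
        "engine": "google_flights",
        "api_key": os.environ.get("SERPAPI_API_KEY", ""),
        "departure_id": departure.upper(),
        "arrival_id": destination.upper(),
        "outbound_date": departure_date,
    }
    if return_date:
        params["return_date"] = return_date
    return params

def normalize_flights(raw, limit=3):
    """Reduce a raw Google Flights response to compact, citable snippets."""
    link = raw.get("search_metadata", {}).get("google_flights_url")
    flights = raw.get("best_flights", []) + raw.get("other_flights", [])
    return [
        {
            "price": f.get("price"),
            "total_duration_min": f.get("total_duration"),
            "legs": [leg.get("flight_number") for leg in f.get("flights", [])],
            "link": link,
        }
        for f in flights[:limit]
    ]
```

The actual SerpApi call sits between these two functions; everything around it stays deterministic and easy to trace.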
The following tools are integrated. Each tool is mapped to a specific SerpApi endpoint.
- Google Flights (`engine=google_flights`) yields structured flight listings (price, duration, legs) and a canonical flights result URL.
- Google Hotels (`engine=google_hotels`) returns property metadata, ratings, and snippet links to booking surfaces.
- Google Local/Maps (`engine=google_local`) supplies POIs, ratings, types, and map links.
- Google organic (`engine=google`) supports general web lookups (IATA lookups, policy pages, travel advisories).
This approach to integration keeps token consumption predictable; an optional “deep scrape” tool can be added later for full-page context when necessary.
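The tool-to-engine mapping above can be captured in a small dispatch table; the tool names follow this post's conventions, while the `engine` values are the SerpApi engine identifiers listed above:

```python
import os

# Tool names (this post's conventions) -> SerpApi engine identifiers.
TOOL_ENGINES = {
    "search_flights": "google_flights",
    "search_hotels": "google_hotels",
    "search_places": "google_local",
    "search_web": "google",
}

def base_params(tool_name: str) -> dict:
    """Seed the SerpApi params for a tool; callers add tool-specific fields."""
    return {"engine": TOOL_ENGINES[tool_name],
            "api_key": os.environ.get("SERPAPI_API_KEY", "")}
```

Adding a new vertical then only requires a new schema and one more table entry.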
Inference loop
The inference loop is structured in the following sequence:
- Append system prompt and user message to the conversation.
- Request a model completion with the function/tool schema.
- If the model returns a tool_calls assistant message, append it and execute those calls:
  - Validate dependent calls (for example, IATA resolution) before invoking dependent tools.
  - Run non-dependent calls concurrently; append results as `tool` messages associated with their `tool_call_id`.
- Reinvoke the model to synthesize a final answer using the returned snippets.
- If the model requests further tool calls, repeat; otherwise return the final synthesis.
Preserving the ordering and identifiers between tool calls and results is essential for accurate citations and for saving a reproducible JSON trace.
Behavior, defaults and error handling
- Assumptions must be surfaced. Any default assumptions used to produce results are shown in the final answer (for example: “assumed economy, 2 adults, flexible ±3 days”). If the missing detail would materially change results (passenger mix, max price), the agent asks a concise clarifying question rather than guessing.
- Cache small, stable lookups. IATA lookups, frequent POIs, and static data should be cached locally to reduce API usage and speed repetitive queries.
- Quota and rate limiting. Batched concurrent retrievals reduce latency but increase burst usage. Implement simple rate limiting or token bucket strategies around SerpApi calls to avoid quota issues.
- Graceful tool errors. If a tool returns an error (invalid arguments, rate limit), the host returns a short structured error snippet to the model and allows the model to either retry with corrected arguments or ask the user for clarification.
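The caching rule can be as simple as a timestamped dictionary. This sketch (helper names are illustrative) injects the clock so expiry is deterministic and testable:

```python
import time

_CACHE = {}
DEFAULT_TTL = 24 * 3600  # small, stable lookups rarely change within a day

def cached(key, fetch, ttl=DEFAULT_TTL, now=time.time):
    """Return the cached value for `key`, calling `fetch()` on a miss or expiry."""
    hit = _CACHE.get(key)
    t = now()
    if hit and t - hit[0] < ttl:
        return hit[1]
    value = fetch()
    _CACHE[key] = (t, value)
    return value
```

Wrapping IATA lookups in `cached` means repeated queries for the same city cost zero SerpApi searches within the TTL window.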
Example flow
User query: Weekend in Rome mid-May, leaving Lisbon.
Planner (model) emits a small batch of tool calls:
- `search_web("Lisbon IATA code")`
- `search_web("Rome IATA code")`
- `search_hotels(destination="Rome", check_in="2026-05-15", check_out="2026-05-17")` (candidate window)
- `search_places(query="Colosseum Rome", location="Rome", limit=5)`
- several `search_flights` calls for 2–3 Friday→Sunday windows once IATA codes are resolved
Executor runs IATA lookups first, then runs hotels and places concurrently while issuing the validated flight calls. Results are returned as compact snippets. Synthesizer writes a short itinerary with three flight options, two hotel picks, and local highlights, each with footnote links to the SerpApi results. The response explicitly notes assumed defaults and suggests refinement steps.
CLI and programmatic usage
A small CLI wrapper supports two modes:
- Interactive chat for iterative discovery and follow-up refinement.
- One-shot query for quick plans, with an option to persist the JSON trace (`--outfile`) for auditing.
Typical commands:
```
# interactive
python travel_planning_agent.py

# one-shot, save trace
python travel_planning_agent.py -q "Luxury honeymoon Bali December" -o trace.json
```
`--debug` exposes intermediate tool calls and SerpApi responses for prompt tuning and debugging.
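A minimal argument parser matching these modes might look like the following; the flag names mirror the commands above, and the interactive/one-shot dispatch itself is omitted:

```python
import argparse

def build_parser():
    """CLI flags for the travel agent: interactive by default, one-shot with -q."""
    p = argparse.ArgumentParser(description="AI travel planning agent")
    p.add_argument("-q", "--query",
                   help="one-shot query; omit for interactive chat")
    p.add_argument("-o", "--outfile",
                   help="persist the full JSON trace for auditing")
    p.add_argument("--debug", action="store_true",
                   help="print intermediate tool calls and SerpApi responses")
    return p
```

Keeping the parser in its own function makes the CLI surface easy to test and to extend with future flags.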
Limitations and future extension points
- Booking flow: moving from planning to booking requires partner APIs and additional compliance considerations (payment, PII handling).
- Richer passenger modeling: support for infants, children, special assistance and multi-party trips increases complexity in passenger constraints and pricing models.
- Personalization: persist user preferences (airlines, seat class, hotel loyalty numbers) to bias search and filter results.
- Explainability UI: render the JSON trace with click-through snippets so end users or auditors can inspect every tool call and its returned evidence.
- Broader data sources and destinations: a truly capable trip planner needs more data sources. Integrating cruise lines, private transfers, and travel perk / discount discovery, improving coverage of non-default travel styles (road trips, theme parks, national parks), and covering adjacent options such as travel insurance, upgrades, and group travel are all necessary to create a high-quality getaway planner.
Conclusion
The research-agent pattern (plan → execute → synthesize) adapts naturally to travel planning when combined with vertical search APIs. Encoding a small set of operational rules (IATA verification, date logic, batching) and returning compact, cited snippets produces reliable, actionable itineraries with an auditable trace. Batched planning plus concurrent execution reduces latency and improves coverage, while proper validation and caching reduce failures and API costs. The resulting travel planner agent provides a practical foundation for booking integrations, personalization, and richer optimization in future work.