Tool use, function calling, multi-step planning
An LLM that only writes text is a glorified autocomplete. An LLM that can call tools, read structured data back, and decide what to do next is an agent — and the gap between those two things is where every interesting hospitality AI workflow lives. The terminology has settled in the past 18 months: "tool use" or "function calling" is the mechanism, "multi-step planning" is what you build on top of it.
What function calling actually is
You give the model a list of functions it can call, each with a name, a description, and a typed schema for its inputs (some APIs also let you declare the shape of the output). The model decides, based on the user prompt, which function to call and with what arguments. You execute the function in your own code, return the result, and the model uses that result to decide the next step. The function spec on OpenAI, Anthropic, and Google Gemini has converged on a similar JSON-schema shape; the Model Context Protocol (MCP), released by Anthropic in November 2024, is the cross-vendor wrapper that many serious deployments are moving to.
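To make the shape concrete, here is a minimal sketch of one tool definition and a check on the arguments a model proposes for it. The key names ("name", "description", "parameters") and the validate_arguments helper are illustrative assumptions, not any vendor's exact API; each provider uses slightly different keys, so check its documentation before copying this.

```python
import json

# Hypothetical tool definition in the JSON-schema style described above.
# The key names are illustrative and differ slightly between vendors.
LOOKUP_RESERVATION_TOOL = {
    "name": "lookup_reservation",
    "description": "Fetch a reservation by its confirmation code.",
    "parameters": {
        "type": "object",
        "properties": {
            "confirmation_code": {"type": "string"},
        },
        "required": ["confirmation_code"],
    },
}

# Maps JSON-schema type names to the Python types we accept for them.
PYTHON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_arguments(tool: dict, arguments: dict) -> list:
    """Return a list of problems with model-proposed arguments (empty if OK)."""
    schema = tool["parameters"]
    problems = []
    for name in schema.get("required", []):
        if name not in arguments:
            problems.append(f"missing required argument: {name}")
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unexpected argument: {name}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"{name} should be of type {spec['type']}")
    return problems


# Some APIs deliver arguments as a JSON string, others as a parsed object.
# This example assumes the string form and parses it first.
model_call = {
    "name": "lookup_reservation",
    "arguments": json.dumps({"confirmation_code": "ABC123"}),
}
arguments = json.loads(model_call["arguments"])
print(validate_arguments(LOOKUP_RESERVATION_TOOL, arguments))
```

Validating before you execute matters because the model's arguments are a proposal, not a guarantee. Returning the list of problems to the model as the tool result usually lets it correct itself on the next step.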
Concretely, a reservation-modification agent at a 180-room city hotel gets four functions: lookup_reservation(confirmation_code), check_availability(date_range, room_type), modify_reservation(reservation_id, changes), and send_confirmation(reservation_id, channel). The model receives a guest message in any language and decides which functions to call in what order; your code executes each call, and together they complete a flow that used to take a reservations agent four minutes.
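The four functions above could be wired up roughly as follows. This is a sketch under loud assumptions: the in-memory RESERVATIONS dictionary stands in for a real PMS, the return shapes are invented for illustration, and the call sequence at the bottom is hard-coded where a real agent would receive each call from the model.

```python
from datetime import date

# In-memory stand-in for the PMS. All data and return shapes are hypothetical.
RESERVATIONS = {
    "ABC123": {
        "reservation_id": 1,
        "room_type": "double",
        "check_in": date(2025, 3, 10),
        "check_out": date(2025, 3, 12),
    },
}
SENT = []  # records confirmations instead of actually sending anything


def lookup_reservation(confirmation_code):
    return RESERVATIONS.get(confirmation_code)


def check_availability(date_range, room_type):
    # Stub: pretend every room type except "suite" is free on any dates.
    return room_type != "suite"


def modify_reservation(reservation_id, changes):
    for reservation in RESERVATIONS.values():
        if reservation["reservation_id"] == reservation_id:
            reservation.update(changes)
            return reservation
    raise KeyError(f"no reservation with id {reservation_id}")


def send_confirmation(reservation_id, channel):
    SENT.append((reservation_id, channel))
    return True


# Registry mapping the names the model sees to the code you run.
TOOLS = {
    f.__name__: f
    for f in (lookup_reservation, check_availability, modify_reservation, send_confirmation)
}


def dispatch(name, arguments):
    """Run one model-requested call; unknown names are rejected, not guessed."""
    if name not in TOOLS:
        raise ValueError(f"model requested unknown tool: {name}")
    return TOOLS[name](**arguments)


# Hard-coded sequence standing in for the calls a model would request.
reservation = dispatch("lookup_reservation", {"confirmation_code": "ABC123"})
new_dates = (date(2025, 3, 11), date(2025, 3, 13))
if dispatch("check_availability", {"date_range": new_dates, "room_type": reservation["room_type"]}):
    dispatch(
        "modify_reservation",
        {
            "reservation_id": reservation["reservation_id"],
            "changes": {"check_in": new_dates[0], "check_out": new_dates[1]},
        },
    )
    dispatch("send_confirmation", {"reservation_id": reservation["reservation_id"], "channel": "email"})
```

The registry is the important design choice: the model can only ever trigger functions you have explicitly listed, which keeps the blast radius of a bad decision bounded.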
Multi-step planning, in practice
A single function call is not an agent. A chain of three or four calls, where the output of one decides the input of the next, is. Most useful hospitality agents are 3-7 steps: parse intent, look up state, check constraints, execute action, log result, notify human. The hard part is not the individual calls — it is the planning layer that decides which calls to make in which order and what to do when one fails.
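A minimal sketch of such a chain, where each step receives the state left by the previous one and the run stops and escalates at the first failure. Everything here is hypothetical: the step bodies are stubs, the "non_refundable" rule is invented for the example, and the step order is fixed in code, whereas in a real agent the model chooses it.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    status: str  # "done" or "needs_human"
    log: list = field(default_factory=list)


def run_chain(steps, state):
    """Run steps in order; each step receives the state left by the previous one."""
    result = RunResult(status="done")
    for step in steps:
        try:
            state = step(state)
            result.log.append((step.__name__, "ok"))
        except Exception as exc:
            # Stop at the first failure and hand over with the log so far.
            result.log.append((step.__name__, f"failed: {exc}"))
            result.status = "needs_human"
            break
    return result, state


def parse_intent(state):
    # Stub: a real agent would ask the model to extract this from the message.
    return {**state, "intent": "change_dates"}


def look_up_state(state):
    # Stub: a real agent would query the PMS here.
    return {**state, "reservation": {"id": 1, "rate_plan": "non_refundable"}}


def check_constraints(state):
    if state["reservation"]["rate_plan"] == "non_refundable":
        raise ValueError("rate plan does not allow date changes")
    return state


def execute_action(state):
    return {**state, "modified": True}


result, final_state = run_chain(
    [parse_intent, look_up_state, check_constraints, execute_action],
    {"message": "Can I move my stay by one day?"},
)
print(result.status)
for entry in result.log:
    print(entry)
```

In this run the constraint check fails, so execute_action never runs and the result is marked for a human. The log shows exactly which step stopped the chain and why.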
Two patterns dominate. The first is "ReAct": the model reasons in plain text about what to do next, then calls a tool, then reasons again. Latency is higher (4-12 seconds per step), but the reasoning is auditable. The second is "structured planning": the model produces a plan upfront as a JSON object, then your code executes each step. This is faster, but it is harder to recover when a step fails. For hospitality, ReAct wins almost every time because the audit trail matters and the latency budget is forgiving (a guest waiting 20 seconds for a reservation modification to go through is fine; a guest waiting 20 seconds for every reply in a chat conversation is not).
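The ReAct pattern reduces to a short loop: ask the model for a thought and an action, run the action, append the observation, repeat. The sketch below replaces the LLM with a scripted stand-in so it runs without any API; scripted_model, FAKE_TOOLS, and the transcript format are all assumptions made for illustration.

```python
def scripted_model(transcript):
    """Stand-in for an LLM call: returns the next (thought, action, arguments).

    Hypothetical: a real model would read the transcript and decide. This one
    replays a fixed script keyed on how many observations it has seen so far.
    """
    script = [
        ("I need the reservation first.", "lookup_reservation", {"confirmation_code": "ABC123"}),
        ("Check the new dates are free.", "check_availability", {"room_type": "double"}),
        ("Everything checks out; finish.", "finish", {"answer": "Dates can be changed."}),
    ]
    observations = sum(1 for entry in transcript if entry[0] == "observation")
    return script[observations]


# Stub tools with canned results, standing in for real PMS calls.
FAKE_TOOLS = {
    "lookup_reservation": lambda confirmation_code: {"reservation_id": 1, "room_type": "double"},
    "check_availability": lambda room_type: True,
}


def react_loop(model, tools, user_message, max_steps=7):
    transcript = [("user", user_message)]
    for _ in range(max_steps):
        thought, action, arguments = model(transcript)
        transcript.append(("thought", thought))  # the auditable part
        if action == "finish":
            return arguments["answer"], transcript
        observation = tools[action](**arguments)
        transcript.append(("action", action, arguments))
        transcript.append(("observation", observation))
    return None, transcript  # step budget exhausted; escalate to a human


answer, transcript = react_loop(scripted_model, FAKE_TOOLS, "Can I move my stay by a day?")
print(answer)
for entry in transcript:
    print(entry)
```

The transcript is the audit trail: every thought sits next to the action it led to and the observation that came back. The max_steps cap is a cheap guard against a model that loops without finishing.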
What this costs
Cost-per-agent-run for a typical 5-step hospitality agent on Claude Sonnet or GPT-4o is around €0.02 to €0.08, depending on context length. At 600 runs per month across a 180-room property, that is €12 to €48 per month in model costs. The saving from one fewer FTE on routine reservation modifications is €2,400 per month. The economics are obvious; the engineering is the work.
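The arithmetic is worth doing once with your own numbers. This short sketch uses only the figures quoted above; swap in your own run volume and per-run cost.

```python
COST_PER_RUN_EUR = (0.02, 0.08)  # low and high estimate from the text
RUNS_PER_MONTH = 600
FTE_SAVING_EUR = 2400  # monthly figure quoted in the text

# Round to cents to avoid floating-point noise in the products.
low, high = (round(cost * RUNS_PER_MONTH, 2) for cost in COST_PER_RUN_EUR)

print(f"Model cost: EUR {low:.0f} to {high:.0f} per month")
print(f"Share of stated saving: {low / FTE_SAVING_EUR:.1%} to {high / FTE_SAVING_EUR:.1%}")
```

On these figures the model bill is a low single-digit percentage of the stated saving, so the per-run price is rarely the deciding factor.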
What this is not
It is not "ChatGPT for hotels." It is not a chatbot that hands off to a human when it gets stuck. It is software that takes action against your PMS, channel manager, and email system on behalf of the property — with logs, rollback, and human review for the cases that need it. Every word in the rest of this course assumes you understand that distinction. If you are still thinking about agents as fancy chatbots, you will misjudge both the value and the risk.
I shipped my first reservation-modification agent at a chain in Warsaw in 2023. It handled 40% of inbound modifications within four weeks. The other 60% kept failing in interesting ways — and that is where this course actually starts.