OpenAPI to Function Calling: A Comprehensive Guide #
Imagine a world where machines can talk to each other seamlessly, sharing data and insights like old friends swapping stories. Sounds like science fiction, right? Well, we’re about to make it a reality!
In this blog post, we’re going to take you on a thrilling adventure that combines two revolutionary concepts: OpenAPI specifications and function calling in language models. Buckle up, because we’re about to show you how these two game-changers can come together to create systems that are not only intelligent but also flexible and adaptable.
So, what are you waiting for? Let’s dive into the world of APIs, AI, and the future of intelligent systems!
Understanding OpenAPI Specifications #
OpenAPI (formerly known as Swagger) is a standard way to describe RESTful APIs. It’s like a blueprint for an API, telling developers what the API can do, what data it expects, and what it will return.
Key components of an OpenAPI spec include:
- Paths: The different endpoints of your API
- Methods: What actions can be performed (GET, POST, etc.)
- Parameters: What data the API expects
- Responses: What the API will return
Here’s a simple example of an OpenAPI specification:
```yaml
openapi: 3.0.0
info:
  title: Sample API
  version: 1.0.0
paths:
  /users:
    get:
      summary: Get all users
      responses:
        '200':
          description: Successful response
    post:
      summary: Create a user
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                  description: User's name
                email:
                  type: string
                  description: User's email
              required:
                - name
                - email
      responses:
        '201':
          description: User created
```
Function Calling in Language Models #
Function calling is a feature in advanced language models like GPT-4. However, it’s important to understand what this feature actually does and what it doesn’t do.
What Function Calling Is #
Function calling in language models is essentially a way to generate structured outputs. When we say the model can “call functions,” what we really mean is:
- The model is given information about one or more functions, including their names, parameters, and descriptions.
- When prompted, the model can generate an output that specifies:
- Which function should be called
- What arguments should be passed to that function
This output is formatted in a specific, structured way that makes it easy for a program to parse and use.
What Function Calling Isn’t #
It’s crucial to understand that the language model does not actually execute any functions. The model merely suggests a function call and its arguments. The actual execution of the function must be done by the application using the model’s output.
How It Works #
Here’s a step-by-step breakdown of how function calling works:
1. Function Definition: You define one or more functions that the model can “call”. These definitions include the function name, parameters, and a description of what the function does.
2. Model Input: When you send a prompt to the model, you also send these function definitions.
3. Model Output: The model analyzes the prompt and the function definitions. If it determines that calling a function would be appropriate, it generates an output specifying which function to call and with what arguments.
4. Parsing the Output: Your application receives this structured output and parses it to determine which function the model suggested and what arguments it proposed.
5. Actual Execution: If appropriate, your application then actually calls the real function with the suggested arguments.
Example #
Let’s say we define a function like this:
```json
{
  "name": "get_weather",
  "description": "Get the current weather in a given location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g. San Francisco, CA"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"]
      }
    },
    "required": ["location"]
  }
}
```
If we then ask the model “What’s the weather like in New York?”, it might generate an output like this:
```json
{
  "function": "get_weather",
  "arguments": {
    "location": "New York, NY",
    "unit": "fahrenheit"
  }
}
```
This output doesn’t actually get the weather - it just suggests calling a function that would get the weather. Your application would then need to have an actual get_weather
function that it calls with these arguments to retrieve the real weather data.
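To make that hand-off concrete, here’s a minimal sketch of the application side. The `get_weather` implementation is hypothetical (a real one would call a weather service), and the `suggestion` dict mirrors the JSON above; actual provider responses differ slightly in shape:

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Hypothetical stand-in: a real implementation would call a weather API.
    return {"location": location, "unit": unit, "temperature": 72}

# Registry of real callables the application is willing to execute
AVAILABLE_FUNCTIONS = {"get_weather": get_weather}

def execute_suggestion(model_output: dict):
    """Parse the model's suggested call and run the matching real function."""
    name = model_output["function"]
    arguments = model_output["arguments"]
    if name not in AVAILABLE_FUNCTIONS:
        raise ValueError(f"Model suggested an unknown function: {name}")
    return AVAILABLE_FUNCTIONS[name](**arguments)

suggestion = {
    "function": "get_weather",
    "arguments": {"location": "New York, NY", "unit": "fahrenheit"},
}
print(execute_suggestion(suggestion))
# {'location': 'New York, NY', 'unit': 'fahrenheit', 'temperature': 72}
```

The registry is the safety boundary: the model can only ever suggest names, and the application decides which of those names map to real code.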
Why It’s Useful #
Despite not actually executing functions, this feature is extremely powerful because it allows the model to:
- Understand and respond to user intents in a structured way
- Interface with external systems and APIs through your application
- Provide consistent, parseable outputs for complex queries
By combining this with OpenAPI specifications, we can create systems where AI models can effectively “use” APIs based on their documentation, opening up a world of possibilities for AI-driven automation and interaction.
The Magic: Converting OpenAPI to Functions #
Now, here’s where it gets exciting. We can take an OpenAPI specification and convert it into a format that a language model can understand for function calling. This means the AI can effectively “use” the API!
Core Idea #
The idea is simple: each path-and-method pair in the spec becomes one function definition. The operationId (or, failing that, a name derived from the method and path) becomes the function name, the summary becomes the description, and the endpoint’s parameters and request body become a JSON Schema describing the function’s arguments.
Basic Implementation #
Let’s break down the logic:
- Create an `OpenAPIToFunctionConverter` class: This class takes an OpenAPI spec and converts it to a list of functions the AI can understand.
```python
import yaml
from typing import Any, Dict, List


class OpenAPIToFunctionConverter:
    def __init__(self, spec: Dict[str, Any]):
        self.spec = spec

    def convert_to_functions(self) -> List[Dict[str, Any]]:
        """Turn every path/method pair in the spec into a function definition."""
        functions = []
        paths = self.spec.get('paths', {})
        for path, methods in paths.items():
            for method, details in methods.items():
                function = self._create_function(path, method, details)
                functions.append(function)
        return functions

    def _create_function(self, path: str, method: str, details: Dict[str, Any]) -> Dict[str, Any]:
        function = {
            "name": details.get('operationId') or f"{method}_{self._path_to_name(path)}",
            "description": details.get('summary', details.get('description', 'No description provided')),
            "parameters": {
                "type": "object",
                "properties": {},
                "required": []
            }
        }

        # Handle path parameters
        path_params = [p for p in details.get('parameters', []) if p['in'] == 'path']
        for param in path_params:
            self._add_parameter(function, param)

        # Handle query parameters
        query_params = [p for p in details.get('parameters', []) if p['in'] == 'query']
        for param in query_params:
            self._add_parameter(function, param)

        # Handle request body: flatten top-level object properties into parameters
        if 'requestBody' in details:
            content = details['requestBody'].get('content', {})
            schema = next(iter(content.values())).get('schema', {}) if content else {}
            if schema.get('type') == 'object':
                for prop_name, prop_details in schema.get('properties', {}).items():
                    self._add_parameter(function, {
                        'name': prop_name,
                        'schema': prop_details,
                        'required': prop_name in schema.get('required', [])
                    })

        return function

    def _handle_schema(self, schema: Dict[str, Any]) -> Dict[str, Any]:
        if 'allOf' in schema:
            # Combine all subschemas
            return self._combine_schemas(schema['allOf'])
        elif 'anyOf' in schema or 'oneOf' in schema:
            # Use the first schema as an example
            return self._handle_schema(schema.get('anyOf', schema.get('oneOf'))[0])
        else:
            return schema

    def _combine_schemas(self, schemas: List[Dict[str, Any]]) -> Dict[str, Any]:
        combined = {}
        for schema in schemas:
            combined.update(self._handle_schema(schema))
        return combined

    def _add_parameter(self, function: Dict[str, Any], param: Dict[str, Any]):
        param_schema = self._handle_schema(param.get('schema', {}))
        function["parameters"]["properties"][param['name']] = {
            "type": param_schema.get('type', 'string'),
            # Prefer the parameter-level description; fall back to the schema-level
            # one, which is where request-body properties carry theirs
            "description": param.get('description') or param_schema.get('description', 'No description provided')
        }
        if param_schema.get('enum'):
            function["parameters"]["properties"][param['name']]["enum"] = param_schema['enum']
        if param.get('required', False):
            function["parameters"]["required"].append(param['name'])

    @staticmethod
    def _path_to_name(path: str) -> str:
        return path.replace('/', '_').strip('_').replace('{', '').replace('}', '')

    def get_functions(self) -> List[Dict[str, Any]]:
        return self.convert_to_functions()


# Usage
with open("api_spec.yaml") as f:
    raw_openapi_spec = yaml.safe_load(f)

converter = OpenAPIToFunctionConverter(raw_openapi_spec)
functions = converter.get_functions()
print(functions)
```
Output:
```json
[
  {
    "name": "get_users",
    "description": "Get all users",
    "parameters": {
      "type": "object",
      "properties": {},
      "required": []
    }
  },
  {
    "name": "post_users",
    "description": "Create a user",
    "parameters": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string",
          "description": "User's name"
        },
        "email": {
          "type": "string",
          "description": "User's email"
        }
      },
      "required": ["name", "email"]
    }
  }
]
```
- Create an `LLMFunctionCaller` class: This class uses the OpenAI API to call the language model with the converted functions.
```python
import os
from typing import Any, Dict, List

import openai
from dotenv import load_dotenv

load_dotenv()


class LLMFunctionCaller:
    def __init__(self, api_key: str, model: str = "gpt-4o-mini"):
        openai.api_key = api_key
        self.model = model

    def call_llm_with_functions(self, prompt: str, functions: List[Dict[str, Any]]):
        try:
            response = openai.chat.completions.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
                functions=functions,
                function_call="auto"
            )
            return response.choices[0].message
        except Exception as e:
            print(f"An error occurred: {e}")
            return None


# Usage
llm_caller = LLMFunctionCaller(os.getenv('OPENAI_API_KEY'))
prompt = "Get information about all users"
response = llm_caller.call_llm_with_functions(prompt, functions)
print("LLM Response:")
print(response)
```
Output:
```
LLM Response:
{
  "function_call": {
    "name": "get_users",
    "arguments": "{}"
  },
  "content": null
}
```
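The response above is still only a suggestion; the application has to turn it into a real HTTP request. Here’s a minimal sketch of that last step, assuming the API described by our spec is running at a hypothetical http://localhost:8000 and that function names follow the `method_path` convention our converter uses:

```python
import json
import requests  # assumed to be installed: pip install requests

def build_lookup(spec: dict) -> dict:
    """Map generated function names back to their (method, path) pairs."""
    lookup = {}
    for path, methods in spec.get("paths", {}).items():
        for method, details in methods.items():
            name = details.get("operationId") or f"{method}_{OpenAPIToFunctionConverter._path_to_name(path)}"
            lookup[name] = (method, path)
    return lookup

def execute_call(message, spec: dict, base_url: str = "http://localhost:8000"):
    """Execute the function call the LLM suggested against the live API."""
    call = message.function_call
    method, path = build_lookup(spec)[call.name]
    args = json.loads(call.arguments or "{}")
    if method == "get":
        # GET arguments travel as a query string
        return requests.get(base_url + path, params=args).json()
    # Everything else sends the arguments as a JSON body
    return requests.request(method, base_url + path, json=args).json()

# Usage: `response` is the message returned by call_llm_with_functions above
if response and response.function_call:
    print(execute_call(response, raw_openapi_spec))
```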
Implementation using Langchain #
Langchain simplifies this process:
```python
import yaml
from dotenv import load_dotenv
from langchain.agents.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec
from langchain_community.utilities import RequestsWrapper
from langchain_openai import ChatOpenAI

load_dotenv()

ALLOW_DANGEROUS_REQUEST = True

# Load the OpenAPI spec
with open("api_spec.yaml") as f:
    raw_openapi_spec = yaml.safe_load(f)

# Reduce the OpenAPI spec dict to the parts the agent needs
reduced_spec = reduce_openapi_spec(raw_openapi_spec)

# The model that plans and executes the API calls
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Create the agent
agent = planner.create_openapi_agent(
    reduced_spec,
    RequestsWrapper(headers={}),
    llm,
    allow_dangerous_requests=ALLOW_DANGEROUS_REQUEST,
)

# Use the agent
agent.invoke("Get information about all users")
```
Output:
```
> Entering new AgentExecutor chain...
I need to use the GET /users endpoint to retrieve information about all users.
Action: get_users
Action Input: {}

> Entering new RequestsChain chain...
Entering GET /users with:
headers: {}
params: {}
data: {}
Response: {"status": "success", "data": [{"id": 1, "name": "John Doe", "email": "john@example.com"}, {"id": 2, "name": "Jane Smith", "email": "jane@example.com"}]}
> Finished chain.

The GET /users endpoint returned information about all users. Here's a summary of the data:

1. User ID: 1
   Name: John Doe
   Email: john@example.com

2. User ID: 2
   Name: Jane Smith
   Email: jane@example.com

Is there anything specific you'd like to know about these users?
> Finished chain.
```
Practical Use Cases #
This OpenAPI to function calling approach has numerous applications:
- Chatbots for API Interaction: Create chatbots that can interact with complex APIs using natural language.
- Automated Testing: Generate test cases for APIs based on their specifications (a small sketch follows this list).
- API Exploration: Allow developers to explore and understand APIs through natural language queries.
- Workflow Automation: Create complex workflows that interact with multiple APIs using AI-driven decision-making.
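To give a flavor of the automated-testing idea, here is a minimal sketch that reuses `functions` and `llm_caller` from the earlier examples to check that each endpoint’s description routes to the right function. The prompt wording and assertions are illustrative only, not a full test framework:

```python
# Minimal smoke test: each endpoint description should route to its own function.
# Reuses `functions` and `llm_caller` from the earlier examples.
for function in functions:
    prompt = f"Perform this operation: {function['description']}"
    message = llm_caller.call_llm_with_functions(prompt, functions)
    assert message is not None and message.function_call is not None, (
        f"No function call suggested for {function['name']}"
    )
    assert message.function_call.name == function["name"], (
        f"Expected {function['name']}, got {message.function_call.name}"
    )
    print(f"OK: {function['name']}")
```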
Conclusion #
The combination of OpenAPI specifications and function calling in language models opens up exciting possibilities for creating more intelligent and user-friendly systems. Whether you’re building a chatbot, automating workflows, or exploring APIs, this approach can significantly simplify your development process and enhance the capabilities of your applications.
Start exploring the powerful world of AI-driven API interactions!