OpenAPI to Function Calling: A Comprehensive Guide #
Imagine a world where machines can talk to each other seamlessly, sharing data and insights like old friends swapping stories. Sounds like science fiction, right? Well, we’re about to make it a reality!
In this blog post, we’re going to take you on a thrilling adventure that combines two revolutionary concepts: OpenAPI specifications and function calling in language models. Buckle up, because we’re about to show you how these two game-changers can come together to create systems that are not only intelligent but also flexible and adaptable.
So, what are you waiting for? Let’s dive into the world of APIs, AI, and the future of intelligent systems!
Understanding OpenAPI Specifications #
OpenAPI (formerly known as Swagger) is a standard way to describe RESTful APIs. It’s like a blueprint for an API, telling developers what the API can do, what data it expects, and what it will return.
Key components of an OpenAPI spec include:
- Paths: The different endpoints of your API
- Methods: What actions can be performed (GET, POST, etc.)
- Parameters: What data the API expects
- Responses: What the API will return
Here’s a simple example of an OpenAPI specification:
```yaml
openapi: 3.0.0
info:
  title: Sample API
  version: 1.0.0
paths:
  /users:
    get:
      summary: Get all users
      responses:
        '200':
          description: Successful response
    post:
      summary: Create a user
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                name:
                  type: string
                  description: User's name
                email:
                  type: string
                  description: User's email
              required:
                - name
                - email
      responses:
        '201':
          description: User created
```
Function Calling in Language Models #
Function calling is a feature in advanced language models like GPT-4. However, it’s important to understand what this feature actually does and what it doesn’t do.
What Function Calling Is #
Function calling in language models is essentially a way to generate structured outputs. When we say the model can “call functions,” what we really mean is:
- The model is given information about one or more functions, including their names, parameters, and descriptions.
- When prompted, the model can generate an output that specifies:
- Which function should be called
- What arguments should be passed to that function
This output is formatted in a specific, structured way that makes it easy for a program to parse and use.
What Function Calling Isn’t #
It’s crucial to understand that the language model does not actually execute any functions. The model merely suggests a function call and its arguments. The actual execution of the function must be done by the application using the model’s output.
How It Works #
Here’s a step-by-step breakdown of how function calling works:
1. Function Definition: You define one or more functions that the model can “call”. These definitions include the function name, parameters, and a description of what the function does.
2. Model Input: When you send a prompt to the model, you also send these function definitions.
3. Model Output: The model analyzes the prompt and the function definitions. If it determines that calling a function would be appropriate, it generates an output specifying which function to call and with what arguments.
4. Parsing the Output: Your application receives this structured output and parses it to determine which function the model suggested and what arguments it proposed.
5. Actual Execution: If appropriate, your application then actually calls the real function with the suggested arguments.
Example #
Let’s say we define a function like this:
```json
{
  "name": "get_weather",
  "description": "Get the current weather in a given location",
  "parameters": {
    "type": "object",
    "properties": {
      "location": {
        "type": "string",
        "description": "The city and state, e.g. San Francisco, CA"
      },
      "unit": {
        "type": "string",
        "enum": ["celsius", "fahrenheit"]
      }
    },
    "required": ["location"]
  }
}
```
If we then ask the model “What’s the weather like in New York?”, it might generate an output like this:
```json
{
  "function": "get_weather",
  "arguments": {
    "location": "New York, NY",
    "unit": "fahrenheit"
  }
}
```
This output doesn’t actually get the weather - it just suggests calling a function that would get the weather. Your application would then need to have an actual get_weather
function that it calls with these arguments to retrieve the real weather data.
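To make that hand-off concrete, here’s a minimal sketch of the application side. The `get_weather` implementation is hypothetical (a real one would call a weather service), and the `suggestion` dict mirrors the JSON above; actual provider responses differ slightly in shape:

```python
def get_weather(location: str, unit: str = "celsius") -> dict:
    # Hypothetical stand-in: a real implementation would call a weather API.
    return {"location": location, "unit": unit, "temperature": 72}

# Registry of real callables the application is willing to execute
AVAILABLE_FUNCTIONS = {"get_weather": get_weather}

def execute_suggestion(model_output: dict):
    """Parse the model's suggested call and run the matching real function."""
    name = model_output["function"]
    arguments = model_output["arguments"]
    if name not in AVAILABLE_FUNCTIONS:
        raise ValueError(f"Model suggested an unknown function: {name}")
    return AVAILABLE_FUNCTIONS[name](**arguments)

suggestion = {
    "function": "get_weather",
    "arguments": {"location": "New York, NY", "unit": "fahrenheit"},
}
print(execute_suggestion(suggestion))
# {'location': 'New York, NY', 'unit': 'fahrenheit', 'temperature': 72}
```

The registry is the safety boundary: the model can only ever suggest names, and the application decides which of those names map to real code.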
Why It’s Useful #
Despite not actually executing functions, this feature is extremely powerful because it allows the model to:
- Understand and respond to user intents in a structured way
- Interface with external systems and APIs through your application
- Provide consistent, parseable outputs for complex queries
By combining this with OpenAPI specifications, we can create systems where AI models can effectively “use” APIs based on their documentation, opening up a world of possibilities for AI-driven automation and interaction.
The Magic: Converting OpenAPI to Functions #
Now, here’s where it gets exciting. We can take an OpenAPI specification and convert it into a format that a language model can understand for function calling. This means the AI can effectively “use” the API!
Core Idea #
The idea is simple: each path-and-method pair in the spec becomes one function definition. The operationId (or, failing that, a name derived from the method and path) becomes the function name, the summary becomes the description, and the endpoint’s parameters and request body become a JSON Schema describing the function’s arguments.
Basic Implementation #
Let’s break down the logic:
- Create an `OpenAPIToFunctionConverter` class: This class takes an OpenAPI spec and converts it to a list of functions the AI can understand.
```python
import yaml
from typing import Any, Dict, List


class OpenAPIToFunctionConverter:
    def __init__(self, spec: Dict[str, Any]):
        self.spec = spec

    def convert_to_functions(self) -> List[Dict[str, Any]]:
        """Turn every path/method pair in the spec into a function definition."""
        functions = []
        paths = self.spec.get('paths', {})
        for path, methods in paths.items():
            for method, details in methods.items():
                function = self._create_function(path, method, details)
                functions.append(function)
        return functions

    def _create_function(self, path: str, method: str, details: Dict[str, Any]) -> Dict[str, Any]:
        function = {
            "name": details.get('operationId') or f"{method}_{self._path_to_name(path)}",
            "description": details.get('summary', details.get('description', 'No description provided')),
            "parameters": {
                "type": "object",
                "properties": {},
                "required": []
            }
        }

        # Handle path parameters
        path_params = [p for p in details.get('parameters', []) if p['in'] == 'path']
        for param in path_params:
            self._add_parameter(function, param)

        # Handle query parameters
        query_params = [p for p in details.get('parameters', []) if p['in'] == 'query']
        for param in query_params:
            self._add_parameter(function, param)

        # Handle request body: flatten top-level object properties into parameters
        if 'requestBody' in details:
            content = details['requestBody'].get('content', {})
            schema = next(iter(content.values())).get('schema', {}) if content else {}
            if schema.get('type') == 'object':
                for prop_name, prop_details in schema.get('properties', {}).items():
                    self._add_parameter(function, {
                        'name': prop_name,
                        'schema': prop_details,
                        'required': prop_name in schema.get('required', [])
                    })

        return function

    def _handle_schema(self, schema: Dict[str, Any]) -> Dict[str, Any]:
        if 'allOf' in schema:
            # Combine all subschemas
            return self._combine_schemas(schema['allOf'])
        elif 'anyOf' in schema or 'oneOf' in schema:
            # Use the first schema as an example
            return self._handle_schema(schema.get('anyOf', schema.get('oneOf'))[0])
        else:
            return schema

    def _combine_schemas(self, schemas: List[Dict[str, Any]]) -> Dict[str, Any]:
        combined = {}
        for schema in schemas:
            combined.update(self._handle_schema(schema))
        return combined

    def _add_parameter(self, function: Dict[str, Any], param: Dict[str, Any]):
        param_schema = self._handle_schema(param.get('schema', {}))
        function["parameters"]["properties"][param['name']] = {
            "type": param_schema.get('type', 'string'),
            # Prefer the parameter-level description; fall back to the schema-level
            # one, which is where request-body properties carry theirs
            "description": param.get('description') or param_schema.get('description', 'No description provided')
        }
        if param_schema.get('enum'):
            function["parameters"]["properties"][param['name']]["enum"] = param_schema['enum']
        if param.get('required', False):
            function["parameters"]["required"].append(param['name'])

    @staticmethod
    def _path_to_name(path: str) -> str:
        return path.replace('/', '_').strip('_').replace('{', '').replace('}', '')

    def get_functions(self) -> List[Dict[str, Any]]:
        return self.convert_to_functions()


# Usage
with open("api_spec.yaml") as f:
    raw_openapi_spec = yaml.safe_load(f)

converter = OpenAPIToFunctionConverter(raw_openapi_spec)
functions = converter.get_functions()
print(functions)
```
Output:
```json
[
  {
    "name": "get_users",
    "description": "Get all users",
    "parameters": {
      "type": "object",
      "properties": {},
      "required": []
    }
  },
  {
    "name": "post_users",
    "description": "Create a user",
    "parameters": {
      "type": "object",
      "properties": {
        "name": {
          "type": "string",
          "description": "User's name"
        },
        "email": {
          "type": "string",
          "description": "User's email"
        }
      },
      "required": ["name", "email"]
    }
  }
]
```
- Create an `LLMFunctionCaller` class: This class uses the OpenAI API to call the language model with the converted functions.
```python
import os
from typing import Any, Dict, List

import openai
from dotenv import load_dotenv

load_dotenv()


class LLMFunctionCaller:
    def __init__(self, api_key: str, model: str = "gpt-4o-mini"):
        openai.api_key = api_key
        self.model = model

    def call_llm_with_functions(self, prompt: str, functions: List[Dict[str, Any]]):
        try:
            response = openai.chat.completions.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
                functions=functions,
                function_call="auto"
            )
            return response.choices[0].message
        except Exception as e:
            print(f"An error occurred: {e}")
            return None


# Usage
llm_caller = LLMFunctionCaller(os.getenv('OPENAI_API_KEY'))
prompt = "Get information about all users"
response = llm_caller.call_llm_with_functions(prompt, functions)
print("LLM Response:")
print(response)
```
Output:
```
LLM Response:
{
  "function_call": {
    "name": "get_users",
    "arguments": "{}"
  },
  "content": null
}
```
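The response above is still only a suggestion; the application has to turn it into a real HTTP request. Here’s a minimal sketch of that last step, assuming the API described by our spec is running at a hypothetical http://localhost:8000 and that function names follow the `method_path` convention our converter uses:

```python
import json
import requests  # assumed to be installed: pip install requests

def build_lookup(spec: dict) -> dict:
    """Map generated function names back to their (method, path) pairs."""
    lookup = {}
    for path, methods in spec.get("paths", {}).items():
        for method, details in methods.items():
            name = details.get("operationId") or f"{method}_{OpenAPIToFunctionConverter._path_to_name(path)}"
            lookup[name] = (method, path)
    return lookup

def execute_call(message, spec: dict, base_url: str = "http://localhost:8000"):
    """Execute the function call the LLM suggested against the live API."""
    call = message.function_call
    method, path = build_lookup(spec)[call.name]
    args = json.loads(call.arguments or "{}")
    if method == "get":
        # GET arguments travel as a query string
        return requests.get(base_url + path, params=args).json()
    # Everything else sends the arguments as a JSON body
    return requests.request(method, base_url + path, json=args).json()

# Usage: `response` is the message returned by call_llm_with_functions above
if response and response.function_call:
    print(execute_call(response, raw_openapi_spec))
```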
Implementation using Langchain #
Langchain simplifies this process:
```python
import yaml
from dotenv import load_dotenv
from langchain.agents.agent_toolkits.openapi import planner
from langchain_community.agent_toolkits.openapi.spec import reduce_openapi_spec
from langchain_community.utilities import RequestsWrapper
from langchain_openai import ChatOpenAI

load_dotenv()

ALLOW_DANGEROUS_REQUEST = True

# Load the OpenAPI spec
with open("api_spec.yaml") as f:
    raw_openapi_spec = yaml.safe_load(f)

# Reduce the OpenAPI spec dict to the parts the agent needs
reduced_spec = reduce_openapi_spec(raw_openapi_spec)

# The model that plans and executes the API calls
llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Create the agent
agent = planner.create_openapi_agent(
    reduced_spec,
    RequestsWrapper(headers={}),
    llm,
    allow_dangerous_requests=ALLOW_DANGEROUS_REQUEST,
)

# Use the agent
agent.invoke("Get information about all users")
```
Output:
```
> Entering new AgentExecutor chain...
I need to use the GET /users endpoint to retrieve information about all users.
Action: get_users
Action Input: {}

> Entering new RequestsChain chain...
Entering GET /users with:
headers: {}
params: {}
data: {}
Response: {"status": "success", "data": [{"id": 1, "name": "John Doe", "email": "john@example.com"}, {"id": 2, "name": "Jane Smith", "email": "jane@example.com"}]}
> Finished chain.

The GET /users endpoint returned information about all users. Here's a summary of the data:

1. User ID: 1
   Name: John Doe
   Email: john@example.com

2. User ID: 2
   Name: Jane Smith
   Email: jane@example.com

Is there anything specific you'd like to know about these users?
> Finished chain.
```
Practical Use Cases #
This OpenAPI to function calling approach has numerous applications:
- Chatbots for API Interaction: Create chatbots that can interact with complex APIs using natural language.
- Automated Testing: Generate test cases for APIs based on their specifications (a small sketch follows this list).
- API Exploration: Allow developers to explore and understand APIs through natural language queries.
- Workflow Automation: Create complex workflows that interact with multiple APIs using AI-driven decision-making.
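To give a flavor of the automated-testing idea, here is a minimal sketch that reuses `functions` and `llm_caller` from the earlier examples to check that each endpoint’s description routes to the right function. The prompt wording and assertions are illustrative only, not a full test framework:

```python
# Minimal smoke test: each endpoint description should route to its own function.
# Reuses `functions` and `llm_caller` from the earlier examples.
for function in functions:
    prompt = f"Perform this operation: {function['description']}"
    message = llm_caller.call_llm_with_functions(prompt, functions)
    assert message is not None and message.function_call is not None, (
        f"No function call suggested for {function['name']}"
    )
    assert message.function_call.name == function["name"], (
        f"Expected {function['name']}, got {message.function_call.name}"
    )
    print(f"OK: {function['name']}")
```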
Conclusion #
The combination of OpenAPI specifications and function calling in language models opens up exciting possibilities for creating more intelligent and user-friendly systems. Whether you’re building a chatbot, automating workflows, or exploring APIs, this approach can significantly simplify your development process and enhance the capabilities of your applications.
Start exploring the powerful world of AI-driven API interactions!