Last updated September 04, 2025
Chat models generate conversational completions for input messages. This guide describes how to use the v1-chat-completions API with Python.
Prerequisites
Before making requests, provision access to the model of your choice.
- If it’s not already installed, install the Heroku CLI. Then install the Heroku AI plugin:

  heroku plugins:install @heroku/plugin-ai

- Attach a chat model to an app of yours:

  # If you don't have an app yet, you can create one with:
  heroku create $APP_NAME # specify the name you want for your app (or skip this step to use an existing app you have)

  # Create and attach one of our chat models to your app, $APP_NAME:
  heroku ai:models:create -a $APP_NAME claude-4-sonnet --as INFERENCE

- Install the necessary requests package:

  pip install requests
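The example code below reads the model's endpoint URL, key, and model ID from environment variables. Attaching the model sets matching config vars on your app, so you can export them into your local shell first (assuming the same $APP_NAME as above):

```shell
# Copy the add-on's config vars from the app into the local environment
export INFERENCE_URL=$(heroku config:get -a $APP_NAME INFERENCE_URL)
export INFERENCE_KEY=$(heroku config:get -a $APP_NAME INFERENCE_KEY)
export INFERENCE_MODEL_ID=$(heroku config:get -a $APP_NAME INFERENCE_MODEL_ID)
```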
Python Example Code
import requests
import json
import os
# Global variables for API endpoint, authorization key, and model ID from Heroku config variables
ENV_VARS = {
    "INFERENCE_URL": None,
    "INFERENCE_KEY": None,
    "INFERENCE_MODEL_ID": None
}
# Assert the existence of required environment variables, with helpful messages if they're missing.
for env_var in ENV_VARS.keys():
    value = os.environ.get(env_var)
    assert value is not None, (
        f"Environment variable '{env_var}' is missing. Set it using:\n"
        f"export {env_var}=$(heroku config:get -a $APP_NAME {env_var})"
    )
    ENV_VARS[env_var] = value
def parse_chat_output(response):
    """
    Parses and prints the API response for the chat completion request.
    Parameters:
        - response (requests.Response): The response object from the API call.
    """
    if response.status_code == 200:
        result = response.json()
        print("Chat Completion:", result["choices"][0]["message"]["content"])
    else:
        print(f"Request failed: {response.status_code}, {response.text}")
def generate_chat_completion(payload):
    """
    Generates a chat completion using the chat model attached to your app.
    Parameters:
        - payload (dict): dictionary containing parameters for the chat completion request
    Returns:
        - Prints the generated chat completion.
    """
    # Set headers using the global API key
    HEADERS = {
        "Authorization": f"Bearer {ENV_VARS['INFERENCE_KEY']}",
        "Content-Type": "application/json"
    }
    endpoint_url = ENV_VARS['INFERENCE_URL'] + "/v1/chat/completions"
    response = requests.post(endpoint_url, headers=HEADERS, data=json.dumps(payload))
    parse_chat_output(response=response)
# Example payload
payload = {
    "model": ENV_VARS["INFERENCE_MODEL_ID"],
    "messages": [
        { "role": "user", "content": "Hello!" },
        { "role": "assistant", "content": "Hi there! How can I assist you today?" },
        { "role": "user", "content": "Why is Heroku so cool?"}
    ],
    "temperature": 0.5,
    "max_tokens": 100,
    "stream": False
}
# Generate a chat completion with the given payload
generate_chat_completion(payload)
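For reference, a successful response body has the shape that parse_chat_output expects. The example below is hand-written for illustration (field values are made up, not actual service output), showing how the message content is extracted:

```python
# Hypothetical response body in the OpenAI-compatible shape that
# parse_chat_output expects; values are made up for illustration.
example_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }
    ]
}

# parse_chat_output reads the first choice's message content:
content = example_response["choices"][0]["message"]["content"]
print("Chat Completion:", content)  # Chat Completion: Hi there!
```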
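The payload above sets "stream": False. If you set "stream": True instead, completions arrive incrementally as server-sent events. The sketch below assumes the OpenAI-style streaming format of data: {...} lines ending with a data: [DONE] sentinel; the helper and the commented consumption loop are illustrative, not an exact transcript of the service's output:

```python
import json

def parse_sse_line(line):
    """Parse one server-sent-events line into a JSON chunk.

    Returns None for blank or non-data lines and for the final
    "data: [DONE]" sentinel. (OpenAI-style streaming format is
    assumed here for illustration.)
    """
    if not line or not line.startswith("data: "):
        return None
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)

# Sketch of consuming a streamed completion with requests, reusing
# endpoint_url, HEADERS, and payload from the example above:
# response = requests.post(endpoint_url, headers=HEADERS,
#                          json={**payload, "stream": True}, stream=True)
# for raw in response.iter_lines(decode_unicode=True):
#     chunk = parse_sse_line(raw)
#     if chunk:
#         print(chunk["choices"][0]["delta"].get("content", ""), end="")
```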