Last updated September 04, 2025
Chat models generate conversational completions for input messages. This guide describes how to use the v1-chat-completions API with Python.
Prerequisites
Before making requests, provision access to the model of your choice.
- If it’s not already installed, install the Heroku CLI. Then install the Heroku AI plugin:

  heroku plugins:install @heroku/plugin-ai

- Attach a chat model to an app of yours:

  # If you don't have an app yet, you can create one with:
  heroku create $APP_NAME # specify the name you want for your app (or skip this step to use an existing app you have)

  # Create and attach one of our chat models to your app, $APP_NAME:
  heroku ai:models:create -a $APP_NAME claude-4-sonnet --as INFERENCE

- Install the necessary requests package:

  pip install requests
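The example code below reads the model's endpoint URL, key, and model ID from environment variables. Attaching the model sets matching config vars on your app, so you can export them into your local shell first (assuming the same $APP_NAME as above):

```shell
# Copy the add-on's config vars from the app into the local environment
export INFERENCE_URL=$(heroku config:get -a $APP_NAME INFERENCE_URL)
export INFERENCE_KEY=$(heroku config:get -a $APP_NAME INFERENCE_KEY)
export INFERENCE_MODEL_ID=$(heroku config:get -a $APP_NAME INFERENCE_MODEL_ID)
```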
Python Example Code
import requests
import json
import os
# Global variables for API endpoint, authorization key, and model ID from Heroku config variables
ENV_VARS = {
    "INFERENCE_URL": None,
    "INFERENCE_KEY": None,
    "INFERENCE_MODEL_ID": None
}
# Assert the existence of required environment variables, with helpful messages if they're missing.
for env_var in ENV_VARS.keys():
    value = os.environ.get(env_var)
    assert value is not None, (
        f"Environment variable '{env_var}' is missing. Set it using:\n"
        f"export {env_var}=$(heroku config:get -a $APP_NAME {env_var})"
    )
    ENV_VARS[env_var] = value
def parse_chat_output(response):
    """
    Parses and prints the API response for the chat completion request.
    Parameters:
        - response (requests.Response): The response object from the API call.
    """
    if response.status_code == 200:
        result = response.json()
        print("Chat Completion:", result["choices"][0]["message"]["content"])
    else:
        print(f"Request failed: {response.status_code}, {response.text}")
def generate_chat_completion(payload):
    """
    Generates a chat completion using the chat model attached to your app.
    Parameters:
        - payload (dict): dictionary containing parameters for the chat completion request
    Returns:
        - Prints the generated chat completion.
    """
    # Set headers using the global API key
    HEADERS = {
        "Authorization": f"Bearer {ENV_VARS['INFERENCE_KEY']}",
        "Content-Type": "application/json"
    }
    endpoint_url = ENV_VARS['INFERENCE_URL'] + "/v1/chat/completions"
    response = requests.post(endpoint_url, headers=HEADERS, data=json.dumps(payload))
    parse_chat_output(response=response)
# Example payload
payload = {
    "model": ENV_VARS["INFERENCE_MODEL_ID"],
    "messages": [
        { "role": "user", "content": "Hello!" },
        { "role": "assistant", "content": "Hi there! How can I assist you today?" },
        { "role": "user", "content": "Why is Heroku so cool?"}
    ],
    "temperature": 0.5,
    "max_tokens": 100,
    "stream": False
}
# Generate a chat completion with the given payload
generate_chat_completion(payload)
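For reference, a successful response body has the shape that parse_chat_output expects. The example below is hand-written for illustration (field values are made up, not actual service output), showing how the message content is extracted:

```python
# Hypothetical response body in the OpenAI-compatible shape that
# parse_chat_output expects; values are made up for illustration.
example_response = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hi there!"},
            "finish_reason": "stop",
        }
    ]
}

# parse_chat_output reads the first choice's message content:
content = example_response["choices"][0]["message"]["content"]
print("Chat Completion:", content)  # Chat Completion: Hi there!
```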
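The payload above sets "stream": False. If you set "stream": True instead, completions arrive incrementally as server-sent events. The sketch below assumes the OpenAI-style streaming format of data: {...} lines ending with a data: [DONE] sentinel; the helper and the commented consumption loop are illustrative, not an exact transcript of the service's output:

```python
import json

def parse_sse_line(line):
    """Parse one server-sent-events line into a JSON chunk.

    Returns None for blank or non-data lines and for the final
    "data: [DONE]" sentinel. (OpenAI-style streaming format is
    assumed here for illustration.)
    """
    if not line or not line.startswith("data: "):
        return None
    data = line[len("data: "):].strip()
    if data == "[DONE]":
        return None
    return json.loads(data)

# Sketch of consuming a streamed completion with requests, reusing
# endpoint_url, HEADERS, and payload from the example above:
# response = requests.post(endpoint_url, headers=HEADERS,
#                          json={**payload, "stream": True}, stream=True)
# for raw in response.iter_lines(decode_unicode=True):
#     chunk = parse_sse_line(raw)
#     if chunk:
#         print(chunk["choices"][0]["delta"].get("content", ""), end="")
```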