POST /v1/chat/completions
Creates a model response for the given chat conversation. Supports streaming, function calling, and multiple AI providers through a unified interface.
Request Body
model (string, required)
ID of the model to use (e.g., gpt-4, claude-3-5-sonnet, gemini-2.0-flash).

messages (array, required)
A list of messages comprising the conversation so far. Each message has the following fields:

role (string, required)
The role of the message author. One of system, user, assistant, tool, or function.

content (string or array)
The contents of the message. Can be a string, or an array of content parts for multimodal input.

name (string)
An optional name for the participant.

tool_calls (array)
Tool calls generated by the model (assistant messages only).

tool_call_id (string)
The ID of the tool call this message is responding to (tool messages only).

stream (boolean)
If set to true, partial message deltas are sent as server-sent events.

stream_options (object)
Options for streaming responses. If set, usage statistics are included in the final streamed chunk.

temperature (number)
Sampling temperature between 0 and 2. Higher values make output more random. (See the combined example after this list.)

max_tokens (integer)
Maximum number of tokens to generate in the completion.

max_completion_tokens (integer)
An upper bound for the number of tokens that can be generated.

stop (string or array)
Up to 4 sequences where the API will stop generating further tokens.

presence_penalty (number)
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they have already appeared in the text, encouraging new topics.

frequency_penalty (number)
Number between -2.0 and 2.0. Positive values penalize new tokens based on how frequently they have appeared in the text, discouraging repetition.

tools (array)
A list of tools the model may call. Each tool has the following fields:

type (string, required)
The type of tool. Currently only function is supported.

function (object, required)
The function definition, including its name, description, and parameters schema.

tool_choice (string or object)
Controls which tool is called: auto lets the model decide, none prevents tool calls, or specify a particular tool.

parallel_tool_calls (boolean)
Whether to enable parallel function calling during tool use.

response_format (object)
Specifies the output format. Either text (the default) or json_object for JSON mode.

reasoning_effort (string)
For reasoning models, controls the effort level: low, medium, or high.
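As a combined sketch, here is a request that sets several of the sampling and stopping parameters above (the values are illustrative, not recommendations, and client is assumed to be configured as in the Examples section below):

response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Write a tagline for a coffee shop"}],
    temperature=1.2,        # more varied output
    max_tokens=64,          # cap the completion length
    stop=["\n\n"],          # stop at the first blank line
    presence_penalty=0.5,   # nudge toward new topics
    frequency_penalty=0.5   # discourage verbatim repetition
)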
Response
id (string)
A unique identifier for the chat completion.

object (string)
The object type, always chat.completion.

created (integer)
Unix timestamp of when the completion was created.

model (string)
The model used for the completion.

choices (array)
A list of chat completion choices. Each choice has the following fields:

index (integer)
The index of this choice in the list.

message (object)
The message generated by the model, with role and content.

finish_reason (string)
The reason the model stopped generating: stop, length, tool_calls, etc.

usage (object)
Token usage statistics: prompt_tokens (tokens in the prompt), completion_tokens (tokens in the completion), and total_tokens.
Examples
Basic Completion
from openai import OpenAI

client = OpenAI(
    api_key="sk-voidai-your_key_here",
    base_url="https://api.voidai.app/v1"
)

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "What is the capital of France?"}
    ]
)

print(response.choices[0].message.content)
Streaming
stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a short story"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
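To also receive token usage when streaming, set stream_options as described above. Following the OpenAI convention, the final chunk then has an empty choices list and carries a usage object; a minimal sketch:

stream = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "Tell me a short story"}],
    stream=True,
    stream_options={"include_usage": True}
)

for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
    if chunk.usage:  # only present on the final chunk
        print(f"\nTotal tokens: {chunk.usage.total_tokens}")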
Function Calling
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_weather",
            "description": "Get the current weather in a location",
            "parameters": {
                "type": "object",
                "properties": {
                    "location": {
                        "type": "string",
                        "description": "The city name"
                    },
                    "unit": {
                        "type": "string",
                        "enum": ["celsius", "fahrenheit"]
                    }
                },
                "required": ["location"]
            }
        }
    }
]
response = client.chat.completions.create(
    model="gpt-4",
    messages=[{"role": "user", "content": "What's the weather in Tokyo?"}],
    tools=tools,
    tool_choice="auto"
)

# Check if the model wants to call a function
if response.choices[0].message.tool_calls:
    tool_call = response.choices[0].message.tool_calls[0]
    print(f"Function: {tool_call.function.name}")
    print(f"Arguments: {tool_call.function.arguments}")
Vision (Multimodal)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "What's in this image?"},
                {
                    "type": "image_url",
                    "image_url": {
                        "url": "https://example.com/image.jpg",
                        "detail": "high"
                    }
                }
            ]
        }
    ]
)

print(response.choices[0].message.content)
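Local images can also be sent inline as base64 data URLs instead of public links, using the same image_url content-part shape (the file path here is illustrative):

import base64

with open("photo.jpg", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

image_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/jpeg;base64,{encoded}"}
}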
JSON Mode
import json

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful assistant that responds in JSON."},
        {"role": "user", "content": "List 3 programming languages with their year of creation"}
    ],
    response_format={"type": "json_object"}
)

data = json.loads(response.choices[0].message.content)
print(data)
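Reasoning Effort
The reasoning_effort parameter described above applies only to reasoning models. A sketch; the model name is illustrative and depends on which reasoning models your key can access:

response = client.chat.completions.create(
    model="o3-mini",  # illustrative reasoning model
    messages=[{"role": "user", "content": "Prove that the square root of 2 is irrational."}],
    reasoning_effort="high"
)

print(response.choices[0].message.content)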
Response Example
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1701691200,
  "model": "gpt-4",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The capital of France is Paris."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 25,
    "completion_tokens": 8,
    "total_tokens": 33
  }
}
Streaming Response
When stream is set to true, the response arrives as a series of server-sent events, each carrying a chat.completion.chunk object, terminated by data: [DONE]:
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"The"},"finish_reason":null}]}
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" capital"},"finish_reason":null}]}
data: [DONE]
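Without an SDK, the stream can be consumed directly over HTTP. A minimal sketch using the requests library, with error handling omitted:

import json
import requests

resp = requests.post(
    "https://api.voidai.app/v1/chat/completions",
    headers={"Authorization": "Bearer sk-voidai-your_key_here"},
    json={
        "model": "gpt-4",
        "messages": [{"role": "user", "content": "Tell me a short story"}],
        "stream": True
    },
    stream=True
)

for line in resp.iter_lines():
    if not line:
        continue
    payload = line.decode("utf-8").removeprefix("data: ")
    if payload == "[DONE]":
        break
    delta = json.loads(payload)["choices"][0]["delta"]
    if delta.get("content"):
        print(delta["content"], end="")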