Model Responses

curl --request POST \
  --url https://{gatewayBaseURL}/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input": "<unknown>",
  "background": true,
  "include": [
    "<string>"
  ],
  "instructions": "<string>",
  "max_output_tokens": 123,
  "metadata": {},
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<string>"
  },
  "store": true,
  "stream": true,
  "temperature": 123,
  "text": {
    "format": {}
  },
  "tools": [
    "<unknown>"
  ],
  "top_p": 123,
  "user": "<string>"
}
'

{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "status": "<string>",
  "max_output_tokens": 123,
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "type": "<string>",
      "status": "<string>",
      "content": [
        {
          "type": "<string>",
          "annotations": [
            "<unknown>"
          ],
          "text": "<string>"
        }
      ],
      "role": "<string>"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<unknown>",
    "summary": "<unknown>"
  },
  "service_tier": "<string>",
  "store": true,
  "temperature": 123,
  "text": {
    "format": {
      "type": "<string>"
    }
  },
  "tool_choice": "<string>",
  "tools": [
    "<unknown>"
  ],
  "top_p": 123,
  "truncation": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "total_tokens": 123
  },
  "metadata": {},
  "provider": "<string>",
  "error": "<unknown>",
  "incomplete_details": "<unknown>",
  "instructions": "<unknown>",
  "user": "<unknown>"
}
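
For readers calling the endpoint from code rather than the shell, the curl request above might be expressed in Python roughly as follows. This is a minimal sketch using only the standard library; the helper names `build_request` and `send` and the host `gateway.example.com` are illustrative assumptions, and only the two most basic body fields (`model` and `input`) are sent.

```python
import json
import urllib.request


def build_request(base_url, token, model, input_value):
    """Build (but do not send) a POST request to the /responses endpoint."""
    body = json.dumps({"model": model, "input": input_value}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/responses",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def send(request, timeout=60):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Hypothetical usage; replace the URL, token, and model with real values:
# response = send(build_request("https://gateway.example.com", "<token>", "<model>", "Hello"))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any network call is made.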

Authorizations

Authorization (string, header, required)
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-tfy-metadata (string)
Optional metadata for the request.
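
The two headers described above could be assembled with a small helper like the one below. This is a sketch under stated assumptions: the function name `build_headers` is hypothetical, and encoding the metadata as a JSON string is an assumption, since this page only says that `x-tfy-metadata` is a string.

```python
import json


def build_headers(token, metadata=None):
    """Return request headers for the gateway.

    ASSUMPTION: `metadata` (a dict) is serialised as JSON into
    x-tfy-metadata. This page only documents the header as a string.
    """
    if not token:
        raise ValueError("token must be a non-empty string")
    headers = {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
    if metadata is not None:
        headers["x-tfy-metadata"] = json.dumps(metadata)
    return headers
```

The metadata header is added only when a value is supplied, which matches its optional status.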

Body

application/json

Parameters for generating model responses.

model (string, required)
Identifier of the model used to generate the response.

input (any)
Text, image, or file inputs to the model.

background (boolean | null)
Whether to run the model response in the background.

include (string[] | null)
Additional output data to include in the response.

instructions (string | null)
System message inserted as the first item in the model's context.

max_output_tokens (number | null)
Upper bound on the number of tokens generated.

metadata (object)
Key-value pairs carrying additional information.

parallel_tool_calls (boolean | null)
Whether to allow parallel tool calls.

previous_response_id (string | null)
ID of the previous response, used to continue a multi-turn conversation.
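
To illustrate how `previous_response_id` ties turns together, the sketch below builds the body of a follow-up request from an earlier response. The helper name `follow_up_body` is hypothetical, and the sketch assumes the earlier response is the parsed JSON object returned by this endpoint. Whether chaining also depends on other settings such as `store` is not stated on this page.

```python
def follow_up_body(previous_response, model, input_value):
    """Build a request body that continues from `previous_response`."""
    previous_id = previous_response.get("id")
    if not previous_id:
        raise ValueError("previous response has no 'id' to chain from")
    return {
        "model": model,
        "input": input_value,
        # Links this turn to the earlier one.
        "previous_response_id": previous_id,
    }
```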

reasoning (object)
Configuration for reasoning models.

service_tier (enum<string> | null)
Latency tier for processing.
Available options: auto, default, flex, priority

store (boolean | null)
Whether to store the response.

stream (boolean | null)
Whether to stream the response.

temperature (number | null)
Sampling temperature, between 0 and 2.

text (object)
Text response configuration.

tool_choice
Tool selection behavior.
Available options: none

tools (any[] | null)
Tools available to the model.

top_p (number | null)
Nucleus sampling parameter.

truncation (enum<string> | null)
Truncation strategy.
Available options: auto, disabled

user (string | null)
End-user identifier.
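
The constraints listed for the body parameters (a required `model`, a `temperature` between 0 and 2, and the enumerated values for `service_tier` and `truncation`) can be checked on the client before a request is sent. The following is a minimal sketch; the function name `build_body` is hypothetical, and dropping options set to `None` rather than sending an explicit null is an assumption about how the gateway treats omitted fields.

```python
SERVICE_TIERS = {"auto", "default", "flex", "priority"}
TRUNCATION_STRATEGIES = {"auto", "disabled"}


def build_body(model, input_value, **options):
    """Validate documented constraints and return a request body dict."""
    if not model:
        raise ValueError("model is required")

    temperature = options.get("temperature")
    if temperature is not None and not 0 <= temperature <= 2:
        raise ValueError("temperature must be between 0 and 2")

    tier = options.get("service_tier")
    if tier is not None and tier not in SERVICE_TIERS:
        raise ValueError(f"service_tier must be one of {sorted(SERVICE_TIERS)}")

    truncation = options.get("truncation")
    if truncation is not None and truncation not in TRUNCATION_STRATEGIES:
        raise ValueError(f"truncation must be one of {sorted(TRUNCATION_STRATEGIES)}")

    body = {"model": model, "input": input_value}
    # ASSUMPTION: an omitted field behaves like an explicit null.
    body.update({key: value for key, value in options.items() if value is not None})
    return body
```

For example, `build_body("<model>", "Hello", temperature=0.2, truncation="auto")` returns a dict ready to be JSON-encoded, while `temperature=3` raises `ValueError` before any network call.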

Response

Model response generated successfully.

id (string, required)
Response ID.

object (string, required)
Object type.

created_at (number, required)
Creation timestamp.

status (string, required)
Response status.

max_output_tokens (number | null, required)
Maximum output tokens allowed.

model (string, required)
Model used for the response.

output (object[], required)
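
Going by the example response at the top of this page, each `output` item carries a `content` array whose entries may hold a `text` string. A minimal sketch for collecting that text is shown below; the helper name `output_text` is hypothetical, and the code skips items without `content` because other item shapes are not documented here.

```python
def output_text(response):
    """Concatenate every string `text` field found in response['output']."""
    parts = []
    for item in response.get("output") or []:
        for content in item.get("content") or []:
            text = content.get("text")
            if isinstance(text, str):
                parts.append(text)
    return "".join(parts)
```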

parallel_tool_calls (boolean, required)
Indicates whether parallel tool calls were used.

previous_response_id (string | null, required)
ID of the previous response, if any.

reasoning (object, required)
Reasoning details.

service_tier (string, required)
Service tier.

store (boolean, required)
Indicates whether the response is stored.

temperature (number, required)
Temperature setting for the model.

text (object, required)

tool_choice (string, required)
Tool choice used.

tools (any[], required)
Tools used in the response.

top_p (number, required)
Top-p sampling parameter.

truncation (string, required)
Truncation setting.

usage (object, required)
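
The `usage` object in the example response nests cached-token and reasoning-token counts under `input_tokens_details` and `output_tokens_details`. The sketch below flattens those counts into one dict for logging or cost tracking; the helper name `summarize_usage` and the output key names are illustrative choices, and missing fields default to 0.

```python
def summarize_usage(response):
    """Flatten the nested `usage` object into a single-level dict."""
    usage = response.get("usage") or {}
    input_details = usage.get("input_tokens_details") or {}
    output_details = usage.get("output_tokens_details") or {}
    return {
        "input_tokens": usage.get("input_tokens", 0),
        "cached_tokens": input_details.get("cached_tokens", 0),
        "output_tokens": usage.get("output_tokens", 0),
        "reasoning_tokens": output_details.get("reasoning_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```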

metadata (object, required)
Additional metadata.

provider (string, required)
Provider of the response.

error (any)
Provider error object when the response failed.

incomplete_details (any)
Reason the response was cut off, e.g. max_output_tokens or content_filter.

instructions (any)
Instructions provided for the response.

user (any)
End-user identifier echoed from the request.
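
Since `error` and `incomplete_details` are both typed as `any`, a client can only rely on whether they are present. The sketch below classifies a parsed response on that basis; the helper name `response_outcome` and the labels it returns are hypothetical, and the code deliberately does not interpret `status`, whose possible values are not listed on this page.

```python
def response_outcome(response):
    """Return (label, detail) where label is 'failed', 'incomplete', or 'ok'."""
    error = response.get("error")
    if error:
        # The provider reported a failure; pass its error object through.
        return "failed", error
    incomplete = response.get("incomplete_details")
    if incomplete:
        # The response was cut off, e.g. by max_output_tokens or content_filter.
        return "incomplete", incomplete
    return "ok", None
```

Checking `error` before `incomplete_details` means a failed response is never reported as merely truncated.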
