Skip to main content
POST
/
responses
Model Responses
curl --request POST \
  --url https://{gatewayBaseURL}/responses \
  --header 'Authorization: Bearer <token>' \
  --header 'Content-Type: application/json' \
  --data '
{
  "model": "<string>",
  "input": "<unknown>",
  "background": true,
  "include": [
    "<string>"
  ],
  "instructions": "<string>",
  "max_output_tokens": 123,
  "metadata": {},
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<string>"
  },
  "store": true,
  "stream": true,
  "temperature": 123,
  "text": {
    "format": {}
  },
  "tools": [
    "<unknown>"
  ],
  "top_p": 123,
  "user": "<string>"
}
'
{
  "id": "<string>",
  "object": "<string>",
  "created_at": 123,
  "status": "<string>",
  "max_output_tokens": 123,
  "model": "<string>",
  "output": [
    {
      "id": "<string>",
      "type": "<string>",
      "status": "<string>",
      "content": [
        {
          "type": "<string>",
          "annotations": [
            "<unknown>"
          ],
          "text": "<string>"
        }
      ],
      "role": "<string>"
    }
  ],
  "parallel_tool_calls": true,
  "previous_response_id": "<string>",
  "reasoning": {
    "effort": "<unknown>",
    "summary": "<unknown>"
  },
  "service_tier": "<string>",
  "store": true,
  "temperature": 123,
  "text": {
    "format": {
      "type": "<string>"
    }
  },
  "tool_choice": "<string>",
  "tools": [
    "<unknown>"
  ],
  "top_p": 123,
  "truncation": "<string>",
  "usage": {
    "input_tokens": 123,
    "input_tokens_details": {
      "cached_tokens": 123
    },
    "output_tokens": 123,
    "output_tokens_details": {
      "reasoning_tokens": 123
    },
    "total_tokens": 123
  },
  "metadata": {},
  "provider": "<string>",
  "error": "<unknown>",
  "incomplete_details": "<unknown>",
  "instructions": "<unknown>",
  "user": "<unknown>"
}

Documentation Index

Fetch the complete documentation index at: https://www.truefoundry.com/llms.txt

Use this file to discover all available pages before exploring further.

Authorizations

Authorization
string
header
required

Bearer authentication header of the form Bearer <token>, where <token> is your auth token.

Headers

x-tfy-metadata
string

Optional metadata for the request

Body

application/json

Parameters for generating model responses.

model
string
required

Model identifier to generate the response

input
any

Text, image, or file inputs to the model

background
boolean | null

Whether to run the model response in the background

include
string[] | null

Additional output data to include

instructions
string | null

System message as first item in context

max_output_tokens
number | null

Upper bound for tokens generated

metadata
object

Key-value pairs for additional information

parallel_tool_calls
boolean | null

Allow parallel tool calls

previous_response_id
string | null

ID of previous response for multi-turn

reasoning
object

Configuration for reasoning models

service_tier
enum<string> | null

Latency tier for processing

Available options:
auto,
default,
flex,
priority
store
boolean | null

Whether to store the response

stream
boolean | null

Enable streaming response

temperature
number | null

Sampling temperature between 0 and 2

text
object

Text response configuration

tool_choice

Tool selection behavior

Available options:
none
tools
any[] | null

Available tools for the model

top_p
number | null

Nucleus sampling parameter

truncation
enum<string> | null

Truncation strategy

Available options:
auto,
disabled
user
string | null

End-user identifier

Response

Model Response generated successfully.

id
string
required

Response ID.

object
string
required

Object type.

created_at
number
required

Creation timestamp.

status
string
required

Response status.

max_output_tokens
number | null
required

Maximum output tokens allowed.

model
string
required

Model used for the response.

output
object[]
required
parallel_tool_calls
boolean
required

Indicates if parallel tool calls were used.

previous_response_id
string | null
required

ID of the previous response, if any.

reasoning
object
required

Reasoning details.

service_tier
string
required

Service tier.

store
boolean
required

Indicates if the response is stored.

temperature
number
required

Temperature setting for the model.

text
object
required
tool_choice
string
required

Tool choice used.

tools
any[]
required

Tools used in the response.

top_p
number
required

Top-p sampling parameter.

truncation
string
required

Truncation setting.

usage
object
required
metadata
object
required

Additional metadata.

provider
string
required

Provider of the response.

error
any

Provider error object when the response failed.

incomplete_details
any

Reason the response was cut off, e.g. max_output_tokens or content_filter.

instructions
any

Instructions provided for the response.

user
any

End-user identifier echoed from the request.