Provider capabilities
The table below summarizes gateway support for this endpoint by provider.

Legend:
- ✅ Supported by the provider and Truefoundry
- 🟡 Supported by the provider, but not by Truefoundry
- ❌ Not supported by the provider
| Provider | Compaction API |
|---|---|
| OpenAI | ✅ |
The Compaction API covers both the standalone POST /responses/compact endpoint and server-side compaction via context_management in POST /responses.
Compaction is currently supported only by OpenAI; route requests through the gateway's OpenAI inference base URL.
Standalone: POST /responses/compact
Send a full context window; the API returns a compacted window (including an opaque, encrypted compaction item) to pass as input to your next /responses call. The request body takes model and input, and optionally instructions and previous_response_id.
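A minimal sketch of building the standalone request, assuming a JSON-over-HTTP client; the gateway base URL, API key, header name, and model are placeholders, and the network call itself is shown only as a comment:

```python
import json

# Placeholder values: substitute your gateway's OpenAI base URL and key.
GATEWAY_BASE = "https://gateway.example.com/api/llm/openai"
API_KEY = "tfy-..."

def build_compact_request(model, input_items, instructions=None,
                          previous_response_id=None):
    """Assemble the JSON body for POST /responses/compact.

    `input_items` is the full context window accumulated so far;
    `instructions` and `previous_response_id` are optional.
    """
    body = {"model": model, "input": input_items}
    if instructions is not None:
        body["instructions"] = instructions
    if previous_response_id is not None:
        body["previous_response_id"] = previous_response_id
    return body

body = build_compact_request(
    "gpt-5",
    [{"role": "user", "content": "Long conversation history goes here."}],
    instructions="You are a helpful assistant.",
)

# The actual call (requires the `requests` package and a live gateway):
# resp = requests.post(f"{GATEWAY_BASE}/responses/compact",
#                      headers={"Authorization": f"Bearer {API_KEY}"},
#                      json=body)
# The response's output is the compacted window, including the opaque,
# encrypted compaction item; pass it as `input` on your next /responses call.
print(json.dumps(body, indent=2))
```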
Server-side: POST /responses with context_management
Set context_management: [{"type": "compaction", "compact_threshold": 200000}] when creating the response. When the rendered token count crosses the threshold, the server compacts the window and emits a compaction item in the stream; no separate /responses/compact call is needed.
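A sketch of a create body with server-side compaction enabled, plus a small helper that picks compaction items out of the returned output. The threshold and model are illustrative, and the `encrypted_content` field name in the simulated output is an assumption, not a documented schema:

```python
# Illustrative /responses create body with server-side compaction enabled.
create_body = {
    "model": "gpt-5",
    "input": [{"role": "user", "content": "..."}],
    # Compact automatically once the rendered context crosses the threshold.
    "context_management": [
        {"type": "compaction", "compact_threshold": 200000}
    ],
}

def compaction_items(output_items):
    """Return any compaction items the server emitted in the output."""
    return [item for item in output_items if item.get("type") == "compaction"]

# Simulated output: one compaction item followed by an assistant message.
# (`encrypted_content` is a placeholder field name.)
sample_output = [
    {"type": "compaction", "encrypted_content": "opaque-blob"},
    {"type": "message", "role": "assistant", "content": "Hi!"},
]
print(len(compaction_items(sample_output)))
```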
- Stateless: append the response output (including compaction items) to your input each turn.
- Stateful: use previous_response_id and send only the new user message; do not manually prune.
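The stateless pattern above can be sketched as a helper that carries the full window forward each turn; the item shapes and `encrypted_content` field are illustrative:

```python
def next_turn_input(prev_input, response_output, user_message):
    """Stateless pattern: append the previous response's output
    (including any compaction items) to the input, then add the
    new user message. Never prune manually."""
    return prev_input + response_output + [
        {"role": "user", "content": user_message}
    ]

turn1_input = [{"role": "user", "content": "Hello"}]
turn1_output = [
    {"type": "compaction", "encrypted_content": "opaque-blob"},
    {"type": "message", "role": "assistant", "content": "Hi there"},
]
turn2_input = next_turn_input(turn1_input, turn1_output, "Tell me more")
print(len(turn2_input))  # 4 items carried into the next call
```

In the stateful mode this bookkeeping is unnecessary: the server reconstructs the window from previous_response_id.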