This section explains the steps to add AWS Bedrock models and configure the required access controls.
1. Navigate to AWS Bedrock Models in AI Gateway
From the TrueFoundry dashboard, navigate to AI Gateway > Models and select AWS Bedrock.
2. Add AWS Bedrock Account Name and Collaborators
Give the Bedrock account a unique name; you will use it later to refer to the models in this account, which are addressed as @providername/@modelname. Then add collaborators to your account. You can decide which users and teams have access to the models in the account (User Role) and who can add, edit, or remove models in this account (Manager Role). You can read more about access control here.
3. Add Region and Authentication
Select the default AWS region for the models in this account. The account-level region serves as the default for all models unless explicitly overridden at the model level. Then provide the authentication details the gateway should use to access the Bedrock models. TrueFoundry supports AWS access key/secret key, assumed role, and API key based authentication. The sections below explain how to generate the access/secret keys, roles, or API keys.
Get AWS Authentication Details
Required IAM Policy
First, create the IAM policy that grants permission to invoke Bedrock models. This policy can be attached to an IAM user (for access key or API key authentication) or an IAM role (for assumed role authentication).
The following policy grants permission to invoke any foundation model and resolve inference profiles in your available regions (to check the list of available regions for different models, refer to AWS Bedrock):
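As a rough sketch of what such a policy could look like, the Python snippet below builds the policy document as a dictionary and prints it as JSON. The action list and the wildcard Resource are assumptions rather than the exact policy published by TrueFoundry, and the function name is hypothetical; narrow Resource to specific ARNs if your security requirements call for it.

```python
import json


def build_bedrock_invoke_policy() -> dict:
    """Return an illustrative IAM policy document for invoking Bedrock models.

    Assumption: the three actions below are sufficient for invocation
    (streaming and non-streaming) and inference profile resolution.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "InvokeBedrockModels",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                    "bedrock:GetInferenceProfile",
                ],
                # Assumption: a wildcard resource; restrict it where possible.
                "Resource": "*",
            }
        ],
    }


policy = build_bedrock_invoke_policy()
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the IAM console's policy editor and then attached to the IAM user or role you use for the gateway.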
bedrock:GetInferenceProfile is required when invoking models through an inference profile (system-defined or custom). The gateway uses it to resolve the profile to its underlying foundation model.
Use this access key and secret while adding the provider account to authenticate requests to the Bedrock model.
Using Assumed Role
The gateway role assumes your role, which in turn accesses Bedrock models.
Create an IAM role in your AWS account that has access to Bedrock. Attach the IAM policy with Bedrock permissions (shown above) to this role.
Configure the trust policy for this role to allow the gateway role to assume it. Use the appropriate role ARN based on your deployment:
For SAAS deployments:
Gateway role ARN: arn:aws:iam::416964291864:role/tfy-ctl-production-ai-gateway-deps
For on-prem deployments:
Your gateway role ARN will look like: arn:aws:iam::<your-aws-account-id>:role/<account-prefix>-truefoundry-deps
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Principal": {
        // for SAAS deployments:
        "AWS": "arn:aws:iam::416964291864:role/tfy-ctl-production-ai-gateway-deps"
        // or for on-prem deployments:
        // "AWS": "arn:aws:iam::<your-aws-account-id>:role/<account-prefix>-truefoundry-deps"
      },
      "Action": "sts:AssumeRole",
      // (Optional) For additional security use external ID.
      "Condition": {
        "StringEquals": {
          "sts:ExternalId": "your-external-id"
        }
      }
    }
  ]
}
Replace the Principal AWS ARN in the trust policy with the appropriate gateway role ARN based on your deployment type (SAAS or on-prem).
You can optionally configure an external ID in the trust policy (as shown in the example above) for additional security. If you use an external ID, make sure to provide the same external ID when creating the Bedrock model integration in TrueFoundry.
Using AWS Bedrock API Key
AWS Bedrock API keys provide a simpler authentication method using Bearer token authentication. This method is ideal for exploration and development use cases.
Once your Bedrock provider account is configured, the following API surfaces are available through the gateway. The table below summarizes each endpoint alongside platform feature support (tracing, cost tracking).
Not supported for Bedrock: Messages API (Anthropic-only), Responses API, Text-to-Speech, Speech-to-Text, Audio Translation, Moderation, Fine-tuning, Image Variation, and Realtime API. Bedrock has no upstream for these surfaces. See the OpenAI and Anthropic provider docs if you need them.
Chat Completions
Bedrock’s chat completions endpoint is the most widely used. It supports streaming, tools, multimodal input (images, PDF), structured JSON outputs, prompt caching, extended thinking, and multi-family model swapping. The gateway translates OpenAI-compatible requests into Bedrock’s native Converse or InvokeModel API based on the model family.
Full provider capability matrix: Chat Completions API.
Python
import os
from openai import OpenAI

client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="{GATEWAY_BASE_URL}",
)

response = client.chat.completions.create(
    model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
    messages=[
        {"role": "user", "content": "What is TrueFoundry in one line?"},
    ],
)
print(response.choices[0].message.content)
Streaming
Set stream=True and iterate over delta chunks. Defensively check that chunk.choices is non-empty and delta.content is not None.
Python
stream = client.chat.completions.create(
    model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
    messages=[{"role": "user", "content": "Count from 1 to 5."}],
    stream=True,
)
for chunk in stream:
    if (
        chunk.choices
        and len(chunk.choices) > 0
        and chunk.choices[0].delta.content is not None
    ):
        print(chunk.choices[0].delta.content, end="", flush=True)
Function calling / tools
Advertise a tool, hand the model’s tool_calls back as a tool role message, then request the final response.
Every request with toolUse or toolResult blocks in the message history must include tools=..., not just the initial call. OpenAI tolerates its absence on follow-ups; Bedrock’s Converse API rejects the request with the error: toolConfig field must be defined when using toolUse and toolResult content blocks.
Python
import json

tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Get the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

messages = [{"role": "user", "content": "Weather in Bengaluru?"}]

first = client.chat.completions.create(
    model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
    messages=messages,
    tools=tools,
    tool_choice={"type": "function", "function": {"name": "get_weather"}},
)
assistant_msg = first.choices[0].message
tool_calls = assistant_msg.tool_calls or []

if tool_calls:
    tool_call = tool_calls[0]
    messages.append(assistant_msg)
    messages.append({
        "role": "tool",
        "tool_call_id": tool_call.id,
        "content": json.dumps({"city": "Bengaluru", "temp_c": 28}),
    })
    # Note: tools=tools required on the follow-up too
    second = client.chat.completions.create(
        model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
        messages=messages,
        tools=tools,
    )
    print(second.choices[0].message.content)
Vision (multimodal images)
Claude 3+, Nova, and Llama Vision models on Bedrock support image inputs via the image_url content part.
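To illustrate the image_url content part, here is a minimal sketch that builds a user message carrying a base64-encoded image as a data URL. The helper name build_image_message and the placeholder bytes are hypothetical; in practice you would read the bytes from a real image file and pass the resulting messages to client.chat.completions.create with a vision-capable model.

```python
import base64


def build_image_message(image_bytes: bytes, prompt: str, mime_type: str = "image/png") -> list[dict]:
    """Build an OpenAI-style messages list with one text part and one image part."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return [{
        "role": "user",
        "content": [
            {"type": "text", "text": prompt},
            {
                "type": "image_url",
                "image_url": {"url": f"data:{mime_type};base64,{b64}"},
            },
        ],
    }]


# Placeholder bytes for illustration only; load a real image in practice, e.g.
#   with open("photo.png", "rb") as f: image_bytes = f.read()
image_bytes = b"placeholder-image-bytes"
messages = build_image_message(image_bytes, "What is in this image?")

# The request itself would then look like (requires a configured client):
# response = client.chat.completions.create(model="<vision-capable-model>", messages=messages)
```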
Claude models on Bedrock support PDF documents via the file content type with base64 encoding.
Python
import base64

with open("sample.pdf", "rb") as f:
    pdf_b64 = base64.b64encode(f.read()).decode("ascii")

response = client.chat.completions.create(
    model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What text is in this PDF?"},
            {
                "type": "file",
                "file": {
                    "filename": "sample.pdf",
                    "file_data": f"data:application/pdf;base64,{pdf_b64}",
                },
            },
        ],
    }],
)
print(response.choices[0].message.content)
Structured outputs (JSON schema)
Bedrock has no native JSON schema mode — the gateway converts your schema into a required tool call and extracts the result into message.content.
Works across all Bedrock model families.
Anthropic-via-Bedrock inherits direct-Anthropic’s constraint rejection: ge, le, minimum, maximum must be stripped from Pydantic-generated schemas.
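One way to strip these constraints is to walk the JSON schema and drop the offending keywords before sending it. The sketch below is an illustration under assumptions: the helper name strip_numeric_bounds is hypothetical, and it removes only minimum and maximum (the JSON schema keywords that Pydantic's ge and le produce). Keys inside properties are treated as field names, so a field that happens to be called minimum is preserved.

```python
from typing import Any

# JSON schema keywords to remove (Pydantic's ge/le are emitted as these).
UNSUPPORTED_KEYWORDS = {"minimum", "maximum"}

# Containers whose keys are user-chosen names rather than schema keywords.
NAME_MAPS = {"properties", "$defs", "definitions"}


def strip_numeric_bounds(schema: Any) -> Any:
    """Return a copy of the schema without minimum/maximum keywords."""
    if isinstance(schema, list):
        return [strip_numeric_bounds(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    cleaned = {}
    for key, value in schema.items():
        if key in UNSUPPORTED_KEYWORDS:
            continue  # drop the constraint keyword
        if key in NAME_MAPS and isinstance(value, dict):
            # Keep every field name; only clean the sub-schemas.
            cleaned[key] = {name: strip_numeric_bounds(sub) for name, sub in value.items()}
        else:
            cleaned[key] = strip_numeric_bounds(value)
    return cleaned


schema = {
    "type": "object",
    "properties": {
        "age": {"type": "integer", "minimum": 0, "maximum": 130},
        "minimum": {"type": "number", "description": "A field literally named minimum."},
    },
    "required": ["age"],
}
cleaned = strip_numeric_bounds(schema)
print(cleaned["properties"]["age"])
```

With Pydantic, the input would typically come from Model.model_json_schema(); the bounds can still be enforced client-side when validating the model's reply.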
For Claude-via-Bedrock, the gateway translates cache_control hints into Bedrock’s native cachePoint format.
Minimum cacheable prefix varies by model: 1024 tokens for Sonnet 4 and earlier, 2048 for Haiku 3.5/3, 4096 for Claude 4.5+ and Opus 4.5+.
Titan models don’t support cache_control on tool definitions (the gateway skips them automatically).
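As a hedged illustration of a cache_control hint, the sketch below marks a long system prompt as cacheable. The {"type": "ephemeral"} value follows Anthropic's documented cache_control shape; that the gateway accepts it on a text content part in exactly this position is an assumption, and the helper name cacheable_text is hypothetical.

```python
def cacheable_text(text: str) -> dict:
    """Build a text content part carrying a cache_control hint.

    Assumption: the gateway accepts Anthropic-style cache_control on content parts.
    """
    return {
        "type": "text",
        "text": text,
        "cache_control": {"type": "ephemeral"},
    }


# Stand-in for a long, reusable prefix; it must exceed the model's minimum
# cacheable length (see the thresholds above) for caching to apply.
long_context = "Company policy handbook text goes here. " * 200

messages = [
    {"role": "system", "content": [cacheable_text(long_context)]},
    {"role": "user", "content": "Summarize the handbook in two sentences."},
]

# The request itself would then look like (requires a configured client):
# response = client.chat.completions.create(model="<claude-on-bedrock-model>", messages=messages)
```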
Claude 3.7+, Claude 4, and Claude 4.5 series models on Bedrock support extended thinking. Use reasoning_effort: the gateway translates it into Bedrock’s native thinking parameter as a fraction of max_tokens (none=0%, low=30%, medium=60%, high=90%). Bedrock requires a minimum budget_tokens of 1024.
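The arithmetic behind these ratios can be sketched as follows. This is an estimate for planning max_tokens, not the gateway's implementation: the function names are hypothetical, integer floor rounding is an assumption, and how the gateway behaves when the computed budget falls below 1024 is not specified here.

```python
# Percentages of max_tokens per reasoning_effort level, as listed above.
EFFORT_PERCENT = {"none": 0, "low": 30, "medium": 60, "high": 90}

# Minimum budget_tokens that Bedrock accepts.
MIN_BUDGET_TOKENS = 1024


def estimate_thinking_budget(reasoning_effort: str, max_tokens: int) -> int:
    """Estimate the thinking budget for a given effort level (floor rounding assumed)."""
    if reasoning_effort not in EFFORT_PERCENT:
        raise ValueError(f"unknown reasoning_effort: {reasoning_effort!r}")
    return max_tokens * EFFORT_PERCENT[reasoning_effort] // 100


def meets_bedrock_minimum(budget_tokens: int) -> bool:
    """Check an estimated budget against Bedrock's 1024-token minimum."""
    return budget_tokens >= MIN_BUDGET_TOKENS


high_budget = estimate_thinking_budget("high", 8000)
low_budget = estimate_thinking_budget("low", 2000)
print(high_budget, meets_bedrock_minimum(high_budget))
print(low_budget, meets_bedrock_minimum(low_budget))
```

Under these assumptions, reasoning_effort="low" needs max_tokens of roughly 3,414 or more for the estimated budget to reach 1024.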
Python
response = client.chat.completions.create(
    model="tfy-ai-bedrock/global-anthropic-claude-sonnet-4-5-20250929-v1-0",
    messages=[{"role": "user", "content": "A bat and ball cost $1.10. The bat costs $1.00 more than the ball. How much is the ball?"}],
    reasoning_effort="high",
    max_tokens=8000,
)
msg = response.choices[0].message
print("answer:", msg.content)
print("reasoning:", getattr(msg, "reasoning_content", None))
# thinking_blocks carry signatures for multi-turn continuity
for block in getattr(msg, "thinking_blocks", []) or []:
    print(" block:", block.get("type"), "signature:", block.get("signature", "")[:30])
Always echo thinking_blocks exactly as returned when continuing a conversation. Blocks with missing or modified signature fields are rejected by Anthropic/Bedrock.
Embeddings
Bedrock exposes embedding models from Amazon (Titan) and Cohere. All use the same OpenAI-compatible /embeddings endpoint.
Full docs: Embed API.
Python
response = client.embeddings.create(
    model="tfy-ai-bedrock/amazon-titan-embed-text-v2-0",
    input=[
        "TrueFoundry is an AI platform.",
        "TrueFoundry helps teams deploy LLMs.",
    ],
)
print(len(response.data), "vectors of dim", len(response.data[0].embedding))
Image Generation
Bedrock exposes Amazon Nova Canvas, Amazon Titan Image Generator (v1/v2), and Stability AI text-to-image models through /images/generations.
Full docs: Image Generation.
Python
import base64

response = client.images.generate(
    model="tfy-ai-bedrock/amazon-nova-canvas-v1-0",
    prompt="A minimalist isometric illustration of a cloud with a lightning bolt, flat colors.",
    size="1024x1024",
    n=1,
)
item = response.data[0]
if getattr(item, "b64_json", None):
    image_bytes = base64.b64decode(item.b64_json)
else:
    import requests
    image_bytes = requests.get(item.url, timeout=60).content
with open("generated.png", "wb") as f:
    f.write(image_bytes)
Supported text-to-image models: amazon.nova-canvas-v1:0, amazon.titan-image-generator-v1, amazon.titan-image-generator-v2:0, plus Stability AI models.
Stability Image Services (inpaint, outpaint, search-replace, recolor, remove-background, style-guide, style-transfer, upscale, control-structure, control-sketch) are not text-to-image: they require an input image and won’t work with client.images.generate(). Use them with client.images.edit() or specialized endpoints.
Image Edit
Only Amazon Nova Canvas supports client.images.edit on Bedrock. Titan v2 supports inpaint/outpaint via its own tool-specific model IDs.
Python
import base64  # needed to decode b64_json responses

# Edit the image produced by the generation example above
with open("generated.png", "rb") as image_file:
    response = client.images.edit(
        model="tfy-ai-bedrock/amazon-nova-canvas-v1-0",
        image=image_file,
        prompt="Add a bright yellow sun in the top-right corner.",
        size="1024x1024",
        n=1,
    )
item = response.data[0]
if getattr(item, "b64_json", None):
    image_bytes = base64.b64decode(item.b64_json)
else:
    import requests
    image_bytes = requests.get(item.url, timeout=60).content
# Note: this overwrites the input file with the edited image
with open("generated.png", "wb") as f:
    f.write(image_bytes)
Batch API
Bedrock batch is S3-backed — the gateway uploads JSONL to an S3 bucket configured on your provider account, creates a Bedrock ModelInvocationJob, and serves aggregated results via a gateway REST endpoint.
Full docs: Batch Predictions.
Bedrock batch prerequisites:
S3 bucket — must be in the same region as your Bedrock provider account
IAM execution role — with S3 R/W + bedrock:InvokeModel
iam:PassRole on the execution role, granted to the principal that invokes the batch API
100 records minimum per batch (AWS-enforced; the gateway does not relax this)
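Because the 100-record minimum is enforced by AWS, it can save a failed job to check the input locally before uploading. The sketch below is one possible pre-flight check; the function name validate_batch_lines is hypothetical, and the required fields (custom_id and body) are taken from the OpenAI-style batch record format used in this guide.

```python
import json

# AWS-enforced minimum number of records per Bedrock batch job.
MIN_BATCH_RECORDS = 100


def validate_batch_lines(lines: list[str]) -> list[str]:
    """Return a list of human-readable problems; an empty list means the input looks fine."""
    problems = []
    record_count = 0
    for line_no, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            problems.append(f"line {line_no}: not valid JSON")
            continue
        if not isinstance(record, dict):
            problems.append(f"line {line_no}: expected a JSON object")
            continue
        record_count += 1
        for field in ("custom_id", "body"):
            if field not in record:
                problems.append(f"line {line_no}: missing '{field}'")
    if record_count < MIN_BATCH_RECORDS:
        problems.append(
            f"only {record_count} records; Bedrock batch requires at least {MIN_BATCH_RECORDS}"
        )
    return problems


sample_lines = [
    json.dumps({"custom_id": f"req-{i}", "body": {"messages": []}})
    for i in range(100)
]
print(validate_batch_lines(sample_lines))        # no problems expected
print(validate_batch_lines(sample_lines[:99]))   # one record short
```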
Python
from openai import OpenAI

batch_client = OpenAI(
    api_key="your-truefoundry-api-key",
    base_url="{GATEWAY_BASE_URL}",
    default_headers={
        "x-tfy-provider-name": "tfy-ai-bedrock",
        "x-tfy-aws-s3-bucket": "your-s3-bucket-name",
        "x-tfy-aws-bedrock-model": "anthropic.claude-3-haiku-20240307-v1:0",  # bare AWS ID
    },
)
1. Build and upload the input JSONL
Python
import json, uuid

batch_requests = [
    {
        "custom_id": f"req-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "anthropic.claude-3-haiku-20240307-v1:0",
            "messages": [{"role": "user", "content": f"Say hello in prompt {i}."}],
            "max_tokens": 50,
        },
    }
    for i in range(100)
]

with open("batch_input.jsonl", "w") as f:
    for req in batch_requests:
        f.write(json.dumps(req) + "\n")

with open("batch_input.jsonl", "rb") as f:
    uploaded = batch_client.files.create(file=f, purpose="batch")
print(uploaded.id)  # Example: s3://bucket/uuid.jsonl
2. Create Batch Job
OpenAI’s SDK doesn’t expose Bedrock-specific fields on batches.create, so inject them via extra_body.
Python
batch = batch_client.batches.create(
    input_file_id=uploaded.id,
    endpoint="/v1/chat/completions",
    completion_window="24h",
    extra_body={
        "model": "tfy-ai-bedrock/anthropic-claude-3-haiku-20240307-v1-0",  # TF-prefixed
        "role_arn": "arn:aws:iam::<account>:role/BedrockBatchExecutionRole",
        "job_name": f"bedrock-batch-{uuid.uuid4().hex[:8]}",  # MUST be unique per run
    },
)
print(batch.id)  # Example: arn:aws:bedrock:us-east-1:ACCOUNT:model-invocation-job/JOB_ID
job_name must be unique per submission — AWS rejects duplicate in-flight names. Always suffix with a UUID or timestamp.
3. Check Batch Status
Poll the batch status until it completes. The gateway returns batch.id URL-encoded. Decode it once before subsequent retrieve() calls; otherwise the SDK double-encodes it and Bedrock rejects the ARN.
Bedrock writes outputs across multiple files in S3, so files.content(output_file_id) doesn’t work.
Use the gateway’s Bedrock-specific GET /batches/{id}/output endpoint to fetch aggregated results.
batch.id and output_file_id come back URL-encoded; decode with unquote() before reuse.
batches.create returns a sparse Batch with status=None and most fields empty. Call batches.retrieve() to get real state.
output_file_id is a bucket prefix, not a file. Use GET /batches/{id}/output instead of files.content().
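The decode-then-poll pattern described above might look like the following sketch. The helper wait_for_batch is hypothetical, the set of terminal statuses follows OpenAI's batch status names (that the gateway reports the same names is an assumption), and the polling interval is arbitrary. Pass it the batch_client created earlier along with batch.id.

```python
import time
from urllib.parse import unquote

# Assumption: the gateway reports OpenAI-style terminal status names.
TERMINAL_STATUSES = {"completed", "failed", "expired", "cancelled"}


def wait_for_batch(client, encoded_batch_id: str, poll_seconds: float = 30.0, max_polls: int = 120):
    """Poll client.batches.retrieve() until the batch reaches a terminal status."""
    # Decode exactly once so the SDK does not double-encode the ARN.
    batch_id = unquote(encoded_batch_id)
    for _ in range(max_polls):
        batch = client.batches.retrieve(batch_id)
        if batch.status in TERMINAL_STATUSES:
            return batch
        time.sleep(poll_seconds)
    raise TimeoutError(f"batch {batch_id} did not finish after {max_polls} polls")


# Decoding on its own, using a made-up job ARN for illustration:
encoded = "arn%3Aaws%3Abedrock%3Aus-east-1%3A123456789012%3Amodel-invocation-job%2Fabc123"
print(unquote(encoded))

# Usage (requires a configured client and a submitted batch):
# finished = wait_for_batch(batch_client, batch.id)
# print(finished.status)
```

Once the status is completed, fetch results through the gateway's GET /batches/{id}/output endpoint rather than files.content().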
Files API
Bedrock’s Files API uses an S3 backend. Upload and retrieve work; list and delete are not supported because the S3 backend doesn’t expose those operations through the gateway.
Full docs: Files API.
files.list() and files.delete() will error on Bedrock — the S3 backend doesn’t expose them. Plan lifecycle management via S3 bucket policies and lifecycle rules instead of through the gateway.
Bedrock’s Files API only accepts purpose="batch" and validates the file content as batch-style JSONL. Uploading plain text or a non-conforming JSONL will fail validation.
If you have custom pricing for your models, you can override the default cost by clicking the Edit Model button and choosing the Private Cost Metric option.
Can I add models from different regions in a single Bedrock integration?
Yes, you can add models from different regions. You can provide a top-level default region for the account and also override it at the model level.
How do I integrate a Bedrock cross-region inference model?
What is Cross-Region Inference?
Cross-Region Inference dynamically routes your inference requests across multiple AWS regions to optimize performance and handle traffic bursts. Bedrock selects the best region based on load, latency, and availability. Learn more in the AWS Bedrock Cross-Region Inference documentation.
Key Difference: Inference Profile ID vs Model ID
To use cross-region inference, you must use an Inference Profile ID instead of a regular model ID. Inference profiles define the foundation model and the AWS regions where requests can be routed.
Regular Model ID: anthropic.claude-3-5-sonnet-20240620-v1:0 (single region)
System-defined geographic profiles: Use geographic prefixes (us., eu., apac.) followed by the model ID (e.g., us.anthropic.claude-3-5-sonnet-20240620-v1:0). The prefix indicates routing within that geography.
Custom inference profiles: Use full ARN format (e.g., arn:aws:bedrock:us-east-1:123456789012:inference-profile/my-profile)
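The three identifier formats can be told apart mechanically, which is handy for a quick sanity check before adding a model. The sketch below is illustrative only: the function name is hypothetical, and the prefix list contains just the geographic prefixes named above (AWS may offer others, so extend the tuple as needed).

```python
# Geographic prefixes mentioned in this guide; extend if AWS adds more.
GEO_PREFIXES = ("us.", "eu.", "apac.")


def classify_model_identifier(identifier: str) -> str:
    """Classify a Bedrock identifier by its format alone."""
    if identifier.startswith("arn:aws:bedrock:") and ":inference-profile/" in identifier:
        return "custom inference profile"
    if identifier.startswith(GEO_PREFIXES):
        return "geographic inference profile"
    return "regular model ID"


examples = [
    "anthropic.claude-3-5-sonnet-20240620-v1:0",
    "us.anthropic.claude-3-5-sonnet-20240620-v1:0",
    "arn:aws:bedrock:us-east-1:123456789012:inference-profile/my-profile",
]
for example in examples:
    print(f"{example} -> {classify_model_identifier(example)}")
```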
Important: Some models in AWS Bedrock are exclusively accessible through cross-region inference profiles and cannot be invoked directly using their standard foundation model IDs. For these models, you must use the inference profile ID (e.g., us.anthropic.claude-3-7-sonnet-20250219-v1:0) instead of the regular model ID.
To identify which models require inference profiles, refer to the Supported Regions and models for inference profiles, which provides a complete list of models and their inference profile availability.
How to Use It
1. Add the inference profile ID as a model: When adding a model in TrueFoundry, use the inference profile ID (e.g., us.anthropic.claude-3-5-sonnet-20240620-v1:0) instead of the regular model ID. If it’s not in the dropdown, use + Add Model and enter it manually.
2. Configure IAM permissions for ALL destination regions: This is critical. When Bedrock routes to a different region, your IAM role/access key must have permissions in that region. You must grant permissions for both the inference profile and the foundation model in all destination regions.
Update your IAM policy to use * for the region to allow access across all regions:
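A sketch of such a policy is built below as a Python dictionary and printed as JSON. The action list and the exact resource ARNs are assumptions rather than the policy published by TrueFoundry, and the function name is hypothetical; YOUR-AWS-ACCOUNT-ID is a placeholder.

```python
import json


def build_cross_region_policy(account_id: str) -> dict:
    """Return an illustrative IAM policy that uses * in the region position."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "CrossRegionBedrockInvoke",
                "Effect": "Allow",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                    "bedrock:GetInferenceProfile",
                ],
                "Resource": [
                    # Foundation models are AWS-owned, so the account field is empty.
                    "arn:aws:bedrock:*::foundation-model/*",
                    # Inference profiles live in your account.
                    # Custom profiles may use a different resource type in their ARN;
                    # check the ARN of your profile and add it if needed.
                    f"arn:aws:bedrock:*:{account_id}:inference-profile/*",
                ],
            }
        ],
    }


policy = build_cross_region_policy("YOUR-AWS-ACCOUNT-ID")
print(json.dumps(policy, indent=2))
```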
Replace YOUR-AWS-ACCOUNT-ID in the policy above with your actual AWS account ID. The * in the region position allows access across all regions.
Most Common Mistake: Users grant permissions only in their default region. If Bedrock routes to a different region without permissions, requests will fail with “Access Denied”. Always grant permissions in ALL potential destination regions. For geographic profiles, ensure permissions in both source and destination regions.
3. Check Service Control Policies (SCPs): If your organization uses SCPs to restrict region access, ensure they allow access to all destination regions in your inference profile. Blocking any destination region will prevent cross-region inference from working.
Troubleshooting
Request fails with 'Access Denied' error
Cause: Missing IAM permissions in the destination region where Bedrock routed the request.
Solution:
Ensure your IAM policy grants Bedrock permissions across all regions (use * in the region part of the ARN)
For geographic profiles, grant permissions in both source and destination regions
Check if Service Control Policies (SCPs) are blocking access to certain regions
Requests always go to the same region
Cause: You’re using a regular model ID instead of an inference profile ID.
Solution: Use the inference profile ID format (e.g., us.anthropic.claude-3-5-sonnet-20240620-v1:0) instead of the regular model ID.