This section explains the steps to add Google Vertex AI models and configure the required access controls.
Step 1: Navigate to Google Vertex Models in AI Gateway
From the TrueFoundry dashboard, navigate to AI Gateway > Models and select Google Vertex.
Step 2: Add Google Vertex Account and Authentication
Give your Google Vertex account a unique name; this name is used to refer to its models later. Add collaborators to give other users or teams access to the account. Learn more about access control here.
Get Google Vertex Authentication Details
Required IAM Role
The Google Cloud identity used by the gateway (a service account, whether referenced by a key, by GKE Workload Identity, or by Workload Identity Federation) must have the Agent Platform User role (roles/aiplatform.user, formerly Vertex AI User), which includes the aiplatform.endpoints.predict permission required by the gateway.
The gateway supports three authentication methods. Pick the one that matches your deployment.
1. Using Service Account JSON Key
This method works for all deployment types (GKE, EKS, AKS, on-premises, or the SaaS Gateway).
Generate a Service Account JSON key by following the official Google Cloud documentation here.
The service account must have the Agent Platform User role (formerly Vertex AI User).
When adding the provider account in TrueFoundry, select Service account key file as the authentication type and paste the JSON key into the Service account key JSON field (or store it as a secret and reference it).
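Before pasting a key into the Service account key JSON field, it can help to confirm that the file really is a service account key and to note which identity it belongs to. The following is a minimal sketch, not part of TrueFoundry: the function name summarize_service_account_key is hypothetical, and the check relies only on the standard fields Google writes into a key file (type, project_id, private_key, client_email).

```python
import json

# Fields that a Google service account key file is expected to contain.
REQUIRED_KEY_FIELDS = ("project_id", "private_key", "client_email")


def summarize_service_account_key(raw: str) -> dict:
    """Validate a service account key and return only its non-secret fields.

    Hypothetical helper for a local sanity check; it never prints or returns
    the private key itself.
    """
    data = json.loads(raw)
    if data.get("type") != "service_account":
        raise ValueError(
            f"expected type 'service_account', got {data.get('type')!r}"
        )
    missing = [field for field in REQUIRED_KEY_FIELDS if not data.get(field)]
    if missing:
        raise ValueError(f"key file is missing fields: {', '.join(missing)}")
    return {
        "project_id": data["project_id"],
        "client_email": data["client_email"],
    }
```

The returned client_email is the service account that needs the Agent Platform User role.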
2. Using Workload Identity Federation (Keyless, Cross-Cloud)
Workload Identity Federation (WIF) lets the gateway authenticate to Google Cloud without service account keys, even when running outside of GKE, for example on Amazon EKS, Azure AKS, or on-premises Kubernetes clusters. It works by exchanging a short-lived Kubernetes service account token for a Google Cloud access token through Google’s Security Token Service.
Workload Identity Federation is the recommended approach for production deployments running outside of GKE. It eliminates long-lived service account keys while supporting any Kubernetes environment, and it also works on the SaaS version of the AI Gateway.
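To make the token exchange concrete, the sketch below builds (but does not send) the kind of request that Google's Security Token Service accepts for an OAuth 2.0 token exchange. It illustrates the protocol only and is not the gateway's implementation. The function names are hypothetical; the audience value is assumed to follow the //iam.googleapis.com/projects/<PROJECT_NUMBER>/locations/global/workloadIdentityPools/<POOL_NAME>/providers/<PROVIDER_NAME> form.

```python
from urllib.parse import urlencode

# Public STS endpoint used for Workload Identity Federation token exchange.
STS_TOKEN_URL = "https://sts.googleapis.com/v1/token"


def build_sts_exchange_form(audience: str, ksa_token: str) -> dict:
    """Return the form fields for exchanging a Kubernetes token at STS.

    `audience` identifies the workload identity pool provider and
    `ksa_token` is the short-lived Kubernetes service account JWT.
    """
    return {
        "grant_type": "urn:ietf:params:oauth:grant-type:token-exchange",
        "audience": audience,
        "scope": "https://www.googleapis.com/auth/cloud-platform",
        "requested_token_type": "urn:ietf:params:oauth:token-type:access_token",
        "subject_token_type": "urn:ietf:params:oauth:token-type:jwt",
        "subject_token": ksa_token,
    }


def encode_sts_request(form: dict) -> bytes:
    """Encode the form as an application/x-www-form-urlencoded body."""
    return urlencode(form).encode("ascii")
```

In a typical external_account flow, the federated token returned by STS is then used to impersonate the Google Cloud service account, which is why that service account needs the Vertex role.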
This method requires a Google Cloud IAM service account that the federated identity can impersonate, with the Agent Platform User role (roles/aiplatform.user, formerly Vertex AI User) granted on the project.
The Kubernetes service account used by the gateway must have permission to issue TokenRequest resources for itself. The TrueFoundry-provided Helm chart configures this RBAC automatically.
Generate the credential configuration JSON
Use the gcloud CLI (gcloud iam workload-identity-pools create-cred-config) to generate the credential configuration file.
This produces a JSON file with "type": "external_account" describing the identity pool, audience, and STS token-exchange endpoints. It is not a private key.
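A quick local check can catch the common mistake of pasting a service account key where a federation config is expected. This is a minimal sketch under stated assumptions: check_wif_credential_config is a hypothetical helper, and the field names (audience, subject_token_type, token_url, service_account_impersonation_url) are those normally found in an external_account file generated by gcloud.

```python
import json
from typing import Optional


def check_wif_credential_config(raw: str) -> dict:
    """Validate an external_account credential configuration.

    Raises ValueError if the JSON looks like a private key or lacks the
    fields needed for the STS token exchange.
    """
    data = json.loads(raw)
    if data.get("type") != "external_account":
        raise ValueError(
            f"expected type 'external_account', got {data.get('type')!r}"
        )
    if "private_key" in data:
        # A federation config describes an exchange; it carries no key.
        raise ValueError("file contains a private key; this is not a WIF config")
    for field in ("audience", "subject_token_type", "token_url"):
        if not data.get(field):
            raise ValueError(f"credential config is missing '{field}'")
    impersonation: Optional[str] = data.get("service_account_impersonation_url")
    return {
        "audience": data["audience"],
        "token_url": data["token_url"],
        "impersonation_url": impersonation,
    }
```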
Configure in TrueFoundry
When adding or editing the Vertex AI provider account:
Select Workload Identity Federation file as the authentication type.
Paste the contents of the generated credential-config.json into the Key file content field, or store it as a secret and reference it.
Resumable file uploads (used for some batch and fine-tuning workflows that upload files to Google Cloud Storage via signed URLs) are not yet supported with Workload Identity Federation. If you rely on those flows, use a Service Account JSON key instead.
3. Using GCP Workload Identity on GKE (Self-Hosted Gateway only)
When running the gateway inside Google Kubernetes Engine (GKE), you can rely on GKE’s built-in Workload Identity, which lets a Kubernetes service account (KSA) act as a Google Cloud IAM service account (GSA) automatically through the GKE metadata server.
GKE Workload Identity is GKE-specific. Pods using the configured KSA authenticate as the associated GSA when accessing Google Cloud APIs, with no extra configuration on the gateway side.
To set up GKE Workload Identity, follow the official Google Cloud documentation: Configure Workload Identity on GKE.
When adding the Vertex AI provider account in TrueFoundry, leave the authentication section empty. The gateway will automatically pick up GKE Workload Identity credentials via Application Default Credentials (ADC).
GCP Workload Identity (GKE ADC) does not work on the SaaS version of the Gateway, and it only works when the gateway runs inside a GKE cluster. For all other environments, use Workload Identity Federation or a Service Account JSON key.
Step 3: Configure Project ID and Region
Provide your Google Cloud Project ID and a default Region for all models under this account. You can override the region for individual models later.
Project ID
You can find your Project ID in the top-right corner of your Google Cloud Console.
Region
Specify a default region for all models under this account. You can override this region for individual models later.
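The override rule can be summarized in a few lines. This is an illustrative sketch only: VertexModel and effective_region are hypothetical names, and the region values in the usage note are examples rather than recommendations.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VertexModel:
    """A model entry under a provider account (hypothetical shape)."""

    model_id: str
    # Per-model override; None means "use the account's default region".
    region: Optional[str] = None


def effective_region(account_default_region: str, model: VertexModel) -> str:
    """Return the model's own region if set, else the account default."""
    return model.region or account_default_region
```

For example, with an account default of us-central1, a model declared as VertexModel("google/gemini-1.5-pro") resolves to us-central1, while the same model with region="europe-west4" resolves to europe-west4.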
Step 4: Add Models
You can either select available models from the list or add them manually by clicking + Add Model. When adding a model manually, the Model ID format depends on the provider.
Adding Google (Gemini) Models
Select a Gemini model from the list or add it manually.
Model ID Format: google/<vertex-model-id>
Example: google/gemini-1.5-pro
You can find the Model ID in the Google Cloud Console.
Adding Anthropic Models
Select a Claude model from the list or add it manually.
Model ID Format: anthropic/<vertex-model-id>
Example: anthropic/claude-3-5-sonnet-v2@20241022
Adding Mistral AI Models
Select a Mistral model from the list or add it manually.
Model ID Format: mistralai/<vertex-model-id>
Example: mistralai/mistral-large-2411@001
When adding any model manually, you can specify a Region to override the default one set at the account level.
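The three Model ID formats above share one shape: a provider prefix, a slash, and the Vertex model ID. A small validator, sketched below with the hypothetical name split_model_id, can catch a missing or misspelled prefix before you add a model manually. The set of prefixes is taken from this page and may not be exhaustive.

```python
from typing import Tuple

# Provider prefixes documented on this page.
KNOWN_PREFIXES = ("google", "anthropic", "mistralai")


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split '<prefix>/<vertex-model-id>' and validate the prefix."""
    prefix, sep, vertex_model_id = model_id.partition("/")
    if not sep or not vertex_model_id:
        raise ValueError(
            f"{model_id!r} is not in the form '<provider>/<vertex-model-id>'"
        )
    if prefix not in KNOWN_PREFIXES:
        raise ValueError(
            f"unknown provider prefix {prefix!r}; expected one of {KNOWN_PREFIXES}"
        )
    return prefix, vertex_model_id
```

Because only the first slash is treated as the separator, version suffixes such as @20241022 or @001 stay part of the Vertex model ID.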
Do I need to add multiple provider accounts for different regions?
No. You can set a default region at the account level and override it for each individual model if needed. This allows you to use models from different regions with a single provider account.
Which authentication method should I choose?
Service Account JSON Key — Works everywhere (any cloud, on-prem, SaaS Gateway). Simplest to set up, but requires you to manage and rotate a long-lived secret.
Workload Identity Federation — Recommended for production. Keyless, works on any Kubernetes cluster (EKS, AKS, GKE, on-prem) and on the SaaS Gateway. Requires a one-time setup of a Workload Identity Pool in Google Cloud.
GCP Workload Identity (GKE) — Only available when the self-hosted gateway runs inside a GKE cluster. Keyless and zero-config on the gateway side, but does not work on the SaaS Gateway or outside of GKE.
| Capability | Service Account Key | Workload Identity Federation | GCP Workload Identity (GKE) |
| --- | --- | --- | --- |
| Works on GKE | Yes | Yes | Yes |
| Works on EKS / AKS / on-prem | Yes | Yes | No |
| Works on SaaS Gateway | Yes | Yes | No |
| Key management required | Yes | No | No |
| Requires credential JSON in TrueFoundry | Yes (service account key) | Yes (external_account config) | No (leave empty) |
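The comparison above can also be expressed as a lookup, which is handy if you template provider accounts for several clusters. This is a sketch that encodes the table as written; the environment labels ("gke", "eks", "aks", "on-prem", "saas") and the names AuthMethod and supported_methods are hypothetical.

```python
from enum import Enum
from typing import List


class AuthMethod(Enum):
    SERVICE_ACCOUNT_KEY = "Service Account Key"
    WORKLOAD_IDENTITY_FEDERATION = "Workload Identity Federation"
    GKE_WORKLOAD_IDENTITY = "GCP Workload Identity (GKE)"


# Methods that need no long-lived key ("Key management required: No").
KEYLESS = {
    AuthMethod.WORKLOAD_IDENTITY_FEDERATION,
    AuthMethod.GKE_WORKLOAD_IDENTITY,
}

_EVERYWHERE = [
    AuthMethod.SERVICE_ACCOUNT_KEY,
    AuthMethod.WORKLOAD_IDENTITY_FEDERATION,
]

# Where the gateway runs -> methods the table marks as working there.
_SUPPORTED = {
    "gke": _EVERYWHERE + [AuthMethod.GKE_WORKLOAD_IDENTITY],
    "eks": _EVERYWHERE,
    "aks": _EVERYWHERE,
    "on-prem": _EVERYWHERE,
    "saas": _EVERYWHERE,
}


def supported_methods(environment: str, keyless_only: bool = False) -> List[AuthMethod]:
    """List the authentication methods available in an environment."""
    try:
        methods = _SUPPORTED[environment]
    except KeyError:
        raise ValueError(f"unknown environment {environment!r}") from None
    return [m for m in methods if not keyless_only or m in KEYLESS]
```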
What is the difference between GCP Workload Identity and Workload Identity Federation?
Both are keyless authentication mechanisms, but they target different environments.
GCP Workload Identity is a GKE-only feature. The GKE metadata server automatically maps a Kubernetes service account to a Google Cloud IAM service account. The gateway picks this up through Application Default Credentials (ADC) when no auth data is configured. It does not work on the SaaS Gateway or outside of GKE.
Workload Identity Federation is a broader Google Cloud feature that works across any Kubernetes cluster (EKS, AKS, on-prem, and GKE) and on the SaaS Gateway. It requires you to provide an external_account credential configuration JSON (generated via gcloud iam workload-identity-pools create-cred-config). The gateway exchanges a short-lived Kubernetes service account token for a Google Cloud access token through Google’s Security Token Service.
How do I set up Workload Identity Federation for an EKS cluster? (Step-by-step example)
This example walks through the full setup of Workload Identity Federation to let a TrueFoundry service account running on Amazon EKS authenticate to Google Cloud. Replace the pool names, project IDs, OIDC issuer URI, namespaces, and service account names with your own values.
Step 1: Create a Workload Identity Pool
gcloud iam workload-identity-pools create <POOL_NAME> \
  --location="global" \
  --description="Workload identity pool for <YOUR_CLUSTER>" \
  --display-name="<YOUR_CLUSTER>"
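Step 2: Create a Workload Identity Provider
Create an OIDC provider inside the pool (the gcloud iam workload-identity-pools providers create-oidc command accepts the flags described here). The --issuer-uri must be the OIDC issuer URL of your EKS cluster. You can find it in the AWS EKS console or via aws eks describe-cluster. The --attribute-condition restricts which Kubernetes service accounts can use this provider.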
Step 3: Create a Google Cloud Service Account
gcloud iam service-accounts create <GSA_NAME> \
  --project="<GCP_PROJECT_ID>" \
  --display-name="<GSA_DISPLAY_NAME>"
Step 4: Grant the Service Account the Required Role
Grant the Agent Platform User role (roles/aiplatform.user, formerly Vertex AI User), or whichever role your workload needs, to the service account.
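If you script this setup, the role grant can be assembled as a gcloud projects add-iam-policy-binding invocation. The sketch below only builds the argument list and renders it as a shell command; it does not run anything. The function name grant_vertex_role_command is hypothetical, and a project-level binding is an assumption based on the role being "granted on the project".

```python
import shlex
from typing import List

# Role named on this page as required by the gateway.
VERTEX_ROLE = "roles/aiplatform.user"


def grant_vertex_role_command(project_id: str, gsa_email: str) -> List[str]:
    """Build the gcloud arguments that bind the Vertex role to a service account."""
    return [
        "gcloud",
        "projects",
        "add-iam-policy-binding",
        project_id,
        f"--member=serviceAccount:{gsa_email}",
        f"--role={VERTEX_ROLE}",
    ]


def render(command: List[str]) -> str:
    """Render an argument list as a copy-pasteable shell command."""
    return shlex.join(command)
```

Passing the list to subprocess.run avoids shell quoting issues; render is only for review or logging.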
Step 5: Allow the Federated Identity to Impersonate the Service Account
Grant the roles/iam.workloadIdentityUser role so the Kubernetes service account (via the workload identity pool) can impersonate the Google Cloud service account:
gcloud iam service-accounts add-iam-policy-binding <GSA_EMAIL> \
  --member="principal://iam.googleapis.com/projects/<PROJECT_NUMBER>/locations/global/workloadIdentityPools/<POOL_NAME>/subject/system:serviceaccount:<NAMESPACE>:<KSA_NAME>" \
  --role="roles/iam.workloadIdentityUser"
Optionally, to allow all service accounts in a namespace (instead of a single one), use principalSet:
gcloud iam service-accounts add-iam-policy-binding <GSA_EMAIL> \
  --member="principalSet://iam.googleapis.com/projects/<PROJECT_NUMBER>/locations/global/workloadIdentityPools/<POOL_NAME>/attribute.namespace/<NAMESPACE>" \
  --role="roles/iam.workloadIdentityUser"
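The --member values in the two commands above are long and easy to mistype. The following sketch composes them from their parts, mirroring the formats shown in this step. The function names are hypothetical, and the example values in the usage note are placeholders.

```python
def _pool_path(project_number: str, pool_name: str) -> str:
    """Resource path of a workload identity pool in the global location."""
    return (
        f"iam.googleapis.com/projects/{project_number}"
        f"/locations/global/workloadIdentityPools/{pool_name}"
    )


def ksa_principal(project_number: str, pool_name: str, namespace: str, ksa_name: str) -> str:
    """Member string for one Kubernetes service account (principal://...)."""
    subject = f"system:serviceaccount:{namespace}:{ksa_name}"
    return f"principal://{_pool_path(project_number, pool_name)}/subject/{subject}"


def namespace_principal_set(project_number: str, pool_name: str, namespace: str) -> str:
    """Member string for every service account in a namespace (principalSet://...)."""
    return (
        f"principalSet://{_pool_path(project_number, pool_name)}"
        f"/attribute.namespace/{namespace}"
    )
```

For instance, ksa_principal("123456789012", "eks-pool", "truefoundry", "gateway") yields a member scoped to that single service account, while namespace_principal_set covers the whole truefoundry namespace. Note that the path uses the project number, not the project ID.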
Step 6: Generate the Credential Configuration File
This is the file you will paste into TrueFoundry when configuring the Vertex AI provider account.
The generated credential-configuration.json file is what you provide in TrueFoundry under Workload Identity Federation file when adding the Vertex AI provider account.
When should I use Gemini vs Vertex AI? What's the difference?
The Gemini API is generally recommended for individual developers and prototyping use cases, while Vertex AI is recommended for production and enterprise use cases.
Vertex AI offers everything available in the Gemini API and more, including:
More secure auth using service accounts instead of API keys
A Model Garden that includes multiple third-party models