Truefoundry provides different deployment options depending on your cloud provider, existing infrastructure, and the components that you want to deploy. Before comparing the deployment options, it’s important to read about the overall architecture: control-plane architecture, compute-plane architecture, and gateway-plane architecture.

Deployment Options

| Deployment Scenario | Modules | Pricing Tier | Remarks | Cost of Hosting |
| --- | --- | --- | --- | --- |
| 1. AI Gateway SAAS only | Gateway | Starter | Use managed AI Gateway for LLM requests, no compute plane or control plane hosting needed. | 0 |
| 2. Gateway Plane only | Gateway | Enterprise | Host the Gateway plane for LLM requests and use control-plane hosted by Truefoundry. | ~$600/month |
| 3. Control Plane + Gateway Plane | AI Gateway | Enterprise | Use only AI Gateway, full control, suitable if you don’t need compute plane/app deployment. | ~$800-1000/month |
| 4. Compute Plane + AI Gateway SAAS | AI Deployment, AI Gateway | Pro | Deploy models/services (Compute Plane) on your own infrastructure while using managed AI Gateway. | Fixed cost of ~$200/month. Cost scales as compute plane scales. |
| 5. Compute Plane only | AI Deployment | Pro | Host the compute plane for model/services deployment, control-plane hosted by Truefoundry. | Fixed cost of ~$200/month. Cost scales as compute plane scales. |
| 6. Compute Plane + Control Plane | AI Deployment | Enterprise | Complete self-hosted for AI Deployment, full control over infra, no Gateway product. | Fixed cost ~$600/month. Cost scales as compute plane scales. |
| 7. Control Plane + Gateway Plane + Compute Plane | AI Deployment, AI Gateway | Enterprise | Full flexibility, full control; self-host the entire platform. | Fixed cost ~$1200/month. Cost scales as the compute plane scales and requests through gateway scales. |
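As a rough illustration of how the table can be read, the sketch below encodes the seven scenarios as data and looks one up from the planes you self-host and the modules you need. The names `Scenario`, `SCENARIOS` and `find_scenario` are hypothetical helpers for this example and are not part of any Truefoundry tooling. It also assumes that "Gateway" in rows 1 and 2 refers to the AI Gateway module.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    number: int
    name: str
    self_hosted: frozenset  # planes you run on your own infrastructure
    modules: frozenset      # product modules included in the scenario
    tier: str               # pricing tier from the table


# Hypothetical encoding of the deployment options table.
SCENARIOS = [
    Scenario(1, "AI Gateway SAAS only", frozenset(),
             frozenset({"AI Gateway"}), "Starter"),
    Scenario(2, "Gateway Plane only", frozenset({"gateway"}),
             frozenset({"AI Gateway"}), "Enterprise"),
    Scenario(3, "Control Plane + Gateway Plane",
             frozenset({"control", "gateway"}),
             frozenset({"AI Gateway"}), "Enterprise"),
    Scenario(4, "Compute Plane + AI Gateway SAAS", frozenset({"compute"}),
             frozenset({"AI Deployment", "AI Gateway"}), "Pro"),
    Scenario(5, "Compute Plane only", frozenset({"compute"}),
             frozenset({"AI Deployment"}), "Pro"),
    Scenario(6, "Compute Plane + Control Plane",
             frozenset({"compute", "control"}),
             frozenset({"AI Deployment"}), "Enterprise"),
    Scenario(7, "Control Plane + Gateway Plane + Compute Plane",
             frozenset({"control", "gateway", "compute"}),
             frozenset({"AI Deployment", "AI Gateway"}), "Enterprise"),
]


def find_scenario(self_hosted, modules):
    """Return the scenario matching the self-hosted planes and modules, or None."""
    wanted = (frozenset(self_hosted), frozenset(modules))
    for scenario in SCENARIOS:
        if (scenario.self_hosted, scenario.modules) == wanted:
            return scenario
    return None
```

For example, `find_scenario({"compute"}, {"AI Deployment"})` returns scenario 5, while a combination that the table does not list returns `None`.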
The deployment options listed above are described in more detail below.

Truefoundry Managed SAAS Gateway Only

This is a fully managed solution on Truefoundry’s secure cloud infrastructure with enterprise-grade features. You don’t need to deploy any infrastructure on your end. It includes only the AI Gateway module, not AI Deployment.

This is ideal for small, mid-size, or enterprise organizations that want to use the Truefoundry AI Gateway without the operational overhead of self-hosting.
The key features and advantages of this deployment option are:
  1. Globally distributed gateway to minimize latency: Truefoundry gateway is deployed in multiple regions of the world across multiple zones and multiple cloud providers to provide low latency and high availability. Learn more about our globally distributed infrastructure.
  2. Zero maintenance overhead: There is no infrastructure to maintain, and you get access to the latest features and improvements.
  3. Data is encrypted at rest and in transit.
  4. Truefoundry infrastructure is SOC2, ISO27001, GDPR, and HIPAA compliant.

Gateway Plane and Data Storage on your own infrastructure

In this option, the gateway plane is deployed on your own infrastructure. The control-plane is hosted by Truefoundry, and the AI Deployment module is not included. The gateway exports the request-response data to the ingestion server, which then stores the data in your own blob storage. The control-plane stores the metrics and has access to the bucket containing the request-response data.

The key features of this deployment mode are:
  1. LLM traffic stays within your own premises: All LLM traffic stays within your own infrastructure, and Truefoundry doesn’t come into the live path of a request to an LLM.
  2. You retain full control over your data: You retain full ownership of the request-response data since it is stored in a bucket on your end. The data is stored in Parquet format, so you can use it for analytics, debugging and evaluation via Spark, DuckDB, Athena or any tool of your choice.
  3. Management of gateway and ingestion service: The availability of the gateway and ingestion service needs to be managed on your end.
  4. Truefoundry control plane has access to the bucket containing the data: This access lets you browse the request logs on the Truefoundry dashboard. The control-plane also needs to access the logs to compact them and create indexes on them, which makes them faster to query.
You will not be able to use this feature if you don’t give the Truefoundry control plane access to your bucket.
When you browse the request logs in the Truefoundry dashboard, the data is fetched from your blob storage, so you might incur egress charges from your cloud provider. The data might be cached temporarily in the control-plane for faster queries. There are also egress charges when the control-plane compacts the logs and creates indexes on them.

Control Plane, Gateway Plane and Data Storage on your own infrastructure

Host both the gateway plane and the control plane on your own infrastructure. In this case, everything except the authentication server and analytics server is hosted on your own infrastructure.

The only data sent to the authentication/licensing server are the emails of the employees using the platform and the count of requests flowing through the gateway. This helps us keep track of licenses and billing. To understand how SSO works with our central authentication server, refer to this page.

Compute Plane + AI Gateway SAAS

Deploy models/services (Compute Plane) on your own infrastructure while using the managed AI Gateway.

Compute Plane only

Host the compute plane for model/services deployment, with the control-plane hosted by Truefoundry. This doesn’t include the AI Gateway product.

Compute Plane + Control Plane

A completely self-hosted setup for AI Deployment, with full control over the infrastructure. This doesn’t include the AI Gateway product.

Control Plane + Gateway Plane + Compute Plane

Self-host the control-plane, gateway plane and compute plane, all in your own infrastructure.

Understanding the Installation Process

Truefoundry software ships as a combination of Terraform code and helm charts. Your exact deployment process will depend on your current state of infrastructure and the modules you want to deploy. Here’s a brief overview of how the compute-plane, gateway-plane and control-plane are deployed.

Compute Plane

This comprises a Kubernetes cluster and a few add-ons on the cluster, as described here. If you don’t have a Kubernetes cluster, Truefoundry can also provide the Terraform code to provision one on AWS, GCP or Azure. The Terraform code brings up the cluster, installs the add-ons on the cluster, and creates roles to grant the control-plane permission to DockerRegistry, SecretsManager, BlobStorage and the Kubernetes cluster itself. Each of the addons is installed as an Argo CD application on the Kubernetes cluster. You can find the Argo CD application file for each of the addons below:
| Addon | Status | Configuration File(s) |
| --- | --- | --- |
| ArgoCD | Essential | argocd.yaml |
| Prometheus | Essential | prometheus.yaml |
| TFY Agent | Essential | tfy-agent.yaml |
| Istio | Optional | istio-base.yaml, istio-discovery.yaml, tfy-istio-ingress.yaml |
| ArgoRollouts | Optional | argo-rollouts.yaml |
| ArgoWorkflows | Optional | argo-workflows.yaml |
| Keda | Optional | keda.yaml |
| TFY Logs | Optional | tfy-logs.yaml |
| GPU Operator | Optional | AWS, GCP, Azure, tfy-gpu-operator.yaml |
| Grafana | Optional | grafana.yaml |
| Karpenter | Essential (AWS Only) | karpenter.yaml, karpenter-config.yaml |
| Metrics-Server | Essential (AWS Only) | metrics-server.yaml |
| AWS EBS CSI Driver | Essential (AWS Only) | aws-ebs-csi-driver.yaml |
| AWS EFS CSI Driver | Optional (AWS Only) | aws-efs-csi-driver.yaml |
| AWS Load Balancer Controller | Essential (AWS Only) | aws-load-balancer-controller.yaml |
| TFY Inferentia Operator | Optional (AWS Only) | tfy-inferentia-operator.yaml |
| Cert-Manager | Optional | cert-manager.yaml |
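The addon table above can be read as a simple filter on two attributes: whether an addon is essential, and whether it applies to AWS only. The following is a minimal sketch of that reading. `ADDONS` and `addons_for` are hypothetical names introduced for this example, and the provider strings ("aws", "gcp", "azure") are assumptions rather than identifiers used by Truefoundry.

```python
# (addon name, is_essential, aws_only), transcribed from the addon table.
ADDONS = [
    ("ArgoCD", True, False),
    ("Prometheus", True, False),
    ("TFY Agent", True, False),
    ("Istio", False, False),
    ("ArgoRollouts", False, False),
    ("ArgoWorkflows", False, False),
    ("Keda", False, False),
    ("TFY Logs", False, False),
    ("GPU Operator", False, False),
    ("Grafana", False, False),
    ("Karpenter", True, True),
    ("Metrics-Server", True, True),
    ("AWS EBS CSI Driver", True, True),
    ("AWS EFS CSI Driver", False, True),
    ("AWS Load Balancer Controller", True, True),
    ("TFY Inferentia Operator", False, True),
    ("Cert-Manager", False, False),
]


def addons_for(provider, include_optional=False):
    """List the addons that apply to a provider, in table order.

    Essential addons are always included; optional ones only on request.
    Addons marked "AWS Only" are skipped for every other provider.
    """
    is_aws = provider.lower() == "aws"
    return [
        name
        for name, essential, aws_only in ADDONS
        if (essential or include_optional) and (is_aws or not aws_only)
    ]
```

Calling `addons_for("gcp")` yields only the three provider-independent essentials, while `addons_for("aws")` also includes the AWS-specific essentials such as Karpenter.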
Once the addons are installed on the cluster, management of the addons (upgrades, value changes) is done from the Truefoundry dashboard itself, and Terraform is not required. Terraform is only used to upgrade the cluster or make other changes to the infrastructure.
Figure: Compute Plane Addons Management
To make it easy to install all the addons in one go, Truefoundry provides an inframold helm chart for each cloud provider that contains the configuration of all the addons in one repository.
  1. AWS Inframold Helm Chart: tfy-k8s-aws-eks-inframold
  2. GCP Inframold Helm Chart: tfy-k8s-gcp-gke-standard-inframold
  3. Azure Inframold Helm Chart: tfy-k8s-azure-aks-inframold
  4. Generic Inframold Helm Chart: tfy-k8s-generic-inframold

Gateway Plane

The gateway plane ships as a single helm-chart that can be deployed on any Kubernetes cluster. The gateway has no external dependencies and only needs to be able to connect to the control-plane via a secure WebSocket connection. The gateway plane is also stateless and has no database or storage attached to it. The Truefoundry gateway helm chart is available in this GitHub repository: tfy-llm-gateway.

Control Plane

The control-plane ships as a single helm-chart that can be deployed on any Kubernetes cluster. It also requires a PostgreSQL database and a connection to a blob storage to store its data. You can either bring your own Kubernetes cluster, PostgreSQL database and blob storage, or Truefoundry can help provision them using our Terraform code. The control-plane helm chart is available in this GitHub repository: truefoundry. The control-plane helm chart includes the gateway helm chart as a dependency to make it easier to install both the control-plane and the gateway in one go.

Overview of Helm Charts in Truefoundry

The key helm-charts in Truefoundry, their composition and their usage are as follows:
| Helm Chart | Component | Description |
| --- | --- | --- |
| truefoundry | Control-Plane + Gateway (optionally) | The control-plane helm chart. You only need this chart if you are self-hosting the control-plane. |
| tfy-llm-gateway | Gateway Only | The gateway helm chart. You only need to install this if you are self-hosting the gateway. |
| tfy-k8s-aws-eks-inframold | AWS Compute-plane + Control-Plane (optionally) | The AWS Inframold helm chart that contains all the addons in compute-plane and also the truefoundry control-plane. You can disable the control-plane installation if you are only installing the compute-plane. |
| tfy-k8s-gcp-gke-standard-inframold | GCP Compute-plane + Control-Plane (optionally) | The GCP Inframold helm chart that contains all the addons in compute-plane and also the truefoundry control-plane. You can disable the control-plane installation if you are only installing the compute-plane. |
| tfy-k8s-azure-aks-inframold | Azure Compute-plane + Control-Plane (optionally) | The Azure Inframold helm chart that contains all the addons in compute-plane and also the truefoundry control-plane. You can disable the control-plane installation if you are only installing the compute-plane. |
| tfy-k8s-generic-inframold | On-prem Compute-plane + Control-Plane (optionally) | The Generic Inframold helm chart that contains all the addons in compute-plane and also the truefoundry control-plane. You can disable the control-plane installation if you are only installing the compute-plane. |
We have a different inframold helm chart per cloud provider because the addons required in the compute-plane differ by cloud provider. For example, Karpenter and the Inferentia Operator are needed on AWS only. The tolerations and affinity settings also differ for each cloud provider.
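One way to read the chart table is as a mapping from the planes you self-host to the charts you install. The sketch below shows that reading under two assumptions that the text does not state explicitly: a self-hosted control-plane shares the cluster with the compute-plane when both are present (so the inframold chart installs it), and the provider keys "aws", "gcp", "azure" and "generic" are labels chosen for this example. `INFRAMOLD_CHARTS` and `charts_to_install` are hypothetical names.

```python
# Inframold chart per provider, as listed in the chart table.
INFRAMOLD_CHARTS = {
    "aws": "tfy-k8s-aws-eks-inframold",
    "gcp": "tfy-k8s-gcp-gke-standard-inframold",
    "azure": "tfy-k8s-azure-aks-inframold",
    "generic": "tfy-k8s-generic-inframold",
}


def charts_to_install(self_hosted, provider="generic"):
    """Return the helm charts for a set of self-hosted planes.

    self_hosted is any iterable of "compute", "control" and "gateway".
    """
    planes = set(self_hosted)
    charts = []
    if "compute" in planes:
        # The inframold chart installs the compute-plane addons and can
        # optionally install the control-plane as well (assumed: same cluster).
        charts.append(INFRAMOLD_CHARTS[provider])
    elif "control" in planes:
        # The control-plane chart pulls in the gateway chart as a dependency.
        charts.append("truefoundry")
    if "gateway" in planes and "control" not in planes:
        # A standalone gateway is needed only when no self-hosted
        # control-plane chart already brings it along.
        charts.append("tfy-llm-gateway")
    return charts
```

With these assumptions, self-hosting only the gateway maps to `tfy-llm-gateway`, and self-hosting the control-plane and gateway together maps to the single `truefoundry` chart. If your control-plane runs on a separate cluster from the compute-plane, the mapping would differ.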