Truefoundry is a cloud-agnostic platform for building, deploying and monitoring AI applications while enabling complete governance and security within an organization. It provides two key modules: AI Engineering and AI Gateway.
All the components in Truefoundry are modular, and you can use only the ones you need. For example, if you don’t need the AI Engineering module, you can use just the AI Gateway module.

AI Engineering

The AI Engineering module enables data scientists to deploy their models, agents and workflows on your own infrastructure while providing a single place to manage all AI assets. It abstracts away the underlying infrastructure to enable rapid experimentation and deployment, while ensuring adherence to the guardrails and principles set by the organization.
Truefoundry does not provide compute: you bring your own cloud account or on-prem hardware, and Truefoundry connects to it so you can deploy your models, agents and workflows. All models and artifacts are also stored on your own storage.

Jupyter Notebooks / Remote SSH

Start Jupyter Notebooks or connect your existing IDE to remote compute on any cloud/on-prem hardware including GPUs.

Train Models / Batch Inference

Use the Jobs feature in Truefoundry to run training or batch inference jobs either manually or on a schedule.
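
A minimal sketch of a scheduled training job, assuming the Truefoundry Python SDK’s `Job`/`Schedule` interface; the script name, resource values and workspace FQN below are placeholders:

```python
from truefoundry.deploy import Build, Job, PythonBuild, Resources, Schedule

job = Job(
    name="nightly-training",
    image=Build(
        build_spec=PythonBuild(
            command="python train.py",            # placeholder training script
            requirements_path="requirements.txt",
        )
    ),
    resources=Resources(cpu_request=1, cpu_limit=2, memory_request=2000, memory_limit=4000),
    trigger=Schedule(schedule="0 2 * * *"),       # cron: run every night at 02:00
)
job.deploy(workspace_fqn="my-cluster:my-workspace")  # placeholder workspace FQN
```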

Model Registry

Store and version your models and artifacts.
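
For illustration, a hedged sketch of logging a model version, assuming the Truefoundry ML client exposes `get_client`/`log_model`; the ML repo, model name and file are placeholders:

```python
from truefoundry.ml import get_client

client = get_client()  # assumes you are already logged in to your Truefoundry host
model_version = client.log_model(
    ml_repo="demo-models",              # placeholder ML repo
    name="churn-classifier",
    model_file_or_folder="model.joblib",
    framework="sklearn",
)
print(model_version.fqn)  # fully-qualified name used to fetch this exact version later
```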

Model Inference

Deploy your models in any framework (Transformers, PyTorch, TensorFlow, scikit-learn, XGBoost, etc.) as real-time APIs.
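
As a framework-agnostic illustration, a minimal FastAPI wrapper that serves a model as a real-time API; the model file and feature schema are placeholders, and the `predict` call follows the scikit-learn convention:

```python
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
model = joblib.load("model.joblib")  # swap in any framework's load call

class PredictRequest(BaseModel):
    features: list[float]

@app.post("/predict")
def predict(req: PredictRequest):
    prediction = model.predict([req.features])  # sklearn-style predict
    return {"prediction": prediction.tolist()}
```

Run locally with `uvicorn main:app` and POST a JSON body like `{"features": [1.0, 2.0]}` to `/predict`.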

Workflows

Deploy and monitor complex ML pipelines.

Service Deployment

Deploy any service (REST, gRPC, etc.) or your Streamlit, Gradio, Flask or FastAPI applications.

LLM Deployment

Deploy LLMs from Hugging Face or your own model registry using vLLM, SGLang or TensorRT-LLM with low latency, high throughput and faster autoscaling.
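
Servers like vLLM and SGLang expose an OpenAI-compatible API once deployed; below is a sketch of querying such a deployment. The endpoint URL, token and model name are placeholders for your own deployment:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://llama-8b.example.com/v1",  # placeholder endpoint for your deployment
    api_key="<token>",
)
response = client.chat.completions.create(
    model="meta-llama/Meta-Llama-3-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "Say hello in one sentence."}],
)
print(response.choices[0].message.content)
```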

LLM Finetuning

Fine-tune LLMs from the model catalogue or any Hugging Face model on your own data.

Async Inference

Deploy async inference services backed by a queue of your choice to handle inference workloads that tolerate higher latency.
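
The pattern, sketched with an in-process queue standing in for SQS, Kafka or whichever queue you choose: requests are enqueued and acknowledged immediately, and a worker drains the queue at its own pace:

```python
import queue
import threading
import time

requests_q: "queue.Queue[dict]" = queue.Queue()

def worker() -> None:
    while True:
        job = requests_q.get()
        time.sleep(1)  # stands in for a slow model forward pass
        print(f"finished job {job['id']}")
        requests_q.task_done()

threading.Thread(target=worker, daemon=True).start()
for i in range(3):
    requests_q.put({"id": i, "payload": "..."})  # enqueue returns immediately
requests_q.join()  # callers poll for results or receive a callback instead of blocking
```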

AI Gateway

The AI Gateway provides a single interface to access all the LLMs and AI models, MCP servers and agents within an organization. Access control, key management, governance and monitoring are built in, so developers can focus on building great applications without worrying about the underlying models, keys and observability, while platform teams can impose rate and budget limits, access control, audits and security guardrails.

LLM Gateway

Call 1000+ LLMs using a single API.
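
A sketch of switching providers behind one API, assuming an OpenAI-compatible gateway endpoint; the base URL, token and provider-prefixed model names are placeholders:

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://gateway.example.com/api/llm/v1",  # placeholder gateway endpoint
    api_key="<gateway-token>",
)
# Same client, same call shape — only the model identifier changes per provider.
for model in ["openai-main/gpt-4o", "anthropic-main/claude-3-5-sonnet"]:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Hello!"}],
    )
    print(model, "->", resp.choices[0].message.content)
```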

MCP Registry

Deploy MCP servers from the model catalogue.

MCP Gateway

Access all registered MCP servers through a single endpoint with built-in authentication and access control.

Prompt Management

Create, store and version prompts and use them via the Gateway.
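
To illustrate the idea of versioned prompts (a generic, hypothetical in-memory registry, not the Gateway’s API):

```python
from string import Template

# Hypothetical registry: (prompt name, version) -> template.
PROMPTS = {
    ("summarize", "v1"): Template("Summarize the following text:\n$text"),
    ("summarize", "v2"): Template("Summarize in $n bullet points:\n$text"),
}

def render(name: str, version: str, **kwargs: str) -> str:
    """Fetch a specific prompt version and fill in its variables."""
    return PROMPTS[(name, version)].substitute(**kwargs)

print(render("summarize", "v2", n="3", text="The quick brown fox..."))
```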

Tracing and Observability

Trace and monitor all requests across LLMs and MCP servers going through the Gateway.

Agent Gateway

Register all agents in the Gateway and call them via a single endpoint (coming soon).