This guide describes how to move a PyTorch model deployed via a SageMaker endpoint to TrueFoundry. To do this, we need to adapt the existing SageMaker inference script to the TrueFoundry platform.
Existing Code
A SageMaker deployment typically contains code in the form of the following file tree:
- inference.py - The inference handler that implements the SageMaker functions such as model_fn, input_fn, predict_fn and output_fn
- requirements.txt - Any additional Python packages needed by the inference handler
- Model artifacts - Generated model files (e.g. model.pth). These may reside in your S3 buckets.
- SageMaker deployment code (e.g. sagemaker_deploy.py) - Code that calls SageMaker to deploy the model as an endpoint
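To make the role of inference.py concrete, here is a minimal sketch of how the four handler functions fit together for one request. It uses only the standard library: DoublingModel is a hypothetical stand-in for a real PyTorch model, the JSON payload shape ("inputs" / "outputs") is an assumption, and handle_request is an illustrative helper that mimics what the serving layer does rather than part of any SageMaker API.

```python
import json


class DoublingModel:
    """Hypothetical stand-in for a trained model (a real handler would load model.pth)."""

    def __call__(self, values):
        return [2 * v for v in values]


def model_fn(model_dir):
    # Called once at startup; model_dir is where the model artifacts are unpacked.
    return DoublingModel()


def input_fn(request_body, content_type):
    # Deserialize the raw request payload into model input.
    if content_type != "application/json":
        raise ValueError(f"Unsupported content type: {content_type}")
    return json.loads(request_body)["inputs"]


def predict_fn(input_data, model):
    # Run inference on the deserialized input.
    return model(input_data)


def output_fn(prediction, accept):
    # Serialize the prediction into the response format the client asked for.
    if accept != "application/json":
        raise ValueError(f"Unsupported accept type: {accept}")
    return json.dumps({"outputs": prediction})


def handle_request(model, body, content_type="application/json", accept="application/json"):
    # Illustrative only: chains the handlers in the order the serving layer calls them.
    data = input_fn(body, content_type)
    prediction = predict_fn(data, model)
    return output_fn(prediction, accept)


model = model_fn("/opt/ml/model")  # path is a placeholder; the stand-in ignores it
print(handle_request(model, json.dumps({"inputs": [1, 2, 3]})))
```

The point of the sketch is the call order: the model is loaded once, and each request then flows through input_fn, predict_fn and output_fn. The same four functions are what the TrueFoundry deployment needs to find in the entrypoint script.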
Deploying the model on TrueFoundry
Broadly speaking, these are the things we shall do:
- Enclose the inference handler within a Docker container containing torchserve to support PyTorch-based models
- Upload the model artifact as a TrueFoundry Artifact to make it accessible from the running container
- Launch a TrueFoundry deployment utilizing the above two pieces
Upload the PyTorch model artifacts to the TrueFoundry Model Registry
The existing model will look something like:
Upload the model to the TrueFoundry Model Registry either via code or via the UI.
- Upload via Code
- Upload Via UI
upload_model.py
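As a rough sketch of what an upload script could look like, the helper below logs a model folder and returns the resulting model version FQN. It assumes the TrueFoundry ML client exposes a log_model method that accepts ml_repo, name and model_file_or_folder and returns an object with an fqn attribute; check the TrueFoundry SDK reference for the exact signature in your version. The repository name, model name and folder shown in the usage comment are placeholders.

```python
def upload_model(client, ml_repo, name, model_dir):
    """Log the model files in model_dir and return the model version FQN.

    Assumption: client.log_model(...) takes these keyword arguments and
    returns an object with an `fqn` attribute.
    """
    model_version = client.log_model(
        ml_repo=ml_repo,
        name=name,
        model_file_or_folder=model_dir,
    )
    return model_version.fqn


# Example usage (needs the truefoundry package and a logged-in session):
#
#   from truefoundry.ml import get_client
#   fqn = upload_model(get_client(), "my-ml-repo", "my-sagemaker-model", "model/")
#   print(f"Model version FQN: {fqn}")
```

Keep the printed FQN; it is the value the deployment script needs in order to locate the model.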
Now let’s go ahead and write a deploy.py script that can be used with TrueFoundry to get a service deployed. Here you’ll need to change the following:
- Service Name - Name for the service we’ll deploy
- Entrypoint Script Name (value for SAGEMAKER_PROGRAM) - The code file name containing model_fn, input_fn, predict_fn and output_fn
- Model Version FQN - The FQN obtained from upload_model.py
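Before deploying, it can help to confirm that the file you name in SAGEMAKER_PROGRAM really defines the four handler functions listed above. The following is a small standard-library sketch of such a check; missing_handlers and REQUIRED_HANDLERS are hypothetical names introduced here, not part of TrueFoundry or SageMaker.

```python
import ast

# The four handler names the entrypoint script is expected to define.
REQUIRED_HANDLERS = ("model_fn", "input_fn", "predict_fn", "output_fn")


def missing_handlers(source):
    """Return the required handler names not defined at the top level of `source`."""
    tree = ast.parse(source)
    defined = {
        node.name
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    }
    return [name for name in REQUIRED_HANDLERS if name not in defined]


# Example with an incomplete script given as a string.
example_source = "def model_fn(model_dir): ...\ndef input_fn(body, content_type): ...\n"
print(missing_handlers(example_source))

# To check a real file instead (path is a placeholder):
#   with open("inference.py") as f:
#       print(missing_handlers(f.read()))
```

The check parses the script without importing it, so it works even on a machine where torch is not installed.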