Simple Example to run on LXPLUS-GPU#
A short tutorial to set up an example backend on LXPLUS-GPU and understand the basics of the Triton backend and client.
This is based on the NVIDIA example tutorial for the PyTorch backend, adapted to work on an LXPLUS-GPU node.
More information about LXPLUS-GPU can be found here.
# Connect to an LXPLUS-GPU node
ssh {USER_NAME}@lxplus-gpu.cern.ch
# Create a work directory
mkdir TritonDemo
cd TritonDemo
# Clone the official tutorials repository
git clone https://github.com/triton-inference-server/tutorials.git .
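Optionally, verify that the node you landed on actually has a GPU before proceeding (nvidia-smi should be available on the LXPLUS-GPU nodes):
# Check which GPU is available on this node
nvidia-smi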
Note
The container images for the Triton Inference Server can be found here.
It will take time to pull all three images. One can instead use the existing images from /afs/cern.ch/work/y/yuchou/public/TritonDemo.
If you want to pull the images yourself, use the following commands
# Image folder; better to store it on EOS to avoid disk quota issues on AFS
export IMAGE_FOLDER="/eos/user/{INITIAL}/{YOUR_ACCOUNT}/TritonDemo/"
# Pull the image for exporting the model
singularity pull --dir $IMAGE_FOLDER docker://nvcr.io/nvidia/pytorch:22.04-py3
# Pull the image for the client
singularity pull --dir $IMAGE_FOLDER docker://docker.io/milescb/tritonserver-tutorial:22.04-py3
# Pull the image for the server
singularity pull --dir $IMAGE_FOLDER docker://nvcr.io/nvidia/tritonserver:22.04-py3
Get the PyTorch ResNet50 model#
Export the model yourself
This step exports the ResNet50 model to PyTorch's .pt file format.
# Use the existing images; change the path in case you downloaded them yourself
export IMAGE_FOLDER="/afs/cern.ch/work/y/yuchou/public/TritonDemo"
# Cache directory
export SINGULARITY_CACHEDIR="/eos/user/{INITIAL}/{YOUR_ACCOUNT}/singularity/"
# Run the image
singularity run --nv -B /afs -B /eos -B /cvmfs ${IMAGE_FOLDER}/pytorch_22.04-py3.sif
# Move to the PyTorch tutorial folder
cd tutorials/Quick_Deploy/PyTorch
# Get the model.pt
python export.py
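For reference, export.py is roughly equivalent to the following sketch (the exact script lives in the tutorials repository; this assumes torchvision is available inside the container, which it is for the pytorch:22.04-py3 image):

import torch
import torchvision.models as models

# Load a pretrained ResNet50 and switch it to inference mode
resnet50 = models.resnet50(pretrained=True)
resnet50.eval()

# Trace the model with a dummy input so it can be serialized as TorchScript
dummy_input = torch.randn(1, 3, 224, 224)
traced = torch.jit.trace(resnet50, dummy_input)

# Save the TorchScript file; Triton's PyTorch backend loads this format
traced.save("model.pt")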
Note
You can copy it from /afs/cern.ch/work/y/yuchou/public/TritonDemo/tutorials/Quick_Deploy/PyTorch/model.pt
to the folder where you plan to store the PyTorch model.
cp /afs/cern.ch/work/y/yuchou/public/TritonDemo/tutorials/Quick_Deploy/PyTorch/model.pt .
Prepare the model config and structure#
The model repository must follow a specific directory structure and naming scheme, as shown below. More detailed information for other backends can be found in the official documentation.
models
|
+-- resnet50
|
+-- config.pbtxt
+-- 1
|
+-- model.pt
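For example, you can create this layout and a minimal config.pbtxt as follows (a sketch based on the upstream PyTorch quick-deploy tutorial; the tensor names input__0/output__0 and the dimensions match the traced ResNet50, but double-check them against the config.pbtxt shipped with the tutorials repository):

# Create the expected directory layout and copy the exported model into it
mkdir -p models/resnet50/1
cp model.pt models/resnet50/1/
# Write a minimal configuration for the PyTorch (libtorch) backend
cat > models/resnet50/config.pbtxt <<'EOF'
name: "resnet50"
platform: "pytorch_libtorch"
max_batch_size: 0
input [
  {
    name: "input__0"
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]
    reshape { shape: [ 1, 3, 224, 224 ] }
  }
]
output [
  {
    name: "output__0"
    data_type: TYPE_FP32
    dims: [ 1, 1000 ]
    reshape { shape: [ 1000 ] }
  }
]
EOF

The models directory created here is what you bind into the server container below as ${YOUR_MODEL_FOLDER}.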
Set Up Triton Inference Server#
# Set cache directory
export SINGULARITY_CACHEDIR="/eos/user/{INITIAL}/{YOUR_ACCOUNT}/singularity/"
# Set model folder
export YOUR_MODEL_FOLDER="{YOUR_MODEL_FOLDER}"
export IMAGE_FOLDER="/afs/cern.ch/work/y/yuchou/public/TritonDemo"
# Run the container with the Triton server image
singularity run --nv -e --no-home -B ${YOUR_MODEL_FOLDER}:/models ${IMAGE_FOLDER}/tritonserver_22.04-py3.sif
# Spin up a triton server
tritonserver --model-repository=/models
You should see the following printout on the terminal (by default, the server listens on port 8000 for HTTP, 8001 for gRPC, and 8002 for metrics).
...
+----------+---------+--------+
| Model | Version | Status |
+----------+---------+--------+
| resnet50 | 1 | READY |
+----------+---------+--------+
...
Set Up the Client#
Open another terminal to run the client script and send the inference request.
On LXPLUS-GPU, make sure to use the same machine as the one running the server, to avoid having to deal with authentication.
# Log in to the same LXPLUS-GPU node as the server
# Replace XXX with the node number of the server above (run hostname in the server session to find it)
ssh {user_name}@lxplusXXX.cern.ch
export IMAGE_FOLDER="/afs/cern.ch/work/y/yuchou/public/TritonDemo/"
singularity run --nv -e -B /cvmfs:/cvmfs -B /afs/cern.ch/user/{INITIAL}:/home -B /afs/cern.ch/user/{INITIAL}/{YOUR_ACCOUNT}:/srv -B /afs:/afs -B /eos:/eos ${IMAGE_FOLDER}/tritonserver-tutorial_22.04-py3.sif
# Get the correct versions of torch and torchvision
# Install them outside your home directory to avoid disk quota issues; add --target /path/to/custom_directory to specify the directory
# python -m pip install torchvision==0.17
# Download the input images
wget -O img1.jpg "https://www.hakaimagazine.com/wp-content/uploads/header-gulf-birds.jpg"
# Check if the connection is ok
curl -v localhost:8000/v2/health/ready
You should see the following message if the connection and the server are in a good state.
...
< HTTP/1.1 200 OK
< Content-Length: 0
< Content-Type: text/plain
...
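You can also check that the resnet50 model itself is loaded and ready for inference (this endpoint is part of Triton's standard HTTP/REST API):
# Check that the resnet50 model is loaded and ready
curl -v localhost:8000/v2/models/resnet50/ready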
Now, we are ready to run the client script!
# Move to the folder storing the script
cd TritonDemo/tutorials/Quick_Deploy/PyTorch/
# A straightforward script that sends an image to the server
python client.py
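For reference, client.py roughly does the following (a sketch of the upstream tutorial's script: it preprocesses the image, sends it to the server over HTTP, and asks for the top classes; the tensor names input__0/output__0 must match config.pbtxt):

import tritonclient.http as httpclient
from PIL import Image
from torchvision import transforms

# Standard ImageNet preprocessing for ResNet50
preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])
img = preprocess(Image.open("img1.jpg")).numpy()

# Connect to the server running on the same node
client = httpclient.InferenceServerClient(url="localhost:8000")

# Wrap the image in an InferInput matching the name and shape in config.pbtxt
inputs = httpclient.InferInput("input__0", list(img.shape), datatype="FP32")
inputs.set_data_from_numpy(img, binary_data=True)

# Ask the server to return the five highest-scoring classes
outputs = httpclient.InferRequestedOutput("output__0", binary_data=True,
                                          class_count=5)

# Run inference and print the <score>:<class_index> strings
results = client.infer(model_name="resnet50", inputs=[inputs], outputs=[outputs])
print(results.as_numpy("output__0"))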
Running the script may take some time, depending on the GPU utilization. But if everything goes well, you will see the following printout.
[b'12.474469:90' b'11.525709:92' b'9.660509:14' b'8.406358:136'
b'8.220254:11']
Each entry of the output has the format <confidence_score>:<classification_index>.
Now you have a Triton client and server talking to each other!