UNCLASSIFIED - NO CUI

NVIDIA Triton Inference Server

Target Container

NVIDIA® Triton Inference Server (formerly NVIDIA TensorRT Inference Server) simplifies the deployment of AI models at scale in production. It is open-source inference serving software that lets teams deploy trained AI models from any framework (TensorFlow, TensorRT, PyTorch, ONNX Runtime, or a custom framework) and from local storage, Google Cloud Storage, or AWS S3, on any GPU- or CPU-based infrastructure (cloud, data center, or edge).
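To illustrate how a served model is queried, the sketch below uses Triton's Python HTTP client (the tritonclient package) to send a single inference request. This is a minimal sketch, not part of this container's contents: the model name my_model, the tensor names INPUT0/OUTPUT0, and the shape [1, 16] are placeholder assumptions; a real model defines its own names, shapes, and data types in its model configuration.

    import numpy as np
    import tritonclient.http as httpclient

    # Connect to Triton's HTTP endpoint (default port 8000).
    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Build a request for a hypothetical model "my_model" that takes one
    # FP32 tensor named INPUT0 of shape [1, 16] and returns OUTPUT0.
    input0 = httpclient.InferInput("INPUT0", [1, 16], "FP32")
    input0.set_data_from_numpy(np.random.rand(1, 16).astype(np.float32))
    output0 = httpclient.InferRequestedOutput("OUTPUT0")

    # Send the request and read the result back as a NumPy array.
    result = client.infer(model_name="my_model", inputs=[input0], outputs=[output0])
    print(result.as_numpy("OUTPUT0"))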

Questionnaire

Company/Vendor name: NVIDIA
Application name: Triton Inference Server
Version: 20.12-py3
Supported release: Supported Release
Application URL: https://developer.nvidia.com/nvidia-triton-inference-server
Source code: https://github.com/triton-inference-server/server
Container: https://ngc.nvidia.com/catalog/containers/nvidia:tritonserver
Deployment method: Helm chart (see the readiness-check sketch below)
Deployment method URL: https://developer.nvidia.com/blog/deploying-a-natural-language-processing-service-on-a-kubernetes-cluster-with-helm-charts-from-ngc/

Is this application containerized already? Yes, but it can also be built from source.
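Because the recommended deployment method is the NGC Helm chart, a quick post-deployment check is to confirm that the server is live and that the expected model is ready. The sketch below assumes the chart's Triton service is reachable on localhost:8000 (for example via port-forwarding) and that a model named my_model has been loaded; both are assumptions for illustration, not values taken from this document.

    import tritonclient.http as httpclient

    # Assumes the Helm-deployed service is reachable on localhost:8000,
    # e.g. via kubectl port-forward; adjust the URL for your cluster.
    client = httpclient.InferenceServerClient(url="localhost:8000")

    # Basic liveness/readiness checks against Triton's HTTP API.
    print("server live:", client.is_server_live())
    print("server ready:", client.is_server_ready())

    # "my_model" is a placeholder for whatever model the repository contains.
    print("model ready:", client.is_model_ready("my_model"))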

Additional details

Please provide any additional details.