Deploy a model endpoint with custom container
In this section, you will learn how to deploy and serve your custom machine-learning models using a custom container on our Greennode AI Platform. Follow the steps below to configure and deploy your model for online prediction:
Step 0: Import a model registry with custom container
Step 1: Create an endpoint with a custom container model
Location & Endpoint name: Select the cloud location and enter a name for this prediction endpoint.
Model: Select the model to deploy to this endpoint, which you registered in the Model Registry in Step 0.
Resource configuration: Specify the CPU, GPU, and RAM configurations based on your workload.
Replica configuration: Specify the minimum and maximum replica counts for your service, enabling it to scale dynamically with fluctuating demand while maintaining performance and resource efficiency.
Minimum replica count: The minimum number of replicas kept running to ensure adequate service availability and performance under normal operating conditions.
Maximum replica count: The maximum number of replicas the service may scale up to without compromising performance or resource availability, or exceeding cost constraints.
Advanced configuration: Specify thresholds for CPU, RAM, and GPU utilization and for response latency; these define the maximum allowable usage of those resources before the endpoint scales out, as sketched below.
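For example, with a minimum of 1 replica, a maximum of 4, and a CPU-utilization threshold of 80%, sustained load above the threshold would add replicas up to the maximum, and the service would scale back toward the minimum as demand subsides. Below is a minimal sketch of such a scaling block; the field names are hypothetical, not the platform's actual schema:

```yaml
# Hypothetical scaling block -- field names are illustrative only.
scaling:
  minReplicas: 1            # floor: availability under normal load
  maxReplicas: 4            # ceiling: performance / cost bound
  thresholds:               # maximum allowable usage before scaling out
    cpuUtilization: 80      # percent
    gpuUtilization: 80      # percent
    ramUtilization: 75      # percent
    responseLatencyMs: 500  # milliseconds
```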
Click the "Create endpoint" button at the bottom right corner to deploy your online prediction endpoint with the specified configurations.
Navigate to the Monitoring section to view logs generated during the online prediction process.

YAML Example
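Below is a minimal sketch of what a full endpoint specification could look like, assuming a hypothetical manifest schema; the field names and values are illustrative and should be replaced with the platform's actual configuration format and your own settings:

```yaml
# Illustrative endpoint specification. The field names and structure
# below are hypothetical -- they mirror the console fields described
# above, not Greennode's actual manifest schema.
endpoint:
  name: my-custom-endpoint          # Endpoint name
  location: <cloud-location>        # Location chosen in Step 1
  model: my-registered-model        # Model imported in Step 0
  container:
    image: registry.example.com/my-model:latest  # custom container image
  resources:
    cpu: 4          # vCPUs
    ramGb: 16       # RAM
    gpu: 1          # GPU count
  scaling:          # as sketched above
    minReplicas: 1
    maxReplicas: 4
    thresholds:
      cpuUtilization: 80      # percent
      gpuUtilization: 80      # percent
      ramUtilization: 75      # percent
      responseLatencyMs: 500  # milliseconds
```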
Related Articles
Import a model registry with custom container
Manage a model endpoint
Whitelist IPs for Model Endpoint
Manage a Model Registry
Import a model registry with pre-built container