This guide will walk you through the key features and steps involved in deploying your models, optimizing costs through undeployment, and removing endpoints when they are no longer needed. After creating a model endpoint, follow these steps to manage it:
Step 1: Accessing Model Endpoints
Step 2: Accessing a Running Model Endpoint
The Model Endpoint details page provides a comprehensive overview of your deployed model, including its configuration, resource usage, and performance metrics. The sections below explain how to interpret the information presented.
General Information
- Endpoint Name: The name of the endpoint through which your model is accessed for predictions.
- Status: Indicates the current state of the endpoint (e.g., Creating, InService, Updating, Failed).
- Creation Time: The timestamp when the endpoint was created.
- Endpoint URL: A unique address that allows access to a deployed model for predictions or interactions, forwarding requests to http://localhost:internal_port within the endpoint pod.
- Location: The cloud location of this endpoint.
- Connection: Provides instructions on how to access and interact with the Endpoint URL.
- Instance Type: The type of compute instance running your model (e.g., CPU, GPU, memory-optimized).
- Instance Count: The number of instances currently running to handle prediction requests.
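The Endpoint URL and Connection fields together describe how to call the deployed model. As a rough sketch (the URL, token scheme, payload schema, and helper name below are hypothetical; check the Connection section of your endpoint for the exact format), a prediction request might be built like this:

```python
import json
import urllib.request

# Placeholder values -- substitute your endpoint's actual URL and credentials.
ENDPOINT_URL = "https://example.com/v1/endpoints/my-endpoint/predict"
API_TOKEN = "YOUR_API_TOKEN"

def build_request(url: str, payload: dict, token: str) -> urllib.request.Request:
    """Build an authenticated JSON POST request for the endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )

# The payload schema depends on the serving framework behind the endpoint.
req = build_request(ENDPOINT_URL, {"inputs": "Hello, model!"}, API_TOKEN)
# response = urllib.request.urlopen(req)  # uncomment to actually send the call
```

The send is left commented out so the snippet can be adapted safely; wire in the real URL and token from the Connection instructions before running it.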
Detailed Information
- Model name: The model deployed to this endpoint, previously registered in the Model Registry.
- Model framework: The serving framework used to deploy the model (e.g., Triton, NVIDIA NIM, vLLM, SGLang, …).
- Model source: The location where the model is stored or retrieved from for deployment.
- Model: The name of the deployed model.
Step 3: Deploy and Undeploy Model Endpoint
Deploy model endpoint
Once your model endpoint is created, it starts automatically. If it is currently stopped or has failed, locate your model endpoint in the table and click the "Deploy" button. This initializes the model endpoint with the settings you chose previously, or with a new resource configuration. Follow the steps below:
- Locate the model endpoint you want to deploy in the model list.
- Click the "Deploy" button and re-configure your endpoint (optional) based on your demands.
- Wait for the status to change to "Running."
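Waiting for the status change can be automated with a simple polling loop. This is a generic sketch, not the platform's SDK: `get_status` stands in for whatever status call your platform's API exposes, and the status strings mirror the ones shown in the console.

```python
import time

def wait_for_status(get_status, target="Running", timeout=600, interval=5):
    """Poll a status callable until the target status is reached.

    `get_status` is any zero-argument callable returning the endpoint's
    current status string (hypothetical -- wire it to your platform's API).
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = get_status()
        if status == target:
            return status
        if status == "Failed":
            raise RuntimeError("Endpoint deployment failed")
        time.sleep(interval)
    raise TimeoutError(f"Endpoint did not reach {target!r} within {timeout}s")

# Example usage (client and describe_endpoint are placeholders):
# wait_for_status(lambda: client.describe_endpoint("my-endpoint").status)
```

Failing fast on a "Failed" status avoids waiting out the full timeout when the deployment is not going to recover.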
Undeploy model endpoint
Undeploying an endpoint is a cost-effective way to pause predictions when they are not needed. You avoid paying for resources while keeping your model ready for redeployment when demand resumes.
If you need to pause your work or save on resources, simply select the model endpoint you wish to undeploy and click the "Undeploy" button. This will halt the instance and save its state until you deploy it again.
- Locate the model endpoint you want to undeploy in the list.
- Click the "Undeploy" button.
- Wait for the model endpoint status to change to "Undeployed."
Step 4: Delete an Endpoint
When an endpoint is no longer needed, you can delete it to free up resources. To delete a model endpoint, select the endpoint and click the "Delete" button. A confirmation dialog will appear to ensure you do not accidentally delete the wrong endpoint. Please note that once an endpoint is deleted, it cannot be recovered.
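Because deletion is irreversible, confirmation dialogs for destructive actions commonly require the user to retype the resource name before proceeding. A minimal sketch of that guard (the function and the exact dialog behavior are assumptions, not necessarily how this platform's dialog works):

```python
def confirm_delete(endpoint_name: str, typed_name: str) -> bool:
    """Allow deletion only when the user retypes the exact endpoint name.

    Requiring an exact, case-sensitive match (after trimming whitespace)
    makes it hard to delete the wrong endpoint by clicking through quickly.
    """
    return typed_name.strip() == endpoint_name
```

Scripted cleanups can apply the same idea: compare the name you intend to delete against the name returned by the API before issuing the delete call.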