Guides
Automated Scheduling for Model Endpoint
The Automated Scheduling feature allows users to define specific times for automatically starting or stopping a Model Endpoint. This helps optimize cloud usage, reduce unnecessary costs, and ensure that compute resources are only active when needed. ...
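As a rough illustration of the idea, the sketch below decides whether an endpoint should be running at a given time of day. The function name `should_be_running` and the start/stop window model are assumptions made for this example; they are not the product's actual scheduling configuration or API.

```python
from datetime import time


def should_be_running(now: time, start: time, stop: time) -> bool:
    """Return True if `now` falls inside the [start, stop) window.

    Hypothetical helper: assumes one daily start time and one daily stop time.
    """
    if start <= stop:
        # Same-day window, e.g. 08:00 to 18:00.
        return start <= now < stop
    # Overnight window that wraps past midnight, e.g. 22:00 to 06:00.
    return now >= start or now < stop
```

For example, with a window of 08:00 to 18:00, a check at 10:00 would report that the endpoint should be running, while a check at 20:00 would not.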
Whitelist IPs for Model Endpoint
The Whitelist IP feature enhances the security of Model Endpoints by allowing customers to define a list of trusted client IP addresses using CIDR notation. Only traffic from these authorized IPs is permitted to access the model endpoint, ensuring ...
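To make the CIDR matching concrete, here is a minimal sketch using Python's standard `ipaddress` module. The function `is_allowed` and the example ranges in `ALLOWED` are illustrative assumptions (the ranges come from reserved documentation address blocks); the platform's own enforcement may work differently.

```python
import ipaddress
from typing import Iterable


def is_allowed(client_ip: str, allowed_cidrs: Iterable[str]) -> bool:
    """Return True if client_ip falls inside any of the allowed CIDR ranges."""
    address = ipaddress.ip_address(client_ip)
    for cidr in allowed_cidrs:
        # strict=False accepts entries whose host bits are set, e.g. 203.0.113.5/24.
        network = ipaddress.ip_network(cidr, strict=False)
        if address.version == network.version and address in network:
            return True
    return False


# Hypothetical allowlist: one /24 range and one single host (/32).
ALLOWED = ["203.0.113.0/24", "198.51.100.7/32"]
```

A `/32` entry matches exactly one IPv4 address, so it is a convenient way to allow a single client.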
Set up resource scalability
When creating an endpoint, users can set threshold configurations for online prediction and manage scalability based on CPU, RAM, GPU utilization, and response latency. To do so, navigate to “Replica Configuration” on the endpoint creation page ...
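The sketch below shows one way threshold-based scaling decisions could be reasoned about: it reports which observed metrics exceed their configured thresholds. The `Thresholds` class, its field names, and the `exceeded` function are hypothetical and only illustrate the concept; they do not reflect the actual fields under “Replica Configuration”.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List


@dataclass
class Thresholds:
    """Hypothetical scaling thresholds; names and units are assumptions."""
    cpu_percent: float
    ram_percent: float
    gpu_percent: float
    latency_ms: float


def exceeded(metrics: Dict[str, float], thresholds: Thresholds) -> List[str]:
    """Return the names of metrics whose observed value is above its threshold."""
    breaches = []
    for name, limit in asdict(thresholds).items():
        value = metrics.get(name)
        # Metrics that were not reported are skipped rather than treated as breaches.
        if value is not None and value > limit:
            breaches.append(name)
    return breaches
```

In this model, a non-empty result would be a signal to add replicas, and an empty result over a sustained period would be a signal that replicas could be removed.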
Manage a model endpoint
This guide will walk you through the key features and steps involved in deploying your models, optimizing costs through undeployment, and removing endpoints when they are no longer needed. After creating a model endpoint, follow these steps to ...
Create an endpoint
After training and registering the model, the online prediction component enables the deployment and serving of models to make real-time predictions or inferences on new data. This component provides endpoints or APIs that can be integrated into ...