Guides
Set up resource scalability
When creating an endpoint, users can set threshold configurations for online prediction and manage scalability based on CPU, RAM, and GPU utilization and on response latency. Navigate to “Replica Configuration” on the endpoint creation page ...
Manage a model endpoint
This guide walks you through the key features and steps involved in deploying your models, optimizing costs through undeployment, and removing endpoints when they are no longer needed. After creating a model endpoint, follow these steps to ...
Create an endpoint
After a model is trained and registered, the online prediction component lets you deploy and serve it to make real-time predictions or inferences on new data. This component provides endpoints or APIs that can be integrated into ...