Import a Model Registry using vLLM
Step 1: Accessing the Model Registry
- Log in to your GreenNode AI Platform account and navigate to the Model Registry Dashboard.
- Find and click on the "Import a model registry" button.
Step 2: Import a Model Registry
- Location & Model registry name: Select the location and enter a specific name for this model registry.
- Container: Select the Pre-built container option to use a supported framework.
- Framework: Choose a model deployment framework and a version that meets your requirements. In this tutorial, we select vLLM 0.7.1.
- Model source: Choose where the model is stored: a network volume, the GreenNode catalog, or directly from Hugging Face.
- If you select Hugging Face, it is recommended to also attach a network volume where the model is already cached. This reduces loading time for future deployments (see the pre-caching sketch after this list).
- vLLM Settings: Configure parameters for the vLLM server. These fields correspond to vLLM server options; a sketch of the equivalent command follows this list.
- Served model name: The model name used in the API (see the request example after this list). Note that this name is also used as the model_name label in Prometheus metrics.
- Max number of sequences: Maximum number of sequences per iteration. Default: 256.
- Max Context Length: Model context length. If unspecified, it is automatically derived from the model config.
- If you enable handling of LoRA adapters:
- LoRA modules: The name and path of each LoRA module you want to use.
- Click the “Import” button to complete the process.
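For the Hugging Face recommendation above, a minimal pre-caching sketch is shown below. It assumes the network volume is mounted at /network-volume and uses a hypothetical model ID; adjust both to your setup.

```python
# Pre-download a Hugging Face model onto a mounted network volume so later
# imports load from local storage instead of re-downloading.
# The mount path and repo_id below are illustrative assumptions.
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="Qwen/Qwen2.5-7B-Instruct",                      # hypothetical model
    local_dir="/network-volume/models/qwen2.5-7b-instruct",  # assumed mount path
)
print(f"Model files cached at: {local_dir}")
```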
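The vLLM Settings above roughly map to the flags of vLLM's OpenAI-compatible server. The sketch below shows an equivalent `vllm serve` invocation launched from Python; the model path, adapter path, and served name are placeholders, and the platform assembles the actual command for you.

```python
# Illustrative mapping of the form fields to vLLM server flags (vLLM 0.7.x).
# Paths and names are hypothetical; the platform launches the server itself.
import subprocess

subprocess.run([
    "vllm", "serve", "/network-volume/models/qwen2.5-7b-instruct",
    "--served-model-name", "qwen2.5-7b-instruct",  # also the model_name label in Prometheus metrics
    "--max-num-seqs", "256",                       # Max number of sequences
    "--max-model-len", "8192",                     # Max Context Length
    "--enable-lora",                               # Enable handling of LoRA adapters
    "--lora-modules", "my-adapter=/network-volume/loras/my-adapter",  # name=path per module
])
```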
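Once the model is deployed, requests reference the Served model name. A minimal example with the OpenAI Python client is shown below; the endpoint URL and API key are placeholders for the values the platform provides after deployment.

```python
# Call the deployed model through its OpenAI-compatible endpoint.
# base_url and api_key are placeholders; "model" must match the Served model name.
from openai import OpenAI

client = OpenAI(base_url="https://<your-endpoint>/v1", api_key="<your-api-key>")
response = client.chat.completions.create(
    model="qwen2.5-7b-instruct",  # the Served model name configured above
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```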
Related Articles
Import a Model Registry using Triton Server
Import a Model Registry using NVIDIA NIM
Import a Model Registry
Import a model registry with pre-built container
Import a model registry with custom container