Manage a model tuning job
Model tuning, also known as hyperparameter optimization, is the process of adjusting the hyperparameters of a machine learning model to improve its performance. Hyperparameters are settings that control the learning process and are not learned from the data itself. Careful tuning of these settings can improve the model's accuracy, precision, and overall effectiveness. After creating a model tuning job, follow these steps to manage it:
Step 1: Accessing Model Tuning
- Dashboard: From the GreenNode AI Platform dashboard, locate the "Model tuning" section.
- Model tuning list: You'll see a list of your existing model tuning jobs, including their names, creation dates, statuses (e.g., running, stopped), and configurations.
Step 2: Monitoring a Model Tuning Job
After creating your model tuning job, you can open it by clicking its name in the list. This takes you to the job details page, where you can begin monitoring it.
The model tuning job details page provides information about your tuning job's progress, resource usage, and logs. It is organized into the following sections:
General Information
- Job Name: The unique identifier for the tuning job.
- Job Status: The current status of the job (e.g., running, completed, failed).
- Instance Type: The type of computing instances used for training.
- Instance Count: The number of instances used for parallel training.
- Network Volume: The network volume where the datasets and model artifacts are stored.
Usage Information
Use this section to verify that your datasets are mounted correctly and to monitor training logs so you can track progress, identify potential issues, and debug errors. System logs can help when troubleshooting infrastructure-related problems. This section has two tabs:
- Data Mount: Displays information about the folders containing the input datasets and the output model that you mounted to your tuning job from the network volume.
- Logs: Access detailed logs to troubleshoot issues and gain insights into the training process.
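When a log is long, a small script can help surface the lines worth reading first. The following is a minimal sketch, assuming you have copied the log text out of the Logs tab into a string or file. The keyword list and the sample log lines are illustrative assumptions, not the platform's actual log format.

```python
# Hypothetical keywords; adjust them to whatever your training code actually logs.
KEYWORDS = ("error", "exception", "traceback", "failed", "out of memory")


def find_problem_lines(log_text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines containing a keyword."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        lowered = line.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            hits.append((number, line.strip()))
    return hits


# Made-up sample text standing in for log output copied from the Logs tab.
sample_log = """\
step 10 loss=2.31
step 20 loss=1.98
RuntimeError: CUDA out of memory
step 30 loss=1.75
"""

if __name__ == "__main__":
    for number, line in find_problem_lines(sample_log):
        print(f"line {number}: {line}")
```

Plain substring matching is used here for simplicity; it may flag harmless lines that happen to contain a keyword, so treat the result as a starting point for reading the full log.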
Step 3: Stop a Model Tuning Job
Caution: Once a tuning job is stopped, it cannot be restarted.
- Locate the Job: Find the specific tuning job you want to stop in the job list.
- Click the "Stop" Button: Locate the "Stop" button or similar action and click it.
- Confirm the Action: A confirmation prompt may appear. Confirm that you want to stop the job.
- Monitor the Job Status: The job status should change to "Stopped" once the process is complete.
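The status rules described above can be summarized in a short sketch. The status names come from the examples in this guide; the `JobStatus` enum and both helper functions are hypothetical and are not part of any GreenNode API. The sketch also assumes that only a running job can be stopped and that completed and failed jobs are final in the same way as stopped ones.

```python
from enum import Enum


class JobStatus(Enum):
    # Status names taken from the examples in this guide.
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"
    STOPPED = "stopped"


def can_stop(status: JobStatus) -> bool:
    # Assumption: only a job that is still running can be stopped.
    return status is JobStatus.RUNNING


def is_final(status: JobStatus) -> bool:
    # A stopped job cannot be restarted (see the caution in this step).
    # Assumption: completed and failed jobs are treated as final too.
    return status in {JobStatus.COMPLETED, JobStatus.FAILED, JobStatus.STOPPED}


if __name__ == "__main__":
    for status in JobStatus:
        print(f"{status.value}: can_stop={can_stop(status)}, final={is_final(status)}")
```

Because stopping is irreversible, it is worth confirming that any results you need have been written to the network volume before you stop a job.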
Step 4: Delete a Model Tuning Job
- Locate the Job: Find the specific tuning job you want to delete in the job list.
- Click the "Delete" Button: Locate the "Delete" button or similar action and click it.
- Confirm the Action: A confirmation prompt may appear. Confirm that you want to delete the job.
- Verify Deletion: The job should be removed from the job list, and all associated resources should be released.