Manage a model tuning job

Model tuning, also known as hyperparameter optimization, is the process of adjusting a machine learning model's hyperparameters to improve its performance. Hyperparameters are settings that control the learning process and are not learned from the data itself. Careful tuning can improve the model's accuracy, precision, and overall effectiveness. After creating a model tuning job, follow these steps to manage it:
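To make the idea of hyperparameter optimization concrete, here is a minimal sketch of a grid search in plain Python. It does not show how the platform runs tuning jobs: the `validation_loss` function is a toy stand-in for training and evaluating a model, and the hyperparameter names and values are assumptions chosen for illustration.

```python
from itertools import product
from typing import Dict, List, Tuple


def validation_loss(learning_rate: float, batch_size: int) -> float:
    """Toy objective standing in for 'train the model, then measure validation loss'."""
    return (learning_rate - 0.01) ** 2 * 1e4 + abs(batch_size - 32) / 32


def grid_search(grid: Dict[str, List]) -> Tuple[dict, float]:
    """Try every combination in the grid and keep the one with the lowest loss."""
    names = list(grid)
    best_params, best_loss = None, float("inf")
    for values in product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        loss = validation_loss(**params)
        if loss < best_loss:
            best_params, best_loss = params, loss
    return best_params, best_loss


# Hypothetical search space: these values are not platform defaults.
search_space = {
    "learning_rate": [0.001, 0.01, 0.1],
    "batch_size": [16, 32, 64],
}
best_params, best_loss = grid_search(search_space)
print(best_params, best_loss)
```

The hyperparameters are set before each evaluation and never updated by the objective itself, which is what distinguishes them from learned model parameters.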

Step 1: Accessing Model Tuning

Dashboard: From the GreenNode AI Platform dashboard, locate the "Model tuning" section.
Model tuning list: You'll see a list of your existing model tuning jobs, including their names, creation dates, statuses (e.g., running, stopped), and configurations.

Step 2: Monitoring a Model Tuning Job

After creating your model tuning job, you can open it by clicking its name in the list. This takes you directly to the job details page, where you can begin monitoring it.

The model tuning job details page provides information about your tuning job's progress, resource usage, and logs. It's divided into two main sections:

General Information

Job Name: The unique identifier for the tuning job.
Job Status: The current status of the job (e.g., running, completed, failed).
Instance Type: The type of computing instances used for training.
Instance Count: The number of instances used for parallel training.
Network Volume: The network volume where the datasets and model artifacts are stored.
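If you keep your own notes or records about tuning jobs, the fields above map naturally onto a small record type. The sketch below is only an illustration of that structure: `TuningJobInfo`, `jobs_with_status`, and all sample values (job names, instance types, volume names) are hypothetical and do not come from a GreenNode API.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class TuningJobInfo:
    """Mirrors the General Information fields shown on the job details page."""
    job_name: str        # unique identifier for the tuning job
    job_status: str      # e.g. "running", "completed", "failed"
    instance_type: str   # type of computing instance used for training
    instance_count: int  # number of instances used for parallel training
    network_volume: str  # volume holding datasets and model artifacts


def jobs_with_status(jobs: List[TuningJobInfo], status: str) -> List[TuningJobInfo]:
    """Return the jobs whose status matches, ignoring letter case."""
    return [job for job in jobs if job.job_status.lower() == status.lower()]


# Hypothetical sample records for illustration only.
jobs = [
    TuningJobInfo("tune-a", "running", "example-gpu-1x", 2, "volume-a"),
    TuningJobInfo("tune-b", "completed", "example-gpu-1x", 1, "volume-a"),
    TuningJobInfo("tune-c", "failed", "example-gpu-4x", 1, "volume-b"),
]

for job in jobs_with_status(jobs, "Running"):
    print(job.job_name, job.instance_type, job.instance_count)
```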

Usage Information

Use this section to verify that your datasets are correctly mounted and to monitor training logs so you can track progress, identify potential issues, and debug errors. System logs can also help when troubleshooting infrastructure-related problems. This section has two tabs:

Data Mount: Displays information about the folders containing the input datasets and the output model that you've mounted to your tuning job from the network volume.
Logs: Access detailed logs to troubleshoot issues and gain insights into the training process.
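When a log is long, a short script can help surface the lines worth reading first. The following is a minimal sketch that assumes you have copied log text out of the Logs tab and that each line contains a level keyword such as INFO, WARNING, or ERROR. That format is an assumption, as are the sample lines; adjust the pattern to whatever your logs actually contain.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Assumed log levels; the real log format is not specified in this article.
LEVEL_PATTERN = re.compile(r"\b(ERROR|WARNING|INFO)\b")


def summarize_log(lines: Iterable[str]) -> Tuple[Counter, List[str]]:
    """Count lines per log level and collect the ERROR lines for review."""
    counts: Counter = Counter()
    errors: List[str] = []
    for line in lines:
        match = LEVEL_PATTERN.search(line)
        if match is None:
            continue  # skip lines without a recognizable level
        level = match.group(1)
        counts[level] += 1
        if level == "ERROR":
            errors.append(line.strip())
    return counts, errors


# Hypothetical log excerpt for illustration only.
sample_log = """\
2024-01-01 10:00:00 INFO starting training
2024-01-01 10:00:05 WARNING dataset path not found, retrying
2024-01-01 10:00:10 ERROR CUDA out of memory
2024-01-01 10:00:11 INFO job exiting
""".splitlines()

counts, errors = summarize_log(sample_log)
print(dict(counts))
for line in errors:
    print(line)
```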

Step 3: Stop a Model Tuning Job

Caution: Once a tuning job is stopped, it cannot be restarted.

Locate the Job: Find the specific tuning job you want to stop in the job list.
Click the "Stop" Button: Locate the "Stop" button or similar action and click it.
Confirm the Action: A confirmation prompt may appear. Confirm that you want to stop the job.
Monitor the Job Status: The job status should change to "Stopped" once the process is complete.
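The caution above means that "Stopped" is a terminal status. One way to picture this is as a small state machine, sketched below. The status names come from this article; the `JobStatus` enum, the `transition` helper, and the assumption that "completed" and "failed" are also terminal are illustrative and not part of the platform.

```python
from enum import Enum


class JobStatus(Enum):
    RUNNING = "running"
    STOPPED = "stopped"
    COMPLETED = "completed"
    FAILED = "failed"


# Allowed moves between statuses. STOPPED has no outgoing transitions,
# reflecting that a stopped tuning job cannot be restarted.
# Treating COMPLETED and FAILED as terminal is an assumption.
ALLOWED_TRANSITIONS = {
    JobStatus.RUNNING: {JobStatus.STOPPED, JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.STOPPED: set(),
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def transition(current: JobStatus, target: JobStatus) -> JobStatus:
    """Return the new status, or raise ValueError if the move is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target


status = transition(JobStatus.RUNNING, JobStatus.STOPPED)
print(status.value)
```

Because a stopped job cannot resume, confirm that any outputs you need have been written to the network volume before stopping, and create a new tuning job if you want to run the work again.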

Step 4: Delete a Model Tuning Job

Locate the Job: Find the specific tuning job you want to delete in the job list.
Click the "Delete" Button: Locate the "Delete" button or similar action and click it.
Confirm the Action: A confirmation prompt may appear. Confirm that you want to delete the job.
Verify Deletion: The job should be removed from the job list, and all associated resources should be released.

