Step-by-step guide for training YOLO-NAS object detection models in PyTorch using custom datasets.
New YOLO versions are published every year; at last count there was something like YOLOv12. If you have used YOLO models before, it was probably YOLOv5, YOLOv8, or some other YOLOv{some number}.
There are also YOLO variants you may not have heard of, or have heard of but never used, such as YOLOX and YOLO-NAS. In this post, I will share my pipeline for training custom YOLO-NAS models and for making predictions with the trained model.
YOLO-NAS has one important advantage over other YOLO variants: licensing. If you plan to use your YOLO model commercially, YOLO-NAS might be a good choice. I don't want to spread misinformation, so for the exact license terms, check its GitHub page.
Now, I will train a model to show how the pipeline works. I chose a small dataset from Roboflow; you can pick any dataset and follow this pipeline, since the steps are the same. Don't forget to export the dataset in YOLO format (e.g., Roboflow's "YOLOv5 PyTorch" export): the dataloaders used below read images/ and labels/ folders with YOLO-style .txt annotations, despite the "coco" in their names.
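After export, the dataset folder should look roughly like this (a sketch; the exact split names depend on your export settings):

dataset/
  train/
    images/   # .jpg / .png files
    labels/   # one .txt per image: class x_center y_center width height (normalized)
  valid/
    images/
    labels/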
I have a GPU-supported PyTorch environment, and I will train my model locally. If you don't have one, you can use Kaggle or Google Colab; it will save you a huge amount of time.
If you want to create a GPU-supported PyTorch environment yourself, you can watch this video.
There are 6 main steps:
1. Install super-gradients
2. Set up the Trainer and the dataset parameters
3. Create the train/validation dataloaders
4. Load a pretrained YOLO-NAS model
5. Define the training parameters and train
6. Load the best checkpoint and run inference

Start by installing the library:
!pip install super_gradients
import torch

# Use the GPU if one is available
DEVICE = 'cuda' if torch.cuda.is_available() else 'cpu'

# Base directory and experiment bookkeeping
HOME = "C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model"
EXPERIMENT_NAME = "yolonas-m-model-1--20epoch"
CHECKPOINT_DIR = f'{HOME}/checkpoints'
Initialize the Trainer object and create the dataset parameters. You need to set the dataset paths and the class names of the model.
from super_gradients.training import Trainer
trainer = Trainer(experiment_name=EXPERIMENT_NAME, ckpt_root_dir=CHECKPOINT_DIR)
# Dataset and label information
dataset_params = {
'data_dir': "C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model/dataset",
'train_images_dir':'train/images',
'train_labels_dir':'train/labels',
'val_images_dir':'valid/images',
'val_labels_dir':'valid/labels',
'classes': ['futbol', 'player', 'referree'] # class names, in the dataset's label order
}
from super_gradients.training.dataloaders.dataloaders import (
coco_detection_yolo_format_train, coco_detection_yolo_format_val)
# You can increase this depending on your GPU memory; a larger batch size means faster training
BATCH_SIZE = 4
train_data = coco_detection_yolo_format_train(
dataset_params={
'data_dir': dataset_params['data_dir'],
'images_dir': dataset_params['train_images_dir'],
'labels_dir': dataset_params['train_labels_dir'],
'classes': dataset_params['classes']
},
dataloader_params={
'batch_size': BATCH_SIZE,
'num_workers': 2
}
)
val_data = coco_detection_yolo_format_val(
dataset_params={
'data_dir': dataset_params['data_dir'],
'images_dir': dataset_params['val_images_dir'],
'labels_dir': dataset_params['val_labels_dir'],
'classes': dataset_params['classes']
},
dataloader_params={
'batch_size': BATCH_SIZE,
'num_workers': 2
}
)
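Before training, it helps to pull a single batch to confirm the dataloaders are wired up correctly. This quick sanity check is not part of the original pipeline, just plain PyTorch:

# Fetch one batch and inspect the tensor shapes
images, targets = next(iter(train_data))
print("images:", images.shape)   # (BATCH_SIZE, 3, H, W)
print("targets:", targets.shape) # stacked box annotations for the whole batch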
In general, larger models are more accurate, but they run slower than lighter ones. I will use the YOLO-NAS M model; you can choose whichever variant you want, and the pipeline will work the same way.
from super_gradients.training import models

# Available variants: "yolo_nas_s", "yolo_nas_m", "yolo_nas_l"
MODEL_ARCH = "yolo_nas_m"

model = models.get(
    MODEL_ARCH,
    num_classes=len(dataset_params['classes']),
    pretrained_weights="coco"  # start from COCO-pretrained weights
)
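If you are curious how large the chosen variant is, you can count its parameters with plain PyTorch:

# Total number of trainable parameters in the model
num_params = sum(p.numel() for p in model.parameters())
print(f"{MODEL_ARCH}: {num_params / 1e6:.1f}M parameters")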
Decide on the training parameters. (I set mixed_precision to False because it was causing the model metrics to be NaN, probably because my GPU has only 6GB of memory, which is not enough. Depending on your GPU, you can set mixed_precision to True.)
from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback
# Epoch Number
MAX_EPOCHS = 20
train_params = {
'silent_mode': False,
"average_best_models":True,
"warmup_mode": "linear_epoch_step",
"warmup_initial_lr": 1e-6,
"lr_warmup_epochs": 3,
"initial_lr": 5e-4,
"lr_mode": "cosine",
"cosine_final_lr_ratio": 0.1,
"optimizer": "Adam",
"optimizer_params": {"weight_decay": 0.0001},
"zero_weight_decay_on_bias_and_bn": True,
"ema": True,
"ema_params": {"decay": 0.9, "decay_type": "threshold"},
"max_epochs": MAX_EPOCHS,
"mixed_precision": False , # TRUE BY DEFAULT , depending to GPU setting this to True might cause nan value problem in metrics
"loss": PPYoloELoss(
use_static_assigner=False,
num_classes=len(dataset_params['classes']),
reg_max=16
),
"valid_metrics_list": [
DetectionMetrics_050(
score_thres=0.1,
top_k_predictions=300,
num_cls=len(dataset_params['classes']),
normalize_targets=True,
post_prediction_callback=PPYoloEPostPredictionCallback(
score_threshold=0.01,
nms_top_k=1000,
max_predictions=300,
nms_threshold=0.7
)
)
],
"metric_to_watch": 'mAP@0.50'
}
trainer.train(
model=model,
training_params=train_params,
train_loader=train_data,
valid_loader=val_data
)
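While training runs, super-gradients writes logs and checkpoints (e.g., ckpt_best.pth and average_model.pth) under {CHECKPOINT_DIR}/{EXPERIMENT_NAME}/RUN_<timestamp>/, as you can see in the checkpoint path used below. It also writes TensorBoard event files, so you can monitor training from a terminal with something like:

tensorboard --logdir C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model/checkpoints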
First, load the trained YOLO-NAS object detection model.
from super_gradients.training import models
best_model = models.get(
    MODEL_ARCH,
    num_classes=len(dataset_params['classes']),
    # Path pattern: {CHECKPOINT_DIR}/{EXPERIMENT_NAME}/RUN_<timestamp>/average_model.pth (adjust to your own run)
    checkpoint_path="C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model/checkpoints/yolonas-demo-m-1/RUN_20240927_182458_276109/average_model.pth"
).to(DEVICE)
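Optionally, evaluate the loaded checkpoint on the validation set before running inference. This is a sketch that reuses the same metric configuration as training; check your super-gradients version for the exact Trainer.test signature:

trainer.test(
    model=best_model,
    test_loader=val_data,
    test_metrics_list=DetectionMetrics_050(
        score_thres=0.1,
        top_k_predictions=300,
        num_cls=len(dataset_params['classes']),
        normalize_targets=True,
        post_prediction_callback=PPYoloEPostPredictionCallback(
            score_threshold=0.01,
            nms_top_k=1000,
            max_predictions=300,
            nms_threshold=0.7
        )
    )
)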
Make a prediction on an image. Don't forget to change image_path, and you can adjust the conf value (the minimum confidence threshold for detections).
import cv2

image_path = r"image.jpeg"
image = cv2.imread(image_path)

# Predict
model_result = best_model.predict(image, conf=0.5)
print(model_result.prediction)
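If you don't need custom drawing, super-gradients can render the detections itself; depending on your version, the prediction object exposes show() and save():

# Let super-gradients draw the boxes (API may vary slightly between versions)
model_result.show()
model_result.save("prediction.jpg")  # hypothetical output path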
Now, display the bounding boxes and labels:
import cv2
import matplotlib.pyplot as plt
# Map class indices back to names (same order as dataset_params['classes'])
label_dict = {0: "futbol", 1: "player", 2: "referree"}

# Reload the image so we draw on a clean copy
image = cv2.imread(image_path)
# Bounding boxes, labels, confidence, and label dictionary
bboxes = model_result.prediction.bboxes_xyxy
confidences = model_result.prediction.confidence
labels = model_result.prediction.labels
# Draw bounding boxes and labels on the image
for bbox, confidence, label in zip(bboxes, confidences, labels):
    # Coordinates of the bounding box
    x1, y1, x2, y2 = [int(coord) for coord in bbox]
    # Draw the rectangle
    cv2.rectangle(image, (x1, y1), (x2, y2), color=(0, 255, 0), thickness=2)
    # Create the label text with confidence (cast to int; predicted labels may come back as floats)
    label_text = f"{label_dict[int(label)]}: {confidence:.2f}"
    # Put the label text above the bounding box
    cv2.putText(image, label_text, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX,
                fontScale=0.5, color=(255, 255, 255), thickness=1, lineType=cv2.LINE_AA)
# Convert BGR to RGB for displaying in matplotlib
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# Display the image using matplotlib
plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)
plt.axis('off') # Turn off axis
plt.show()
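If you also want to keep the annotated image on disk, write out the BGR version with OpenCV:

# Save the annotated image; cv2.imwrite expects BGR, so use `image`, not `image_rgb`
cv2.imwrite("annotated_result.jpg", image)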