Pipeline for Training Custom YOLO-NAS Object Detection Models

→ Step-by-step guide for training YOLO-NAS object detection models in PyTorch using custom datasets.

New YOLO versions are published every year; the last one I saw was YOLOv12. If you have used YOLO models before, you probably used YOLOv5, YOLOv8, or some other YOLOv{some number}.

There are also YOLO models you may not have heard of, or have heard of but never used, such as YOLOX and YOLO-NAS. In this article, I will share my pipeline for training custom YOLO-NAS models and for making predictions with the trained model.

[Figure: Step-by-step guide for training a custom YOLO-NAS object detection model in Python]

One advantage YOLO-NAS may have over other YOLO variants is licensing. If you are planning to use your YOLO model commercially, YOLO-NAS might be a good choice. I don’t want to give misinformation, so before you rely on this, read the license terms on its GitHub page, for both the code and the pretrained weights.

Dataset Preparation & Environment

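Now, I will train a model to show how the pipeline works. I chose a small dataset from Roboflow. You can pick any dataset and follow this pipeline, because the steps are the same. Don’t forget to export the dataset in a YOLO format (separate images and labels folders, with one .txt label file per image), because that is the layout the dataloaders used later in this guide read.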
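Before training, it can help to confirm that the exported dataset has the folder layout the later steps point at (an images folder and a labels folder per split). The sketch below is a hypothetical helper, not part of SuperGradients; it assumes the usual YOLO-format convention that each image has a label file with the same file stem and a .txt extension.

```python
from pathlib import Path

# Image extensions to look for; extend this set if your dataset uses others.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(images_dir, labels_dir):
    """Return the names of images that have no matching .txt label file."""
    images_dir, labels_dir = Path(images_dir), Path(labels_dir)
    missing = []
    for image_path in sorted(images_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip non-image files such as READMEs
        if not (labels_dir / f"{image_path.stem}.txt").exists():
            missing.append(image_path.name)
    return missing


# Example usage (paths are placeholders for your own dataset folder):
# for split in ("train", "valid"):
#     print(split, find_unlabeled_images(f"dataset/{split}/images",
#                                        f"dataset/{split}/labels"))
```

An empty list for every split means each image has a label file. Depending on how the dataset was exported, images without any objects may legitimately have an empty or missing label file, so treat the result as a hint rather than an error.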

I have a GPU-supported PyTorch environment, and I will train my model locally on my computer. If you don’t have a GPU-supported environment, you can use Kaggle or Google Colab; it will save you a huge amount of time.
If you want to create a GPU-supported PyTorch environment, you can watch this video.

[Figure: Comparison of YOLO-NAS with other YOLO object detection models]

There are 6 main steps:

  1. Install Necessary Libraries
  2. Define Dataset Parameters
  3. Create Train and Validation Sets
  4. Load YOLO-NAS Model with Pretrained Weights
  5. Configure Training Parameters, Loss, and Metrics
  6. Train the Model

1. Install Necessary Libraries & Set Directories

!pip install super_gradients
import torch

# Check if a GPU is available
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Base directory, experiment name, and checkpoint directory
HOME = "C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model"
EXPERIMENT_NAME = "yolonas-m-model-1--20epoch"
CHECKPOINT_DIR = f"{HOME}/checkpoints"

2. Define Dataset Parameters

Initialize the Trainer object and create the dataset parameters. You need to set the dataset paths and the class names the model will predict.

from super_gradients.training import Trainer

trainer = Trainer(experiment_name=EXPERIMENT_NAME, ckpt_root_dir=CHECKPOINT_DIR)

# Dataset paths and class names
dataset_params = {
    'data_dir': "C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model/dataset",
    'train_images_dir': 'train/images',
    'train_labels_dir': 'train/labels',
    'val_images_dir': 'valid/images',
    'val_labels_dir': 'valid/labels',
    'classes': ['futbol', 'player', 'referree']  # class names, in the same order as the class ids in the label files
}
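The class list only makes sense if the ids in the label files stay within its range. As a minimal sketch, assuming the common YOLO label layout of one object per line written as `class_id x_center y_center width height` with coordinates normalized to [0, 1], a single label line could be checked like this. `parse_yolo_label_line` is a hypothetical helper, not a SuperGradients function.

```python
CLASSES = ['futbol', 'player', 'referree']


def parse_yolo_label_line(line, num_classes):
    """Parse one 'class_id x_center y_center width height' label line."""
    parts = line.split()
    if len(parts) != 5:
        raise ValueError(f"expected 5 fields, got {len(parts)}: {line!r}")
    class_id = int(parts[0])
    box = tuple(float(value) for value in parts[1:])
    if not 0 <= class_id < num_classes:
        raise ValueError(f"class id {class_id} is outside 0..{num_classes - 1}")
    if not all(0.0 <= value <= 1.0 for value in box):
        raise ValueError(f"box values must be normalized to [0, 1]: {box}")
    return class_id, box


class_id, box = parse_yolo_label_line("1 0.50 0.40 0.10 0.25", len(CLASSES))
print(CLASSES[class_id], box)  # player (0.5, 0.4, 0.1, 0.25)
```

A class id of 3 would raise an error here, which usually means the 'classes' list is missing a name or is in a different order than the one used when the dataset was labeled.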

3. Create Train and Validation Sets

from super_gradients.training.dataloaders.dataloaders import (
    coco_detection_yolo_format_train, coco_detection_yolo_format_val)

# Increase this if your GPU has enough memory; larger batches usually make each epoch faster
BATCH_SIZE = 4

train_data = coco_detection_yolo_format_train(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['train_images_dir'],
        'labels_dir': dataset_params['train_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)

val_data = coco_detection_yolo_format_val(
    dataset_params={
        'data_dir': dataset_params['data_dir'],
        'images_dir': dataset_params['val_images_dir'],
        'labels_dir': dataset_params['val_labels_dir'],
        'classes': dataset_params['classes']
    },
    dataloader_params={
        'batch_size': BATCH_SIZE,
        'num_workers': 2
    }
)
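To get a feel for how the batch size affects an epoch, the number of optimizer steps per epoch can be estimated with simple arithmetic. The sketch below uses a made-up dataset size of 1000 images; `steps_per_epoch` is a hypothetical helper, and the `drop_last` flag only mirrors the usual dataloader option of discarding an incomplete final batch.

```python
import math


def steps_per_epoch(num_images, batch_size, drop_last=False):
    """Number of batches the dataloader yields in one pass over the data."""
    if drop_last:
        return num_images // batch_size
    return math.ceil(num_images / batch_size)


# 1000 images is an illustrative number, not the size of the dataset used here.
for batch_size in (4, 8, 16):
    print(batch_size, steps_per_epoch(1000, batch_size))
# 4 250
# 8 125
# 16 63
```

Fewer steps per epoch usually means a faster epoch, but only as long as the batch still fits in GPU memory.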

4. Load Pretrained YOLO-NAS Model

In general, larger models are more accurate, but they run slower than lighter ones.
I will use the YOLO-NAS M model (yolo_nas_m); you can choose another size, such as yolo_nas_s or yolo_nas_l, and the rest of the pipeline stays the same.

from super_gradients.training import models

MODEL_ARCH = "yolo_nas_m" 

model = models.get(
    MODEL_ARCH,  # yolo_nas_m
    num_classes=len(dataset_params['classes']),
    pretrained_weights="coco"
)
[Figure: Pretrained YOLO-NAS object detection models]

5. Configure Training Parameters, Loss, and Metrics

Decide on the training parameters. I set mixed_precision to False because enabling it made the model metrics NaN in my runs. My guess is that this was related to my GPU, which has only 6 GB of memory. Depending on your GPU, you can set mixed_precision to True.
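As general background, one common way NaN or inf values appear under mixed precision is the narrow range of 16-bit floats. The sketch below only illustrates that range with Python's standard library; it is not a diagnosis of the run described above, and `to_float16` is a hypothetical helper.

```python
import struct


def to_float16(value):
    """Round-trip a Python float through IEEE 754 half precision."""
    return struct.unpack("<e", struct.pack("<e", value))[0]


print(to_float16(65504.0))  # the largest finite float16 value survives
print(to_float16(1e-8))     # a tiny value underflows to 0.0

try:
    to_float16(70000.0)     # too large to represent in half precision
except OverflowError as error:
    print("overflow:", error)
```

Mixed-precision training normally relies on loss scaling to keep values inside this range, so if the metrics still become NaN, turning mixed precision off as done here is a reasonable fallback.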

from super_gradients.training.losses import PPYoloELoss
from super_gradients.training.metrics import DetectionMetrics_050
from super_gradients.training.models.detection_models.pp_yolo_e import PPYoloEPostPredictionCallback

# Epoch Number
MAX_EPOCHS = 20

train_params = {
    "silent_mode": False,
    "average_best_models": True,
    "warmup_mode": "linear_epoch_step",
    "warmup_initial_lr": 1e-6,
    "lr_warmup_epochs": 3,
    "initial_lr": 5e-4,
    "lr_mode": "cosine",
    "cosine_final_lr_ratio": 0.1,

    "optimizer": "Adam",
    "optimizer_params": {"weight_decay": 0.0001},
    "zero_weight_decay_on_bias_and_bn": True,
    "ema": True,
    "ema_params": {"decay": 0.9, "decay_type": "threshold"},
    "max_epochs": MAX_EPOCHS,
    "mixed_precision": False,  # on my GPU, True caused NaN values in the metrics; try True on yours
    "loss": PPYoloELoss(
        use_static_assigner=False,
        num_classes=len(dataset_params['classes']),
        reg_max=16
    ),
    "valid_metrics_list": [
        DetectionMetrics_050(
            score_thres=0.1,
            top_k_predictions=300,
            num_cls=len(dataset_params['classes']),
            normalize_targets=True,
            post_prediction_callback=PPYoloEPostPredictionCallback(
                score_threshold=0.01,
                nms_top_k=1000,
                max_predictions=300,
                nms_threshold=0.7
            )
        )
    ],
    "metric_to_watch": 'mAP@0.50'
}
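To see what the warmup and cosine settings above roughly mean, the learning rate can be sketched per epoch. This is an approximation written for illustration, assuming a linear ramp from warmup_initial_lr to initial_lr over the warmup epochs, followed by cosine decay down to initial_lr * cosine_final_lr_ratio. The exact values SuperGradients uses may differ, for example in how it counts warmup epochs inside the cosine period; `approx_lr` is a hypothetical helper.

```python
import math


def approx_lr(epoch, max_epochs=20, warmup_epochs=3,
              warmup_initial_lr=1e-6, initial_lr=5e-4, final_lr_ratio=0.1):
    """Approximate per-epoch learning rate: linear warmup, then cosine decay."""
    if epoch < warmup_epochs:
        fraction = epoch / warmup_epochs
        return warmup_initial_lr + fraction * (initial_lr - warmup_initial_lr)
    # Progress through the cosine phase, from 0.0 to 1.0.
    progress = (epoch - warmup_epochs) / max(1, max_epochs - warmup_epochs)
    final_lr = initial_lr * final_lr_ratio
    return final_lr + 0.5 * (initial_lr - final_lr) * (1 + math.cos(math.pi * progress))


for epoch in (0, 1, 3, 10, 20):
    print(epoch, f"{approx_lr(epoch):.6f}")
```

Under these assumptions the rate starts at 1e-6, reaches 5e-4 when warmup ends, and finishes at 5e-5, which is initial_lr multiplied by cosine_final_lr_ratio.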

6. Train the Model

trainer.train(
    model=model,
    training_params=train_params,
    train_loader=train_data,
    valid_loader=val_data
)
[Figure: Training a custom YOLO-NAS object detection model]
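The metric watched during training, mAP@0.50, is built on intersection over union (IoU): a predicted box counts as a match for a ground-truth box of the same class when their IoU is at least 0.5. The sketch below shows the IoU calculation for two boxes in (x1, y1, x2, y2) format; the box values are made up for illustration, and `iou_xyxy` is a hypothetical helper rather than the function SuperGradients uses internally.

```python
def iou_xyxy(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap, clamped to zero when boxes do not touch.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (100, 100, 200, 200)
prediction = (120, 110, 210, 205)
print(f"{iou_xyxy(ground_truth, prediction):.2f}")  # 0.63, a match at the 0.50 threshold
```

The same quantity is behind nms_threshold in the post-prediction callback: overlapping predictions whose IoU exceeds that value are treated as duplicates of one object.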

Make Predictions with the Trained Model

First, load the trained YOLO-NAS object detection model. Set checkpoint_path to the checkpoint file produced by your own training run.

from super_gradients.training import models

best_model = models.get(
    MODEL_ARCH,
    num_classes=len(dataset_params['classes']),
    checkpoint_path="C:/ml_dl_cv_files/ObjectDetection-Yolo-TF-Models/Custom-YOLONAS-model/checkpoints/yolonas-demo-m-1/RUN_20240927_182458_276109/average_model.pth"
).to(DEVICE)
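Typing the run folder by hand is error-prone, so the newest checkpoint can also be located programmatically. This is a minimal sketch that assumes the layout seen in the path above, that is, checkpoint directory, then experiment name, then a RUN_ folder with a timestamp, then the checkpoint file. `find_latest_checkpoint` is a hypothetical helper and not part of SuperGradients.

```python
from pathlib import Path


def find_latest_checkpoint(checkpoint_dir, experiment_name,
                           filename="average_model.pth"):
    """Return the newest RUN_* checkpoint path for an experiment, or None."""
    experiment_dir = Path(checkpoint_dir) / experiment_name
    if not experiment_dir.is_dir():
        return None
    # RUN_YYYYMMDD_HHMMSS_... names sort chronologically as plain strings.
    run_dirs = sorted(p for p in experiment_dir.glob("RUN_*") if p.is_dir())
    for run_dir in reversed(run_dirs):
        candidate = run_dir / filename
        if candidate.exists():
            return candidate
    return None


# Example usage with the variables defined earlier in this guide:
# checkpoint_path = find_latest_checkpoint(CHECKPOINT_DIR, EXPERIMENT_NAME)
# print(checkpoint_path)
```

If the function returns None, either training has not finished writing that file yet or the experiment name does not match the folder on disk. Pass str(checkpoint_path) if the loading code expects a plain string.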

Make a prediction on an image. Don’t forget to change image_path; you can also adjust the conf value, which is the minimum confidence a detection needs to be kept.

import cv2

image_path=r"image.jpeg"
image = cv2.imread(image_path)

# predict
model_result = best_model.predict(image, conf=0.5)

print(model_result.prediction)
[Figure: Output of the YOLO-NAS object detection model]

Now, display the bounding boxes and labels on the image.

import cv2
import matplotlib.pyplot as plt

label_dict={0:"futbol",1:"player",2:"referree"}

# Load the image (replace with your image loading method)
image = cv2.imread(image_path)

# Bounding boxes, labels, confidence, and label dictionary
bboxes = model_result.prediction.bboxes_xyxy
confidences = model_result.prediction.confidence
labels = model_result.prediction.labels

# Draw bounding boxes and labels on the image
for i in range(len(bboxes)):
    bbox = bboxes[i]
    confidence = confidences[i]
    label = labels[i]
    
    # Coordinates of the bounding box
    x1, y1, x2, y2 = [int(coord) for coord in bbox]

    # Draw the rectangle
    cv2.rectangle(image, (x1, y1), (x2, y2), color=(0, 255, 0), thickness=2)

    # Create label text with confidence (cast the label to int so the dictionary lookup is safe)
    label_text = f"{label_dict[int(label)]}: {confidence:.2f}"

    # Put the label text above the bounding box
    cv2.putText(image, label_text, (x1, y1 - 10), cv2.FONT_HERSHEY_SIMPLEX, 
                fontScale=0.5, color=(255, 255, 255), thickness=1, lineType=cv2.LINE_AA)

# Convert BGR to RGB for displaying in matplotlib
image_rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)

# Display the image using matplotlib
plt.figure(figsize=(10, 10))
plt.imshow(image_rgb)
plt.axis('off')  # Turn off axis
plt.show()
[Figure: Object detection with the custom YOLO-NAS object detection model]
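Besides drawing the boxes, it is often useful to summarize what was detected. The sketch below counts detections per class above a confidence threshold. It uses plain lists with made-up values as stand-ins for the labels and confidence arrays read from the prediction, and `summarize_detections` is a hypothetical helper.

```python
from collections import Counter


def summarize_detections(labels, confidences, class_names, min_conf=0.5):
    """Count detections per class name, ignoring low-confidence ones."""
    counts = Counter()
    for label, confidence in zip(labels, confidences):
        if confidence >= min_conf:
            counts[class_names[int(label)]] += 1
    return dict(counts)


# Stand-in values; in the pipeline these would come from
# model_result.prediction.labels and model_result.prediction.confidence.
labels = [1, 1, 0, 2, 1]
confidences = [0.91, 0.84, 0.77, 0.62, 0.41]
print(summarize_detections(labels, confidences, ['futbol', 'player', 'referree']))
# {'player': 2, 'futbol': 1, 'referree': 1}
```

The last detection is dropped because its confidence of 0.41 is below the 0.5 threshold.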