A step-by-step guide to running YOLO object detection models in C++.
Running and training object detection models in Python has become quite easy thanks to user-friendly libraries like Ultralytics, but what about running YOLO models in C++? Many computer vision applications are written in C++, especially when performance matters, so it is important to know how to use YOLO models from C++.
In this article, I will share a step-by-step guide to running YOLO models in C++ using only the OpenCV library.

This article is about running YOLOv5 models on the CPU, not the GPU. Running models on the GPU requires installing CUDA, cuDNN, and other dependencies that can be confusing to set up. I will write another article in the future about how to run YOLO models with CUDA support.
Keep in mind that for higher FPS, you should run your models with CUDA support.
For now, you only need to install the OpenCV library. If you haven't installed it yet, you can install it from this link.
Okay, let's start.
There are several model formats, and ONNX is one of the most popular: you can find all kinds of models in ONNX format, including object detection, image segmentation, and image classification models.
Now, create a new folder and clone the yolov5 repository from the terminal:
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt
I will use the pretrained yolov5s.pt model, but you can use your own custom YOLOv5 models; the process doesn't change. You can download pretrained models from this link.
If you don't want to use pretrained models, or if you want to train your own custom models, I have an article about that; you can read it.

Now let's export the YOLO model to ONNX format. There are different export parameters; you can check the image below. You can edit the export.py file (yolov5/export.py) or pass the parameters from the terminal, just like I did here. For custom models, you need to change --weights to your custom model weights (your_model.pt file).
python yolov5/export.py --weights yolov5s.pt --img 640 --include onnx --opset 12
Important note: you need to set --opset to 12 here; otherwise the export will probably fail with an error. This is a common issue, and you can check GitHub to learn more about it.
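Before moving on, you can quickly check that OpenCV is able to read the exported model. This is an optional, minimal sketch, assuming yolov5s.onnx sits in the current directory:
#include <iostream>
#include <opencv2/opencv.hpp>

int main()
{
    try
    {
        // readNetFromONNX throws a cv::Exception if the model cannot be parsed
        cv::dnn::Net net = cv::dnn::readNetFromONNX("yolov5s.onnx");
        std::cout << "Model loaded successfully\n";
    }
    catch (const cv::Exception &e)
    {
        std::cerr << "Failed to load model: " << e.what() << "\n";
    }
    return 0;
}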

This step is quite easy; you just need to create a txt file for storing the labels. If you're using a pretrained YOLO model like me, you can download the txt file directly from this link.
If you have a custom model, create a new txt file and write your labels in it, one label per line, in the same format as shown below. You can name this file whatever you want; it doesn't matter, just don't forget to change the file name in the code when needed.
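For reference, the first few lines of the standard COCO label file look like this (one class name per line, in class-ID order):
person
bicycle
car
motorcycle
airplane
...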

Now, let's create a CMakeLists.txt file. This file is required when using CMake to compile a C++ program. If you installed OpenCV from the link that I shared, you already have CMake installed.
Don't forget to change the paths and names marked in the comments:
cmake_minimum_required(VERSION 3.10)
project(cpp-yolo-detection) # your folder name here
# Find OpenCV
set(OpenCV_DIR C:/Libraries/opencv/build) # path to opencv
find_package(OpenCV REQUIRED)
add_executable(object-detection object-detection.cpp) # your file name
# Link OpenCV libraries
target_link_libraries(object-detection ${OpenCV_LIBS})
Finally, this is the last step. I used code from this repository, but I modified some parts and added comments to help you understand it better.
#include <fstream>
#include <iostream>
#include <string>
#include <vector>
#include <opencv2/opencv.hpp>
// Load labels from coco-classes.txt file
std::vector<std::string> load_class_list()
{
    std::vector<std::string> class_list;
    // Change this path to your own txt file that contains the labels
    std::ifstream ifs("C:/Users/sirom/Desktop/cpp-ultralytics/coco-classes.txt");
    std::string line;
    while (getline(ifs, line))
    {
        class_list.push_back(line);
    }
    return class_list;
}
// Model 
void load_net(cv::dnn::Net &net)
{   
    // change this path to your model path 
    auto result = cv::dnn::readNet("C:/Users/sirom/Desktop/cpp-ultralytics/yolov5s.onnx");
    std::cout << "Running on CPU\n";
    result.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    result.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
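    // If you build OpenCV with CUDA support later (not covered in this article),
    // you could instead select the GPU here:
    //     result.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
    //     result.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);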
 
    net = result;
}
const std::vector<cv::Scalar> colors = {cv::Scalar(255, 255, 0), cv::Scalar(0, 255, 0), cv::Scalar(0, 255, 255), cv::Scalar(255, 0, 0)};
// You can tune these parameters to obtain better results
const float INPUT_WIDTH = 640.0;        // model input width
const float INPUT_HEIGHT = 640.0;       // model input height
const float SCORE_THRESHOLD = 0.5;      // minimum class score
const float NMS_THRESHOLD = 0.5;        // IoU threshold for non-maximum suppression
const float CONFIDENCE_THRESHOLD = 0.5; // minimum objectness confidence
struct Detection
{
    int class_id;
    float confidence;
    cv::Rect box;
};
// yolov5 format
cv::Mat format_yolov5(const cv::Mat &source) {
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}
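// Example for intuition: a 1280x720 image becomes a 1280x1280 canvas with black
// padding below the original pixels; after blobFromImage resizes it to 640x640,
// x_factor and y_factor in detect() are both 1280 / 640 = 2.0.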
// Detection function
void detect(cv::Mat &image, cv::dnn::Net &net, std::vector<Detection> &output, const std::vector<std::string> &className) {
    cv::Mat blob;
    // Format the input image to fit the model input requirements
    auto input_image = format_yolov5(image);
    
    // Convert the image into a blob and set it as input to the network
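    // blobFromImage also scales pixel values to [0,1] (factor 1/255.) and swaps
    // BGR to RGB (swapRB = true), which is what the exported YOLOv5 model expects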
    cv::dnn::blobFromImage(input_image, blob, 1./255., cv::Size(INPUT_WIDTH, INPUT_HEIGHT), cv::Scalar(), true, false);
    net.setInput(blob);
    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());
    // Scaling factors to map the bounding boxes back to original image size
    float x_factor = input_image.cols / INPUT_WIDTH;
    float y_factor = input_image.rows / INPUT_HEIGHT;
    
    float *data = (float *)outputs[0].data;
    // Each row holds 85 values: x, y, w, h, objectness confidence + 80 class scores (COCO)
    const int dimensions = 85;
    // 25200 rows = (80*80 + 40*40 + 20*20) grid cells * 3 anchors at 640x640 input
    const int rows = 25200;
    
    std::vector<int> class_ids; // Stores class IDs of detections
    std::vector<float> confidences; // Stores confidence scores of detections
    std::vector<cv::Rect> boxes;   // Stores bounding boxes
   // Loop through all the rows to process predictions
    for (int i = 0; i < rows; ++i) {
        // Get the confidence of the current detection
        float confidence = data[4];
        // Process only detections with confidence above the threshold
        if (confidence >= CONFIDENCE_THRESHOLD) {
            
            // Get class scores and find the class with the highest score
            float * classes_scores = data + 5;
            cv::Mat scores(1, (int)className.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double max_class_score;
            minMaxLoc(scores, 0, &max_class_score, 0, &class_id);
            // If the class score is above the threshold, store the detection
            if (max_class_score > SCORE_THRESHOLD) {
                confidences.push_back(confidence);
                class_ids.push_back(class_id.x);
                // Calculate the bounding box coordinates
                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];
                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);
                int width = int(w * x_factor);
                int height = int(h * y_factor);
                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
        data += dimensions; // move to the next row (85 values per detection)
    }
    // Apply Non-Maximum Suppression: overlapping boxes (IoU above NMS_THRESHOLD)
    // are collapsed into the single highest-confidence detection
    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, SCORE_THRESHOLD, NMS_THRESHOLD, nms_result);
    // Draw the NMS filtered boxes and push results to output
    for (size_t i = 0; i < nms_result.size(); i++) {
        int idx = nms_result[i];
        // Only push the filtered detections
        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];
        result.box = boxes[idx];
        output.push_back(result);
        // Draw the final NMS bounding box and label
        cv::rectangle(image, boxes[idx], cv::Scalar(0, 255, 0), 3);
        std::string label = className[class_ids[idx]];
        cv::putText(image, label, cv::Point(boxes[idx].x, boxes[idx].y - 5), cv::FONT_HERSHEY_SIMPLEX, 2, cv::Scalar(255, 255, 255), 2);
    }
}
int main(int argc, char **argv)
{   
    // Load class list 
    std::vector<std::string> class_list = load_class_list();
    // Load input image
    std::string image_path = cv::samples::findFile("C:/Users/sirom/Desktop/cpp-ultralytics/test2.jpg");
    cv::Mat frame = cv::imread(image_path, cv::IMREAD_COLOR);
    // Load the model
    cv::dnn::Net net;
    load_net(net);
    // Vector to store detection results
    std::vector<Detection> output;
    // Run detection on the input image
    detect(frame, net, output, class_list);
    // Save the result to a file
    cv::imwrite("C:/Users/sirom/Desktop/cpp-ultralytics/result.jpg", frame);
    while (true)
    {       
        // display image
        cv::imshow("image",frame);
        // Exit the loop if any key is pressed
        if (cv::waitKey(1) != -1)
        {
            std::cout << "finished by user\n";
            break;
        }
    }
    std::cout << "Processing complete. Image saved.\n";
    return 0;
}
Finally, compile and run the program from the terminal:
mkdir build
cd build
cmake ..
cmake --build .
.\Debug\object-detection.exe
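If you want to run the model on a webcam or a video file instead of a single image, only main() changes. Here is a minimal sketch, assuming the same helper functions (load_class_list, load_net, detect) from the code above; camera index 0 is an assumption for the default webcam:
int main()
{
    std::vector<std::string> class_list = load_class_list();
    cv::dnn::Net net;
    load_net(net);

    cv::VideoCapture capture(0); // 0 = default webcam; a video file path also works
    if (!capture.isOpened())
    {
        std::cerr << "Cannot open camera\n";
        return -1;
    }

    cv::Mat frame;
    while (capture.read(frame))
    {
        // Run detection on each frame and draw the boxes in place
        std::vector<Detection> output;
        detect(frame, net, output, class_list);
        cv::imshow("image", frame);
        if (cv::waitKey(1) != -1) // exit on any key press
            break;
    }
    return 0;
}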
