Step-by-step guide for YOLO object detection models in C++.
Running and training object detection models in Python has become quite easy thanks to user-friendly libraries like Ultralytics, but what about running YOLO models in C++? Many applications use C++ for computer vision, especially when performance matters, so it is important to know how to use YOLO models from C++.
In this article, I will share a step-by-step guide to running YOLO models in C++ using only the OpenCV library.

This article is about how to run YOLOv5 models on the CPU, not the GPU. Running models on the GPU requires installing CUDA, cuDNN, and other dependencies that can be confusing to set up. I will write another article in the future about how to run YOLO models with CUDA support.
Keep in mind that for higher FPS, you should run your models with CUDA support.
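For reference, here is a minimal sketch of what that change looks like inside the model loading code shown later in this article, assuming you have an OpenCV build compiled with CUDA support:
// Minimal sketch, assuming OpenCV was built with CUDA and cuDNN support
result.setPreferableBackend(cv::dnn::DNN_BACKEND_CUDA);
result.setPreferableTarget(cv::dnn::DNN_TARGET_CUDA);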
For now, you only need to install the OpenCV library. If you haven't installed it yet, you can install it from this link.
Okay, let's start.
There are different model formats, and ONNX is one of the most popular. You can find all kinds of models in ONNX format: object detection, image segmentation, and image classification models.
Now, create a new folder and clone the yolov5 repository from the terminal:
git clone https://github.com/ultralytics/yolov5
cd yolov5
pip install -r requirements.txt
I will use the pretrained yolov5s.pt model, but you can use your own custom YOLOv5 models; the process doesn't change. You can download pretrained models from this link.
If you don't want to use pretrained models, or if you want to train your own custom model, I have an article about that which you can read.

Now let's export the YOLO model to ONNX format. There are several export parameters; you can list them all by running export.py with the --help flag. You can either edit the export.py file (yolov5/export.py) or pass the parameters from the terminal, just like I did here. For custom models, you need to change the --weights argument to your custom model weights (your_model.pt file):
python yolov5/export.py --weights yolov5s.pt --img 640 --include onnx --opset 12
Important Note: You need to set --opset to 12 here; otherwise, OpenCV will probably give an error when loading the model. This is a common issue, and you can check GitHub to learn more about this error.

This step is quite easy; you just need to create a txt file for storing the labels. If you're using a pretrained YOLO model like me, you can download the txt file directly from this link.
If you have a custom model, create a new txt file and write your labels in it, one label per line, in the same format shown below. You can name this file whatever you want; it doesn't matter, just don't forget to change the file name in the code accordingly.
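For reference, here is what the beginning of the label file looks like for the pretrained COCO model (one class name per line, in the exact order the model was trained with):
person
bicycle
car
motorcycle
airplane
The file continues this way for all 80 COCO classes.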

Now, let's create a CMakeLists.txt file. This file is required when using CMake to compile a C++ program. If you installed OpenCV from the link that I shared, you already have CMake installed.
Don't forget to change the paths and names marked in the comments:
cmake_minimum_required(VERSION 3.10)
project(cpp-yolo-detection) # your folder name here
# Find OpenCV
set(OpenCV_DIR C:/Libraries/opencv/build) # path to opencv
find_package(OpenCV REQUIRED)
add_executable(object-detection object-detection.cpp) # your file name
# Link OpenCV libraries
target_link_libraries(object-detection ${OpenCV_LIBS})
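Save this file as CMakeLists.txt in your project folder, next to your object-detection.cpp source file, since the build commands at the end of this article assume that layout.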
Finally, this is the last step. I used code from this repository, but I modified some parts and added comments to help you understand it better.
#include <fstream>
#include <iostream>
#include <opencv2/opencv.hpp>

// Load labels from the coco-classes.txt file
std::vector<std::string> load_class_list()
{
    std::vector<std::string> class_list;
    // Change this path to your own txt file that contains the labels
    std::ifstream ifs("C:/Users/sirom/Desktop/cpp-ultralytics/coco-classes.txt");
    std::string line;
    while (getline(ifs, line))
    {
        class_list.push_back(line);
    }
    return class_list;
}

// Load the ONNX model and configure it to run on the CPU
void load_net(cv::dnn::Net &net)
{
    // Change this path to your model path
    auto result = cv::dnn::readNet("C:/Users/sirom/Desktop/cpp-ultralytics/yolov5s.onnx");
    std::cout << "Running on CPU\n";
    result.setPreferableBackend(cv::dnn::DNN_BACKEND_OPENCV);
    result.setPreferableTarget(cv::dnn::DNN_TARGET_CPU);
    net = result;
}

const std::vector<cv::Scalar> colors = {cv::Scalar(255, 255, 0), cv::Scalar(0, 255, 0), cv::Scalar(0, 255, 255), cv::Scalar(255, 0, 0)};

// You can change these parameters to obtain better results
const float INPUT_WIDTH = 640.0;
const float INPUT_HEIGHT = 640.0;
const float SCORE_THRESHOLD = 0.5;
const float NMS_THRESHOLD = 0.5;
const float CONFIDENCE_THRESHOLD = 0.5;

// A single detection: class index, confidence score, and bounding box
struct Detection
{
    int class_id;
    float confidence;
    cv::Rect box;
};

// Pad the input image to a square so resizing to 640x640 does not distort it
cv::Mat format_yolov5(const cv::Mat &source) {
    int col = source.cols;
    int row = source.rows;
    int _max = MAX(col, row);
    cv::Mat result = cv::Mat::zeros(_max, _max, CV_8UC3);
    source.copyTo(result(cv::Rect(0, 0, col, row)));
    return result;
}

// Detection function
void detect(cv::Mat &image, cv::dnn::Net &net, std::vector<Detection> &output, const std::vector<std::string> &className) {
    cv::Mat blob;

    // Format the input image to fit the model input requirements
    auto input_image = format_yolov5(image);

    // Convert the image into a blob and set it as input to the network
    cv::dnn::blobFromImage(input_image, blob, 1. / 255., cv::Size(INPUT_WIDTH, INPUT_HEIGHT), cv::Scalar(), true, false);
    net.setInput(blob);
    std::vector<cv::Mat> outputs;
    net.forward(outputs, net.getUnconnectedOutLayersNames());

    // Scaling factors to map the bounding boxes back to the original image size
    float x_factor = input_image.cols / INPUT_WIDTH;
    float y_factor = input_image.rows / INPUT_HEIGHT;

    float *data = (float *)outputs[0].data;
    // YOLOv5 outputs 25200 candidate boxes (3 anchors x (80^2 + 40^2 + 20^2) grid cells),
    // each with 85 values: x, y, w, h, objectness, and 80 class scores
    const int dimensions = 85;
    const int rows = 25200;

    std::vector<int> class_ids;     // Stores class IDs of detections
    std::vector<float> confidences; // Stores confidence scores of detections
    std::vector<cv::Rect> boxes;    // Stores bounding boxes

    // Loop through all the rows to process predictions
    for (int i = 0; i < rows; ++i) {
        // Get the objectness confidence of the current detection
        float confidence = data[4];
        // Process only detections with confidence above the threshold
        if (confidence >= CONFIDENCE_THRESHOLD) {
            // Get class scores and find the class with the highest score
            float *classes_scores = data + 5;
            cv::Mat scores(1, (int)className.size(), CV_32FC1, classes_scores);
            cv::Point class_id;
            double max_class_score;
            cv::minMaxLoc(scores, 0, &max_class_score, 0, &class_id);
            // If the class score is above the threshold, store the detection
            if (max_class_score > SCORE_THRESHOLD) {
                confidences.push_back(confidence);
                class_ids.push_back(class_id.x);
                // Convert the center-based box to a top-left based cv::Rect in original image coordinates
                float x = data[0];
                float y = data[1];
                float w = data[2];
                float h = data[3];
                int left = int((x - 0.5 * w) * x_factor);
                int top = int((y - 0.5 * h) * y_factor);
                int width = int(w * x_factor);
                int height = int(h * y_factor);
                boxes.push_back(cv::Rect(left, top, width, height));
            }
        }
        data += dimensions;
    }

    // Apply Non-Maximum Suppression to remove overlapping boxes
    std::vector<int> nms_result;
    cv::dnn::NMSBoxes(boxes, confidences, SCORE_THRESHOLD, NMS_THRESHOLD, nms_result);

    // Draw the NMS-filtered boxes and push results to output
    for (size_t i = 0; i < nms_result.size(); i++) {
        int idx = nms_result[i];
        // Only push the filtered detections
        Detection result;
        result.class_id = class_ids[idx];
        result.confidence = confidences[idx];
        result.box = boxes[idx];
        output.push_back(result);
        // Draw the final bounding box and label
        cv::rectangle(image, boxes[idx], cv::Scalar(0, 255, 0), 3);
        std::string label = className[class_ids[idx]];
        cv::putText(image, label, cv::Point(boxes[idx].x, boxes[idx].y - 5), cv::FONT_HERSHEY_SIMPLEX, 2, cv::Scalar(255, 255, 255), 2);
    }
}

int main(int argc, char **argv)
{
    // Load class list
    std::vector<std::string> class_list = load_class_list();

    // Load input image
    std::string image_path = cv::samples::findFile("C:/Users/sirom/Desktop/cpp-ultralytics/test2.jpg");
    cv::Mat frame = cv::imread(image_path, cv::IMREAD_COLOR);
    if (frame.empty())
    {
        std::cout << "Could not read the image\n";
        return 1;
    }

    // Load the model
    cv::dnn::Net net;
    load_net(net);

    // Vector to store detection results
    std::vector<Detection> output;

    // Run detection on the input image
    detect(frame, net, output, class_list);

    // Save the result to a file
    cv::imwrite("C:/Users/sirom/Desktop/cpp-ultralytics/result.jpg", frame);

    while (true)
    {
        // Display the image
        cv::imshow("image", frame);
        // Exit the loop if any key is pressed
        if (cv::waitKey(1) != -1)
        {
            std::cout << "finished by user\n";
            break;
        }
    }
    std::cout << "Processing complete. Image saved\n";
    return 0;
}
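Now open a terminal in your project folder, build the program with CMake, and run the executable: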
mkdir build
cd build
cmake ..
cmake --build .
.\Debug\object-detection.exe
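Note that the Debug subfolder in the last command is where the Visual Studio generator places the executable by default; if you are using a Makefile-based generator, the binary will be created directly inside the build folder instead.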
