Adjusting Object Orientation with Perspective Transformations Using OpenCV

→ Aligning images by applying perspective transformation with OpenCV in Python.

In computer vision tasks, images are rarely ready to use as they are. For example, let's say you want to use OCR algorithms to extract text from book pages, but the pages were photographed at different angles: some are straight, and others are rotated by varying amounts. For the best results, you need to align the images first.

In this article, I will show you how to align images using Perspective Transformations with OpenCV in Python.

Aligning Book Pages using Perspective Transformation with OpenCV in Python

Example Usage of Perspective Transformations: Aligning a Chessboard

I am currently working on a chess project in which I try to generate FEN (a text format for representing a chessboard position, used on platforms like lichess.org and chess.com) from an image. I tried different approaches for extracting the locations of the squares and the board, and one of them is Perspective Transformations. In the image below, the chessboard has been aligned with a perspective transformation.

Aligning a chessboard with Perspective Transformations in Python

I have tried several approaches for this task; if you are interested, you can find them in the GitHub repository.

How to apply Perspective Transformations to an image with OpenCV?

With OpenCV, applying a perspective transformation to an object within an image is not hard. You just need to define the coordinates of the object's corners.

You need to choose 4 coordinates: top-left, bottom-left, top-right, and bottom-right. You can choose the points manually by looking at the image or write a simple script that shows your mouse coordinates within the image.

There are two functions for using Perspective Transformations in OpenCV:

  • cv2.getPerspectiveTransform: This function computes the perspective transformation matrix M.
  • cv2.warpPerspective: This function applies the perspective transformation matrix M to an image.

For parameters and more information, you can check the official OpenCV documentation.
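To make the role of M more concrete, here is a minimal pure-Python sketch of how a 3x3 perspective matrix maps a single point: the point is written in homogeneous coordinates, multiplied by the matrix, and then divided by the third component. The helper name apply_perspective and the two example matrices are assumptions made for illustration; they are not part of OpenCV.

```python
def apply_perspective(M, x, y):
    """Map the point (x, y) through a 3x3 perspective matrix M (nested lists)."""
    # Multiply M by the homogeneous point (x, y, 1).
    xh = M[0][0] * x + M[0][1] * y + M[0][2]
    yh = M[1][0] * x + M[1][1] * y + M[1][2]
    w = M[2][0] * x + M[2][1] * y + M[2][2]
    if w == 0:
        raise ValueError("point maps to infinity (w == 0)")
    # The divide by w is what separates a perspective transform from an affine one.
    return xh / w, yh / w


# Hypothetical matrices, chosen only for illustration.
shift = [[1, 0, 10], [0, 1, 20], [0, 0, 1]]      # pure translation
tilt = [[1, 0, 0], [0, 1, 0], [0.001, 0, 1]]     # adds a perspective term

print(apply_perspective(shift, 5, 5))
print(apply_perspective(tilt, 1000, 500))
```

With the translation matrix the point is simply shifted, while the perspective term in the last row scales points depending on where they are in the image.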

Now, let’s implement what we learned in code with Python.

Read the image with OpenCV

Don't forget to change the image path.

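import cv2
import numpy as np
import matplotlib.pyplot as plt

image = cv2.imread(r"images/chess1.jpeg")
rgb_image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)  # OpenCV loads BGR, matplotlib expects RGB

plt.imshow(rgb_image)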
Aligning a Book Page Using Perspective Transformation with OpenCV in Python

Define 4 Points Around the Page

pt1 = [520, 300]    # top-left
pt2 = [300, 1100]   # bottom-left
pt3 = [1100, 440]   # top-right
pt4 = [870, 1250]   # bottom-right

Keep in mind that the order of the points matters: the input points must be listed in the same order as the output points they map to.
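If the corners come from a detector or from mouse clicks in arbitrary order, they can be sorted into the sequence used here. The sketch below is one possible heuristic, not something OpenCV provides: the helper name order_corners is hypothetical, and it assumes the object is not rotated by anything close to 45 degrees or more, since the sums and differences it relies on become ambiguous in that case.

```python
def order_corners(points):
    """Return four (x, y) points as top-left, bottom-left, top-right, bottom-right."""
    if len(points) != 4:
        raise ValueError("expected exactly four points")
    # In image coordinates y grows downward, so x + y is smallest at the
    # top-left corner and largest at the bottom-right corner.
    top_left = min(points, key=lambda p: p[0] + p[1])
    bottom_right = max(points, key=lambda p: p[0] + p[1])
    # y - x is smallest at the top-right corner and largest at the bottom-left.
    top_right = min(points, key=lambda p: p[1] - p[0])
    bottom_left = max(points, key=lambda p: p[1] - p[0])
    return [top_left, bottom_left, top_right, bottom_right]


# The four points from this article, deliberately shuffled.
shuffled = [[870, 1250], [520, 300], [1100, 440], [300, 1100]]
print(order_corners(shuffled))
# [[520, 300], [300, 1100], [1100, 440], [870, 1250]]
```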

Find the Max Height and Max Width for the Selected Part

# calculating the distance between points ( Pythagorean theorem ) 
height_1 = np.sqrt(((pt1[0] - pt2[0]) ** 2) + ((pt1[1] - pt2[1]) ** 2))
height_2 = np.sqrt(((pt3[0] - pt4[0]) ** 2) + ((pt3[1] - pt4[1]) ** 2))

width_1 = np.sqrt(((pt1[0] - pt3[0]) ** 2) + ((pt1[1] - pt3[1]) ** 2))
width_2 = np.sqrt(((pt2[0] - pt4[0]) ** 2) + ((pt2[1] - pt4[1]) ** 2))

max_height = max(int(height_1), int(height_2))
max_width = max(int(width_1), int(width_2))

print(max_height, max_width)  # --> 842 596 in my case
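The same measurement can be written more compactly with the standard library. This is a small sketch rather than a replacement for the code above: the function name output_size is hypothetical, and it assumes Python 3.8 or newer, where math.dist is available.

```python
import math


def output_size(top_left, bottom_left, top_right, bottom_right):
    """Return (width, height) of the rectangle the four corners will be warped into."""
    # math.dist computes the Euclidean distance between two points.
    height = max(math.dist(top_left, bottom_left), math.dist(top_right, bottom_right))
    width = max(math.dist(top_left, top_right), math.dist(bottom_left, bottom_right))
    return int(width), int(height)


print(output_size([520, 300], [300, 1100], [1100, 440], [870, 1250]))  # (596, 842)
```

Returning the size as (width, height) matches the order OpenCV expects for its dsize argument.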

Apply Perspective Transformation

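We have 2 functions:

→ cv2.getPerspectiveTransform(src, dst): src and dst are the four source points and the four destination points.

→ cv2.warpPerspective(src, M, dsize): src is the input image, M is the transformation matrix, and dsize is the output size as (width, height).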

# four input points
input_pts = np.float32([pt1, pt2, pt3, pt4])

# output points as (x, y) for the new transformed image,
# in the same order as the input points
output_pts = np.float32([[0, 0],
                         [0, max_height],
                         [max_width, 0],
                         [max_width, max_height]])

# Compute the perspective transform M
M = cv2.getPerspectiveTransform(input_pts, output_pts)

# Apply M to the image; dsize is (width, height)
out = cv2.warpPerspective(rgb_image, M, (max_width, max_height), flags=cv2.INTER_LINEAR)

plt.imshow(out)
Image Aligned by Applying Perspective Transformation with OpenCV in Python

Look at the orientation of the image: the page is now aligned and ready to use in other applications.

Aligning Book Pages by Applying Perspective Transformation with OpenCV in Python