Text Document Alignment Using the Probabilistic Hough Line Transform
There are different approaches to aligning text document images. In this tutorial, I will explain how the probabilistic Hough line transform can be used to align a document image: the transform detects line segments, and the slope of those segments gives the angle by which the page is skewed. This is our original image, which is not aligned properly.
The following pre-processing steps are applied before feeding the image to the probabilistic Hough line transform.
- Converting to grayscale
- Blurring to remove noise
- Thresholding followed by erosion and dilation
Before deep-diving into the image pre-processing, let's import the libraries first.
import cv2
import math
import numpy as np
from scipy import ndimage
Now, load and pre-process the image.
# Read image from directory
image = cv2.imread('image.jpg')
img_copy = image.copy()
height, width = image.shape[:2]
center = (width // 2, height // 2)
original_image = image.copy()
# convert image to grayscale (cv2.imread returns channels in BGR order)
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
# blur image to suppress noise
blur = cv2.GaussianBlur(gray, (5, 5), 0)
# threshold image (inverted binary, threshold chosen by Otsu's method)
_, threshed = cv2.threshold(blur, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# erode image
eroded = cv2.erode(threshed, (3, 3), iterations=1)
# dilate image
dilate = cv2.dilate(eroded, (35, 35), iterations=3)
The probabilistic Hough line transform of the image is computed as:
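# rho = 1 px, theta = 1 degree, at least 200 votes; segments shorter than 150 px are dropped, gaps up to 10 px are bridged
lines = cv2.HoughLinesP(dilate, rho=1, theta=np.pi / 180, threshold=200, minLineLength=150, maxLineGap=10)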
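It detects line segments in the given document image, each returned as a pair of endpoints (x1, y1, x2, y2). Now, our task is to find the slope of each near-horizontal segment, convert it to an angle, and average those angles to get the rotation angle.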
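To make the slope-to-angle step concrete, here is a minimal sketch for a single segment. The helper name `segment_angle_degrees` and the sample coordinates are hypothetical and only serve as an illustration; the calculation itself is the same atan-of-slope formula used in this tutorial.

```python
import math


def segment_angle_degrees(x1, y1, x2, y2):
    """Angle of a segment relative to the x-axis, in degrees.

    Coordinates are image coordinates, so y grows downward.
    """
    diff_x = x2 - x1
    diff_y = y2 - y1
    if diff_x == 0:
        # A vertical segment has no finite slope.
        raise ValueError("vertical segment: slope is undefined")
    slope = diff_y / diff_x
    return math.degrees(math.atan(slope))


# Hypothetical text baseline that moves 10 px down over a 200 px run.
angle = segment_angle_degrees(10, 20, 210, 30)
print(f"{angle:.2f} degrees")
```

The loop that follows applies the same calculation to every near-horizontal segment returned by the transform.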
rotation_angle = 0.0  # fallback when no usable line is found
if lines is not None:
    horizontal_lines = []
    angles = []
    img_copy_1 = image.copy()
    for line in lines:
        # each entry has the shape [[x1, y1, x2, y2]]; cast to plain ints for cv2.line
        x1, y1, x2, y2 = map(int, line[0])
        cv2.line(img_copy, (x1, y1), (x2, y2), (0, 0, 255), 1, cv2.LINE_AA)
        diff_x = x2 - x1
        diff_y = y2 - y1
        # keep only near-horizontal, non-vertical segments
        if abs(diff_y) < 30 and diff_x != 0:
            horizontal_lines.append((x1, y1, x2, y2))
            slope = diff_y / diff_x
            angle = math.degrees(math.atan(slope))
            angles.append(angle)
    cv2.imshow('All Extracted Lines', img_copy)
    for line in horizontal_lines:
        cv2.line(img_copy_1, (line[0], line[1]), (line[2], line[3]), (0, 0, 255), 1, cv2.LINE_AA)
    cv2.imshow('Formatted lines', img_copy_1)
    cv2.waitKey(0)  # the windows are only drawn while waitKey is running
    if angles:
        # average angle of all near-horizontal segments
        rotation_angle = sum(angles) / len(angles)
The transform detects line segments in the image. I have drawn them over a copy of the original image using OpenCV's cv2.line() method.
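The rotation angle in this tutorial is the plain average of the segment angles. An average can be pulled off course by a single stray segment, for example the edge of a table or a figure. As a possible variation, and not what the code above does, the median can be used instead. The sketch below compares the two on made-up angle values; the helper name `mean_and_median` is hypothetical.

```python
import statistics


def mean_and_median(angles):
    """Return (mean, median) of a non-empty list of angles in degrees."""
    return statistics.mean(angles), statistics.median(angles)


# Made-up angles: three segments agree on about 2 degrees, one is an outlier.
sample_angles = [2.0, 2.1, 1.9, 25.0]
mean_angle, median_angle = mean_and_median(sample_angles)
print(f"mean = {mean_angle:.2f}, median = {median_angle:.2f}")
```

With these values the mean lands far from the angle that most segments agree on, while the median stays close to it. Whether that matters depends on how clean the detected segments are for your documents.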
We have calculated the rotation angle. Now it’s time to rotate the image.
img_rotated = ndimage.rotate(image, rotation_angle, reshape=True)
img_rotated = img_rotated.astype(np.uint8)
The final rotated image is:
There is an alternative way to rotate the image, which is the one I mostly use: cv2.warpAffine. An affine transform preserves collinearity, parallelism, and the ratio of distances between points on a line, and it can apply rotation, translation, and scaling to the image.
rotation_matrix = cv2.getRotationMatrix2D(center, rotation_angle, 1)
# rotate original image to show transformation
rotated_image = cv2.warpAffine(original_image, rotation_matrix, (width, height), borderValue=(255, 255, 255))
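To see what goes into the rotation matrix, here is a pure-Python sketch of the 2x3 matrix described in the OpenCV documentation for cv2.getRotationMatrix2D. The helper names `rotation_matrix_2d` and `apply_affine` and the sample center are hypothetical; the sketch is only meant to show how the center, angle, and scale combine, not to replace the OpenCV call.

```python
import math


def rotation_matrix_2d(center, angle_degrees, scale=1.0):
    """2x3 affine matrix for a rotation about `center` (image coordinates)."""
    cx, cy = center
    theta = math.radians(angle_degrees)
    alpha = scale * math.cos(theta)
    beta = scale * math.sin(theta)
    return [
        [alpha, beta, (1 - alpha) * cx - beta * cy],
        [-beta, alpha, beta * cx + (1 - alpha) * cy],
    ]


def apply_affine(matrix, point):
    """Map a single (x, y) point through a 2x3 affine matrix."""
    x, y = point
    new_x = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
    new_y = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
    return new_x, new_y


center = (320, 240)  # hypothetical image center
matrix = rotation_matrix_2d(center, 90)

# The center of rotation stays where it is.
print(apply_affine(matrix, center))
# A point 10 px to the right of the center ends up 10 px above it,
# i.e. a positive angle rotates counter-clockwise on screen.
print(apply_affine(matrix, (330, 240)))
```

The translation terms in the last column are what keep the chosen center fixed; without them the rotation would be about the top-left corner of the image.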
This is the final rotated image using the warpAffine transform.
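One difference between the two rotation methods is the output size. ndimage.rotate with reshape=True enlarges the canvas so the whole rotated image fits, while the warpAffine call above keeps the original (width, height), so the corners can be clipped for larger angles. If that matters, one common workaround is to compute an enlarged canvas first. The sketch below shows only the size calculation; the helper name `expanded_canvas` and the sample dimensions are hypothetical.

```python
import math


def expanded_canvas(width, height, angle_degrees):
    """Width and height of the smallest canvas that holds the rotated image."""
    theta = math.radians(angle_degrees)
    cos_t = abs(math.cos(theta))
    sin_t = abs(math.sin(theta))
    new_width = int(round(height * sin_t + width * cos_t))
    new_height = int(round(height * cos_t + width * sin_t))
    return new_width, new_height


# Hypothetical 800 x 600 page rotated by 5 degrees.
new_width, new_height = expanded_canvas(800, 600, 5)
print(new_width, new_height)
```

To use the result, add new_width / 2 - center[0] to rotation_matrix[0, 2] and new_height / 2 - center[1] to rotation_matrix[1, 2], then pass (new_width, new_height) as the output size to cv2.warpAffine. For the small skew angles typical of scanned documents the clipping is usually minor, so this step is optional.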
Thank you! I hope this helps you align text document images properly.