https://github.com/motapinto/computer-vision-structured-light
Project developed for the Computer Vision course unit in FEUP
- Host: GitHub
- URL: https://github.com/motapinto/computer-vision-structured-light
- Owner: motapinto
- Created: 2021-03-17T15:54:47.000Z (over 5 years ago)
- Default Branch: main
- Last Pushed: 2021-04-27T14:05:31.000Z (about 5 years ago)
- Last Synced: 2025-06-04T04:13:10.742Z (about 1 year ago)
- Topics: computer-vision, edge-detection, jupyter-notebook, opencv, python, structured-light
- Language: Jupyter Notebook
- Homepage:
- Size: 3.44 MB
- Stars: 10
- Watchers: 1
- Forks: 3
- Open Issues: 0
# Calculate X, Y, Z Real World Coordinates from Image Coordinates using OpenCV and Structured Light
### Notebook by [André Madureira](https://github.com/Andremad-03), [José Guerra](https://github.com/LockDownPT), [Luis Ramos](https://github.com/luispramos), [Martim Pinto da Silva](https://github.com/motapinto)
#### [Faculdade de Engenharia da Universidade do Porto](https://sigarra.up.pt/feup/en/web_page.inicial)
#### It is recommended to [view this notebook in nbviewer]() for the best overall experience
#### You can also execute the code on this notebook using [Jupyter Notebook](https://jupyter.org/), [Binder](https://mybinder.org/) or [Google Colab](https://colab.research.google.com/) (no local installation required)
## Table of contents
1. [Introduction](#Introduction)
2. [Required libraries](#Required-libraries)
3. [Camera calibration](#Camera-calibration)
   - [Intrinsic parameters](#Intrinsic-parameters)
   - [Extrinsic parameters](#Extrinsic-parameters)
4. [Re-projection Error](#Re-projection-Error)
5. [Undistortion](#Undistortion)
6. [Perspective Projection Matrix](#Perspective-Projection-Matrix)
7. [Line Detection](#Line-Detection)
8. [Resources](#Resources)
## Introduction
[go back to the top](#Table-of-contents)
Structured light techniques play a central role in many 3D data
acquisition applications, namely when the surfaces to be measured do
not have feature points or when it is necessary to obtain dense 3D
data. They are used in numerous fields: industrial (e.g., dimensional
control or quality inspection), reverse engineering, urban (e.g., road
inspection) and medical applications are just a few examples.
These techniques are based on the acquisition of an image of a scene
over which a light pattern is projected; this pattern ranges from a
single light ray or a single light sheet to a set of parallel sheets or
a pseudo-random pattern. Frequently, laser light is used to simplify the
detection of the projected patterns.
In this work we implement a 3D data acquisition system based on
structured light, using a single sheet of light/shadow.
## Required libraries
[go back to the top](#Table-of-contents)
The primary libraries that we'll be using are:
- numpy: Provides a fast numerical array structure and helper
functions.
- cv2: OpenCV provides a real-time optimized Computer Vision library,
tools, and hardware.
- glob: Is used to retrieve files/pathnames matching a specified
pattern.
- matplotlib: Basic plotting library in Python, with capabilities of
showing images.
- sympy: Library for symbolic mathematics and for solving equations.
``` python
import os
import numpy as np
import cv2
import glob
import math
import matplotlib.pyplot as plt
from sympy import symbols, Eq, solve, poly
```
``` python
# number of interior corners per row and column of the chessboard
n_grid = 7
# chessboard square size in cm
square_size = 2.5
# termination criteria
criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 0.001)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(6,6,0)
objp = np.zeros((n_grid*n_grid,3), np.float32)
objp[:,:2] = np.mgrid[0:n_grid,0:n_grid].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
obj_points = [] # 3d point in real world space
img_points = [] # 2d points in image plane.
line_detection_path = 'images/lineDetection'
calibration_path = 'images/calibration'
calibration_images = glob.glob(os.path.join(calibration_path,'*.JPG'))
```
## Camera calibration
### Intrinsic parameters
Intrinsic parameters are specific to a camera. They include information
like focal length (fx,fy) and optical centers (cx,cy). The focal length
and optical centers can be used to create a camera matrix, which can be
used to remove distortion due to the lenses of a specific camera. The
camera matrix is unique to a specific camera, so once calculated, it can
be reused on other images taken by the same camera.
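As a minimal sketch of how these parameters fit together, the snippet below builds a camera matrix from focal lengths and an optical center and projects a point given in camera coordinates. The numeric values and the helper name `project_camera_point` are assumptions made for illustration; they are not the values obtained from this calibration.

```python
import numpy as np

# Hypothetical intrinsic values (in pixels), chosen only for illustration.
fx, fy = 800.0, 820.0   # focal lengths
cx, cy = 320.0, 240.0   # optical center

# Camera matrix in the same 3x3 layout that cv2.calibrateCamera returns.
K = np.array([[fx, 0.0, cx],
              [0.0, fy, cy],
              [0.0, 0.0, 1.0]])

def project_camera_point(K, point_cam):
    """Project a 3D point expressed in camera coordinates to pixel coordinates."""
    uvw = K @ np.asarray(point_cam, dtype=float)
    return uvw[:2] / uvw[2]  # perspective division by the depth

pixel = project_camera_point(K, [0.1, -0.05, 2.0])
on_axis = project_camera_point(K, [0.0, 0.0, 1.0])  # lands on (cx, cy)
```

A point on the optical axis projects to the optical center, which is a quick sanity check for any estimated camera matrix. Lens distortion is ignored in this sketch.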
### Extrinsic parameters
Extrinsic parameters correspond to the rotation and translation vectors
that map the coordinates of a 3D point from the world coordinate system
to the camera coordinate system.
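The effect of the extrinsic parameters can be sketched with plain NumPy: a rotation matrix `R` and a translation vector `t` move a world point into the camera frame. The rotation, the translation and the helper names below are hypothetical and only illustrate the transformation; in this notebook the rotation comes from OpenCV as a rotation vector that `cv2.Rodrigues` converts to a matrix.

```python
import numpy as np

def rotation_about_z(angle_rad):
    """Rotation matrix for a rotation of angle_rad around the z axis."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def world_to_camera(R, t, point_world):
    """Apply the extrinsic transformation X_cam = R @ X_world + t."""
    return R @ np.asarray(point_world, dtype=float) + np.asarray(t, dtype=float)

# Hypothetical extrinsics: 90 degree rotation about z, camera 5 units away.
R = rotation_about_z(np.pi / 2)
t = np.array([0.0, 0.0, 5.0])

p_cam = world_to_camera(R, t, [1.0, 0.0, 0.0])
```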
``` python
for fname in calibration_images:
    img = cv2.imread(fname)
    # TODO: Why do we put this in gray? Does it affect results?
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    # Find the chess board corners
    ret, corners = cv2.findChessboardCorners(img, (n_grid, n_grid), None)
    # If found, add object points, image points (after refining them)
    if ret:
        obj_points.append(objp)
        img_point = cv2.cornerSubPix(gray, corners, (11, 11), (-1, -1), criteria)
        img_points.append(img_point)
# After acquiring the object and image points we need to calibrate the camera.
# For that we use cv2.calibrateCamera(), which returns the camera matrix,
# distortion coefficients, rotation and translation vectors.
_, camera_matrix, dist, rvecs, tvecs = cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None)
```
## Re-projection Error
Re-projection error gives a good estimation of just how exact the found
parameters are. The closer the re-projection error is to zero, the more
accurate the parameters we found are.
``` python
total_error = 0
for i in range(len(obj_points)):
    img_points_tes, _ = cv2.projectPoints(obj_points[i], rvecs[i], tvecs[i], camera_matrix, dist)
    error = cv2.norm(src1=img_points[i], src2=img_points_tes, normType=cv2.NORM_L2) / len(img_points_tes)
    total_error += error
print("mean error: {}".format(total_error / len(obj_points)))
```
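The per-image metric used above (L2 norm of the difference between detected and re-projected corners, divided by the number of corners) can be reproduced with NumPy on synthetic data. This is only a sketch: the point values and the function name `reprojection_error` are made up for illustration.

```python
import numpy as np

def reprojection_error(observed, projected):
    """L2 norm of the point differences divided by the number of points."""
    observed = np.asarray(observed, dtype=float).reshape(-1, 2)
    projected = np.asarray(projected, dtype=float).reshape(-1, 2)
    return np.linalg.norm(observed - projected) / len(projected)

# Synthetic corners: every re-projected point is off by (0.3, -0.4) pixels.
observed = np.array([[10.0, 10.0], [20.0, 10.0], [10.0, 20.0], [20.0, 20.0]])
projected = observed + np.array([0.3, -0.4])

error = reprojection_error(observed, projected)
```

Note that this quantity is not the average pixel distance: with 4 points each displaced by 0.5 pixels, the norm over all points is 1.0 and the result is 0.25.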
## Undistortion
Now we can take an image and undistort it using the distortion
coefficients.
``` python
imgcal = cv2.imread(os.path.join('images', 'img_cal.jpg'))
h, w = imgcal.shape[:2]
new_camera_mtx, roi=cv2.getOptimalNewCameraMatrix(camera_matrix, dist, (w,h), 1, (w,h))
# undistort
undst = cv2.undistort(imgcal, camera_matrix, dist, None, new_camera_mtx)
# crop the image
x,y,w,h = roi
undst = undst[y:y+h, x:x+w]
cv2.imwrite('images/calibresult.png', undst)
```
## Perspective Projection Matrix
We create the Perspective Projection Matrix using the Camera Matrix
(here the refined matrix returned by getOptimalNewCameraMatrix, computed
from the calibrateCamera result), the Rotation Matrix (which we compute
with the Rodrigues function) and the Translation Vector (obtained from
the solvePnPRansac function). All these functions are part of the cv2
library.
``` python
# The draw function below draws the reference axes on the selected image
def draw(img, corners, imgpts):
    # cv2.line expects integer pixel coordinates
    corner = tuple(int(v) for v in corners[0].ravel())
    axis_ends = [tuple(int(v) for v in pt.ravel()) for pt in imgpts]
    img = cv2.line(img, corner, axis_ends[0], (255,0,0), 5)
    img = cv2.line(img, corner, axis_ends[1], (0,255,0), 5)
    img = cv2.line(img, corner, axis_ends[2], (0,0,255), 5)
    return img
```
``` python
axis = np.float32([[3,0,0], [0,3,0], [0,0,-3]]).reshape(-1,3)
# prepare object points, like (0,0,0), (1,0,0), (2,0,0) ....,(8,5,0)
# The grid was changed from (n_grid, n_grid) = (7,7) to (9,6)
# because the new chessboard image has 9*6 inner corners
objp2 = np.zeros((9*6,3), np.float32)
objp2[:,:2] = np.mgrid[0:9,0:6].T.reshape(-1,2)
# Arrays to store object points and image points from all the images.
obj_points2 = [] # 3d point in real world space
img_points2 = [] # 2d points in image plane.
gray = cv2.cvtColor(undst, cv2.COLOR_BGR2GRAY)
ret, corners = cv2.findChessboardCorners(gray, (9,6), None)
if ret:
    corners2 = cv2.cornerSubPix(gray, corners, (11,11), (-1,-1), criteria)
    # Find the rotation and translation vectors.
    ret, rvecs1, tvecs1, inlier = cv2.solvePnPRansac(objp2, corners2, camera_matrix, dist)
    # project 3D reference into image plane
    imgpts, jac = cv2.projectPoints(axis, rvecs1, tvecs1, camera_matrix, dist)
    ref_img = draw(undst, corners, imgpts)
    plt.imshow(ref_img)
    plt.show()
    # Convert the rotation vector into a 3x3 rotation matrix
    rotM, _ = cv2.Rodrigues(rvecs1)
```
``` python
# Perspective Projection Matrix
K = new_camera_mtx
# In this block, we homogenize the matrices that make up the Perspective Projection Matrix (PPMatrix),
# because in homogeneous coordinates rotation and translation can be expressed by a single matrix
Homog_K = np.array([[K[0][0], K[0][1], K[0][2], 0], [K[1][0], K[1][1], K[1][2], 0], [K[2][0], K[2][1], K[2][2], 0]])
Homog_R = np.array([[rotM[0][0], rotM[0][1], rotM[0][2], 0],[rotM[1][0], rotM[1][1], rotM[1][2], 0], [rotM[2][0], rotM[2][1], rotM[2][2], 0],[0, 0, 0, 1]])
Homog_T = np.array([[1, 0, 0, tvecs1[0][0]],[0, 1, 0, tvecs1[1][0]], [0, 0, 1, tvecs1[2][0]], [0, 0, 0, 1]])
Homog_Ext = np.matmul(Homog_T, Homog_R)
PPMatrix = np.matmul(Homog_K, Homog_Ext)
```
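The homogeneous product above is equivalent to the compact form `P = K [R | t]`. The sketch below checks this equivalence with NumPy on hypothetical values of `K`, `R` and `t` (they are not the calibrated ones), using an assumed helper named `projection_matrix`.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Build the 3x4 perspective projection matrix P = K [R | t]."""
    Rt = np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])
    return K @ Rt

# Hypothetical intrinsics and extrinsics, for illustration only.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
angle = np.deg2rad(30.0)
R = np.array([[1.0, 0.0, 0.0],
              [0.0, np.cos(angle), -np.sin(angle)],
              [0.0, np.sin(angle), np.cos(angle)]])
t = np.array([1.0, 2.0, 10.0])

P = projection_matrix(K, R, t)

# Same matrix built from homogeneous K, T and R blocks.
K_h = np.hstack([K, np.zeros((3, 1))])
R_h = np.eye(4)
R_h[:3, :3] = R
T_h = np.eye(4)
T_h[:3, 3] = t
P_h = K_h @ T_h @ R_h

# The world origin projects to K @ t after the perspective division.
origin = P @ np.array([0.0, 0.0, 0.0, 1.0])
origin_px = origin[:2] / origin[2]
```

The compact form avoids writing out every matrix entry by hand, which makes indexing mistakes less likely.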
## Line Detection
``` python
# Gray image
img_shadow_plane = cv2.imread(os.path.join(line_detection_path, 'img4.JPG'))
gray = cv2.cvtColor(img_shadow_plane, cv2.COLOR_BGR2GRAY)
gray = cv2.GaussianBlur(gray, (5,5), 0)
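# Threshold (note: the third argument of cv2.threshold is maxval, the value
# given to pixels above the threshold, not an upper threshold)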
low_threshold = 80
high_threshold = 120
_,thresh = cv2.threshold(gray, low_threshold, high_threshold, cv2.THRESH_BINARY)
thresh = 255 - thresh
thresh = cv2.erode(thresh, kernel=(1, 1), iterations=2)
# Laplacian
laplacian = cv2.Laplacian(thresh, cv2.CV_8U)
plt.rcParams["figure.figsize"] = (20, 10)
plt.rcParams.update({'font.size': 20})
fig, ((ax1, ax2, ax3)) = plt.subplots(1, 3)
ax1.imshow(gray, cmap='gray')
ax1.set_xlabel('Gray Image')
ax2.imshow(thresh, cmap='gray')
ax2.set_xlabel('Threshold')
ax3.imshow(laplacian, cmap='gray')
ax3.set_xlabel('Laplacian')
plt.show()
```
``` python
# Then, use findContours to get the contours. You can adjust the parameters for better performance.
contours2, _ = cv2.findContours(laplacian, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cont2 = cv2.drawContours(img_shadow_plane, contours2, -1, (0,255,0), 2)
plt.imshow(cont2, cmap='gray')
plt.show()
```
Here we define some variables to filter out the overlapping lines
obtained with findContours. We select 3 points on top of the shadow
line: the highest left point, the highest right point and the highest
middle point.
The variables xl, yl, xr, yr, xm, ym are initialized to -1 or width+1
because they represent pixel coordinates: these values lie outside the
image, so any real contour point replaces them and no point can be
missed.
To keep the selected pixels consistent, we ensure that (xm, ym) refers
to the line on the upper plane of our image, (xl, yl) to the left line
on the lower plane and (xr, yr) to the right line on the lower plane.
``` python
width = img_shadow_plane.shape[1]
mid_point = math.ceil(width/2)
# Start from values outside the image so that any contour point replaces them
xl=width+1
yl=width+1
xr=-1
yr=width+1
xm=-1
ym=width+1
for contour in contours2:
    for point in contour:
        # Leftmost point; ties on x are broken by the smallest y
        if point[0][0] <= xl:
            if point[0][0] == xl:
                if point[0][1] <= yl:
                    xl = point[0][0]
                    yl = point[0][1]
            else:
                xl = point[0][0]
                yl = point[0][1]
        # Rightmost point; ties on x are broken by the smallest y
        if point[0][0] >= xr:
            if point[0][0] == xr:
                if point[0][1] <= yr:
                    xr = point[0][0]
                    yr = point[0][1]
            else:
                xr = point[0][0]
                yr = point[0][1]
        # Highest point within 5 pixels of the horizontal center
        if (mid_point-5 < point[0][0] < mid_point+5) and point[0][1] < ym:
            xm = point[0][0]
            ym = point[0][1]
```
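The same selection can be sketched in a vectorized way with NumPy. This is an alternative under stated assumptions, not the notebook's method: the function name `select_reference_points` and the sample points are hypothetical, and the input is assumed to be an array of (x, y) contour pixels.

```python
import numpy as np

def select_reference_points(points, width, band=5):
    """Return the (left, middle, right) reference pixels of a contour point set.

    left/right: extreme x, ties broken by the smallest y (highest in the image).
    middle: highest point whose x lies within `band` pixels of the image center.
    """
    pts = np.asarray(points).reshape(-1, 2)
    x, y = pts[:, 0], pts[:, 1]
    # np.lexsort uses the LAST key as the primary key.
    left = pts[np.lexsort((y, x))[0]]
    right = pts[np.lexsort((y, -x))[0]]
    mid = int(np.ceil(width / 2))
    in_band = (x > mid - band) & (x < mid + band)
    middle = None
    if in_band.any():
        band_pts = pts[in_band]
        middle = tuple(int(v) for v in band_pts[np.argmin(band_pts[:, 1])])
    return (tuple(int(v) for v in left), middle, tuple(int(v) for v in right))

# Made-up contour pixels in a 10-pixel-wide image.
sample = [[0, 5], [0, 3], [10, 7], [10, 2], [5, 9], [5, 1], [4, 4]]
left, middle, right = select_reference_points(sample, width=10)
```

With OpenCV contours, which are arrays of shape (N, 1, 2), the input could be prepared with `np.vstack(contours2).reshape(-1, 2)`.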
Now we only need to compute the A, B, C, D and a, b, c, d coefficients
for each selected pixel in order to obtain its world coordinates. For
this we fix the z value of the plane in which each point lies: in our
case, z=10 for (xm, ym) and z=0 for the other 2 points.
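For reference, these coefficients follow from the projection equation. The derivation below is a sketch that writes $p_{ij}$ for the entries of the 3x4 Perspective Projection Matrix (zero-based, as in `PPMatrix[i][j]`), $(u, v)$ for the pixel and $(X, Y, Z)$ for the world point; the symbols are notation introduced here, not names from the notebook.

$$
s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} =
\begin{bmatrix}
p_{00} & p_{01} & p_{02} & p_{03} \\
p_{10} & p_{11} & p_{12} & p_{13} \\
p_{20} & p_{21} & p_{22} & p_{23}
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}
$$

Eliminating the scale factor $s$ with the third row gives two linear equations per pixel:

$$
\underbrace{(p_{20}u - p_{00})}_{A} X + \underbrace{(p_{21}u - p_{01})}_{B} Y + \underbrace{(p_{22}u - p_{02})}_{C} Z = \underbrace{p_{03} - p_{23}u}_{D}
$$

$$
\underbrace{(p_{20}v - p_{10})}_{a} X + \underbrace{(p_{21}v - p_{11})}_{b} Y + \underbrace{(p_{22}v - p_{12})}_{c} Z = \underbrace{p_{13} - p_{23}v}_{d}
$$

With $Z$ known, these are two equations in the two unknowns $X$ and $Y$.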
``` python
# 1st pixel
# (xm, ym) lies in the plane z1 = 10 cm, stored below in chessboard-square units as -10/square_size
cv2.circle(img_shadow_plane, (xm,ym), 5, (0,0,255), -1)
A1 = PPMatrix[2][0]*xm - PPMatrix[0][0]
B1 = PPMatrix[2][1]*xm - PPMatrix[0][1]
C1 = PPMatrix[2][2]*xm - PPMatrix[0][2]
D1 = PPMatrix[0][3] - PPMatrix[2][3]*xm
a1 = PPMatrix[2][0]*ym - PPMatrix[1][0]
b1 = PPMatrix[2][1]*ym - PPMatrix[1][1]
c1 = PPMatrix[2][2]*ym - PPMatrix[1][2]
d1 = PPMatrix[1][3] - PPMatrix[2][3]*ym
z1=-10/square_size
x1,y1 = symbols('x1 y1')
eq1 = Eq(A1*x1 + B1*y1 + C1*z1 - D1, 0)
eq2 = Eq(a1*x1 + b1*y1 + c1*z1 - d1, 0)
sol1 = solve((eq1,eq2), (x1,y1))
# 2nd pixel
# (xl, yl) lies in the plane z2 = 0
cv2.circle(img_shadow_plane, (xl,yl), 5, (0,0,255), -1)
A2 = PPMatrix[2][0]*xl - PPMatrix[0][0]
B2 = PPMatrix[2][1]*xl - PPMatrix[0][1]
C2 = PPMatrix[2][2]*xl - PPMatrix[0][2]
D2 = PPMatrix[0][3] - PPMatrix[2][3]*xl
a2 = PPMatrix[2][0]*yl - PPMatrix[1][0]
b2 = PPMatrix[2][1]*yl - PPMatrix[1][1]
c2 = PPMatrix[2][2]*yl - PPMatrix[1][2]
d2 = PPMatrix[1][3] - PPMatrix[2][3]*yl
z2=0
x2,y2 = symbols('x2 y2')
eq3 = Eq(A2*x2 + B2*y2 + C2*z2 - D2, 0)
eq4 = Eq(a2*x2 + b2*y2 + c2*z2 - d2, 0)
sol2 = solve((eq3,eq4), (x2,y2))
# 3rd pixel
# (xr, yr) lies in the plane z3 = 0
cv2.circle(img_shadow_plane, (xr,yr), 5, (0,0,255), -1)
A3 = PPMatrix[2][0]*xr - PPMatrix[0][0]
B3 = PPMatrix[2][1]*xr - PPMatrix[0][1]
C3 = PPMatrix[2][2]*xr - PPMatrix[0][2]
D3 = PPMatrix[0][3] - PPMatrix[2][3]*xr
a3 = PPMatrix[2][0]*yr - PPMatrix[1][0]
b3 = PPMatrix[2][1]*yr - PPMatrix[1][1]
c3 = PPMatrix[2][2]*yr - PPMatrix[1][2]
d3 = PPMatrix[1][3] - PPMatrix[2][3]*yr
z3=0
x3,y3 = symbols('x3 y3')
eq5 = Eq(A3*x3 + B3*y3 + C3*z3 - D3, 0)
eq6 = Eq(a3*x3 + b3*y3 + c3*z3 - d3, 0)
sol3 = solve((eq5,eq6), (x3,y3))
plt.imshow(img_shadow_plane)
plt.show()
```
Now we have 3 points, (x1, y1, z1), (x2, y2, z2) and (x3, y3, z3), which
is exactly the number of points needed to determine the shadow plane.
We only need to solve a system of equations to obtain the plane
coefficients A, B and C (the coefficient D is fixed to the constant
value 1).
``` python
varA, varB, varC= symbols('A B C')
#D is considered = 1 to calculate the shadow plane
D = 1
eqABC1 = Eq(varA*sol1[x1] + varB*sol1[y1] + varC*z1, D)
eqABC2 = Eq(varA*sol2[x2] + varB*sol2[y2] + varC*z2, D)
eqABC3 = Eq(varA*sol3[x3] + varB*sol3[y3] + varC*z3, D)
solABC = solve((eqABC1, eqABC2, eqABC3), (varA, varB, varC))
A, B, C = solABC[varA], solABC[varB], solABC[varC]
print('(A, B, C, D)')
print('({}, {}, {}, {})'.format(A, B, C, D))
```
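Since the three plane equations are linear in A, B and C, the same system can also be solved numerically. The sketch below uses `np.linalg.solve` on made-up points; the function name `plane_through_points` and the sample coordinates are assumptions for illustration, not values from this notebook.

```python
import numpy as np

def plane_through_points(p1, p2, p3, D=1.0):
    """Solve A*x + B*y + C*z = D for (A, B, C) given three points on the plane.

    Fixing D = 1 means a plane passing through the origin cannot be
    represented (the system becomes singular).
    """
    M = np.array([p1, p2, p3], dtype=float)   # one row per point
    A, B, C = np.linalg.solve(M, np.full(3, D))
    return A, B, C, D

# Made-up points: the plane cuts the axes at x=1, y=2 and z=4.
A_ex, B_ex, C_ex, D_ex = plane_through_points((1, 0, 0), (0, 2, 0), (0, 0, 4))
```

The numeric solution returns plain floats instead of sympy expressions, which is convenient when the coefficients are reused for many pixels.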
``` python
# This function calculates the 3D coordinates given the Perspective Projection Matrix (PPM),
# the plane coefficients (A, B, C, D) and the pixel point we want to convert (i, j)
def calculate3DPointCoords(PPM, i, j, A, B, C, D):
    x_var, y_var, z_var = symbols('X Y Z')
    calcX = Eq((PPM[2][0] * i - PPM[0][0]) * x_var + (PPM[2][1]*i - PPM[0][1]) * y_var + (PPM[2][2]*i - PPM[0][2]) * z_var - PPM[0][3] + PPM[2][3]*i, 0)
    calcY = Eq((PPM[2][0] * j - PPM[1][0]) * x_var + (PPM[2][1]*j - PPM[1][1]) * y_var + (PPM[2][2]*j - PPM[1][2]) * z_var - PPM[1][3] + PPM[2][3]*j, 0)
    calcZ = Eq(A * x_var + B * y_var + C * z_var - D, 0)
    sol = solve((calcX, calcY, calcZ), (x_var, y_var, z_var))
    return sol[x_var], sol[y_var], -sol[z_var]
```
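The three equations solved above form a 3x3 linear system, so a numeric variant is possible. The following sketch uses `np.linalg.solve` instead of sympy; the name `pixel_to_world`, the projection matrix and the plane are hypothetical. Unlike `calculate3DPointCoords`, it returns the raw solution without flipping the sign of Z.

```python
import numpy as np

def pixel_to_world(P, u, v, plane):
    """Intersect the viewing ray of pixel (u, v) with the plane A*x+B*y+C*z=D."""
    A, B, C, D = plane
    P = np.asarray(P, dtype=float)
    M = np.array([
        P[2, :3] * u - P[0, :3],   # equation from the u coordinate
        P[2, :3] * v - P[1, :3],   # equation from the v coordinate
        [A, B, C],                 # plane equation
    ], dtype=float)
    rhs = np.array([P[0, 3] - P[2, 3] * u,
                    P[1, 3] - P[2, 3] * v,
                    D], dtype=float)
    return np.linalg.solve(M, rhs)

# Hypothetical projection matrix: identity rotation, camera 10 units away.
K_demo = np.array([[800.0, 0.0, 320.0],
                   [0.0, 800.0, 240.0],
                   [0.0, 0.0, 1.0]])
P_demo = K_demo @ np.hstack([np.eye(3), [[0.0], [0.0], [10.0]]])

# Project a known world point, then recover it from its pixel and plane.
world = np.array([1.0, 2.0, 3.0])
uvw = P_demo @ np.append(world, 1.0)
u_demo, v_demo = uvw[:2] / uvw[2]
recovered = pixel_to_world(P_demo, u_demo, v_demo, (0.2, 0.1, 0.2, 1.0))
```

The round trip (project a point, then recover it) is a simple way to validate the sign conventions before applying the function to real pixels.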
Here, we get a new image with the same shadow plane and apply
findContours to obtain the line representation of that plane.
``` python
# Gray image
img_to_represent = cv2.imread(os.path.join(line_detection_path, 'img1.JPG'))
gray_rep = cv2.cvtColor(img_to_represent, cv2.COLOR_BGR2GRAY)
gray_rep = cv2.GaussianBlur(gray_rep, (5,5), 0)
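# Threshold (note: the third argument of cv2.threshold is maxval, the value
# given to pixels above the threshold, not an upper threshold)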
low_threshold = 80
high_threshold = 120 #low_threshold * 3
_,thresh_rep = cv2.threshold(gray_rep, low_threshold, high_threshold, cv2.THRESH_BINARY)
thresh_rep = 255 - thresh_rep
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, ksize=(2, 2))
# Remove noise
thresh_rep = cv2.erode(thresh_rep, kernel=(1,1), iterations=2)
# Laplacian
laplacian_rep = cv2.Laplacian(thresh_rep, cv2.CV_8U)
contours, _ = cv2.findContours(laplacian_rep, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
cont = cv2.drawContours(img_to_represent, contours, -1, (0,255,0), 5)
plt.imshow(cont, cmap='gray')
plt.show()
```
After having found the contours of the new image, it is now possible to
calculate the height map of the object using the shadow plane
coefficients (A, B, C, D) found earlier.
``` python
# Find highest pixel in contours
height = laplacian_rep.shape[0]
maximum = np.ones(width, dtype=int)*height
for contour in contours:
    for point in contour:
        i = point[0][0]
        j = point[0][1]
        if maximum[i] > j:
            maximum[i] = j
# Compress maximum array and calculate 3D points into heightMap
heightMap = []
curr_max = -1
for i, max_val in enumerate(maximum):
    if max_val != curr_max and max_val != height:
        heightMap.append(calculate3DPointCoords(PPMatrix, i, max_val, A, B, C, D))
        curr_max = max_val
# Create a height histogram with calculated height map
histHeight = [0.0, 1.0] * len(heightMap)
histHeight = np.array(histHeight).reshape((len(heightMap), 2))
for i, point in enumerate(heightMap):
    histHeight[i][0] = point[2] * square_size
    histHeight[i][1] = point[1] * square_size
plt.plot(histHeight[:,1], histHeight[:,0])
plt.xlabel('width(y)')
plt.ylabel('height(z)')
plt.show()
```
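The first step above (keeping the highest contour pixel in each image column) can also be sketched without explicit loops. The function name `topmost_per_column` and the sample points are assumptions for illustration.

```python
import numpy as np

def topmost_per_column(points, width, height):
    """For each column x, return the smallest y among the points (or height if none)."""
    pts = np.asarray(points).reshape(-1, 2)
    top = np.full(width, height, dtype=int)
    # Unbuffered in-place minimum, so repeated columns are handled correctly.
    np.minimum.at(top, pts[:, 0], pts[:, 1])
    return top

# Made-up contour pixels (x, y) in a 4x10 image.
top = topmost_per_column([[0, 5], [0, 3], [2, 7]], width=4, height=10)
```

Columns without any contour pixel keep the value `height`, which matches the sentinel used in the loop version.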