
# PCL-segmentation-and-tracking

Final project titled "Point Cloud Segmentation and Object Tracking using RGB-D Data" for the Machine Vision (EE 576) course.

## Abstract

This project focuses on point cloud segmentation
and object tracking using RGB-D data. The goal is to apply 3D
segmentation on each frame of the input data, identify objects
with horizontally flat faces, track the movement of objects across
frames, and compute information regarding segments to evaluate
the tracking performance. The project is implemented in C++
using the Point Cloud Library (PCL) and OpenCV libraries.
This report provides a detailed explanation of the algorithms
and steps involved in the code implementation.

*Index Terms*—Point Cloud, Segmentation, Object Tracking, RGB-D Data, Region Growing, Flat Face Detection.

## I. INTRODUCTION

The segmentation of point cloud data is a fundamental task
in computer vision and robotics. It involves partitioning the
point cloud into meaningful segments corresponding to objects
or surfaces of interest. Object tracking aims to follow the
movement of objects across frames in a sequence of data. This
project combines these two tasks using RGB-D data, which
provides color and depth information for each point in the
cloud.

In this project, we present an implementation of point cloud
segmentation and object tracking using RGB-D data. The
input data consists of RGB images and corresponding depth
maps. We convert the RGB and depth information into a 3D
point cloud representation. We then perform 3D segmentation
on each frame to identify objects and surfaces in the scene.
Additionally, we track the movement of objects across frames
and compute information regarding segments to evaluate the
tracking performance.

The project is implemented in C++ using the Point Cloud
Library (PCL) and OpenCV libraries. These libraries provide
efficient algorithms and tools for processing and analyzing
point cloud data. The following sections describe the methodology and algorithms used in the implementation.

## II. METHODOLOGY

The project implementation consists of the following steps:

### A. RGB-D Data Conversion

The RGB-D data conversion process is a crucial step
in preparing the input for point cloud segmentation and
object tracking. In this project, we implemented the `rgb_depth_to_point_cloud` function to convert RGB and depth images into a point cloud representation.
The function takes two input parameters: the file paths of the RGB and depth images. It first reads the RGB image using `cv::imread` and checks whether the image data is successfully loaded. Similarly, it reads the depth image as a 16-bit grayscale image using `cv::imread` with the `cv::IMREAD_ANYDEPTH` flag, again verifying that the depth image data loaded successfully.

To ensure consistency between the RGB and depth images, the function resizes the RGB image to match the size of the depth image using `cv::resize`. This step is necessary because the intrinsic camera parameters used for the conversion are defined based on the size of the depth image.
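
A minimal sketch of this loading and resizing step is shown below; the helper name and error handling are illustrative rather than the project's exact code, and the depth image is assumed to store 16-bit values in millimeters:

```cpp
#include <opencv2/opencv.hpp>
#include <string>

// Hypothetical helper: load an RGB/depth image pair and align their sizes.
bool load_rgbd_pair(const std::string &rgb_path, const std::string &depth_path,
                    cv::Mat &rgb, cv::Mat &depth)
{
    rgb = cv::imread(rgb_path, cv::IMREAD_COLOR);
    depth = cv::imread(depth_path, cv::IMREAD_ANYDEPTH);  // keep the 16-bit depth values
    if (rgb.empty() || depth.empty())
        return false;                                     // one of the images failed to load
    cv::resize(rgb, rgb, depth.size());                   // match the RGB size to the depth map
    return true;
}
```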

With the RGB and depth images prepared, the function creates a `pcl::PointCloud<pcl::PointXYZRGB>` object representing the point cloud, initializing the cloud pointer with `pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>)`.

The camera intrinsic parameters, including the focal lengths (`fx` and `fy`) and the principal point coordinates (`cx` and `cy`), are then defined. These parameters are specific to the camera used to capture the RGB-D data and can be adjusted accordingly.

To convert pixel coordinates into 3D world coordinates, the function iterates through each pixel in the depth image. It retrieves the depth value at each pixel, divides it by 1000 to convert it from millimeters to meters, and checks whether the depth value falls within a defined range (e.g., between `min_depth` and `max_depth`, which can be adjusted based on the specific scene and requirements).

For each valid depth value, a new `pcl::PointXYZRGB` point is created. The X, Y, and Z coordinates of the point are calculated from the pixel coordinates, the camera intrinsic parameters, and the depth value. The RGB color value is obtained from the corresponding pixel of the RGB image at the same location, packed into a 32-bit integer using bitwise operations, and stored in the `rgb` field of the point. Finally, the point is added to the point cloud using `cloud->push_back(point)`.
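
The per-pixel back-projection described above can be sketched as follows; the function name, the intrinsics (`fx`, `fy`, `cx`, `cy`), and the depth range are placeholders to be replaced with the values appropriate for the camera and scene:

```cpp
#include <cstdint>
#include <opencv2/opencv.hpp>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Hypothetical sketch of the depth-to-point-cloud conversion loop (pinhole model).
pcl::PointCloud<pcl::PointXYZRGB>::Ptr depth_to_cloud(const cv::Mat &rgb, const cv::Mat &depth,
                                                      float fx, float fy, float cx, float cy,
                                                      float min_depth, float max_depth)
{
    pcl::PointCloud<pcl::PointXYZRGB>::Ptr cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
    for (int v = 0; v < depth.rows; ++v)
    {
        for (int u = 0; u < depth.cols; ++u)
        {
            float z = depth.at<uint16_t>(v, u) / 1000.0f;   // millimeters to meters
            if (z < min_depth || z > max_depth)
                continue;                                   // discard out-of-range depths

            pcl::PointXYZRGB point;
            point.z = z;
            point.x = (u - cx) * z / fx;                    // back-project with the intrinsics
            point.y = (v - cy) * z / fy;

            const cv::Vec3b &bgr = rgb.at<cv::Vec3b>(v, u); // OpenCV stores channels as BGR
            uint32_t packed = (static_cast<uint32_t>(bgr[2]) << 16) |
                              (static_cast<uint32_t>(bgr[1]) << 8) |
                               static_cast<uint32_t>(bgr[0]);
            point.rgb = *reinterpret_cast<float *>(&packed); // pack the color into the rgb field
            cloud->push_back(point);
        }
    }
    return cloud;
}
```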

The function continues iterating through all pixels in the depth image, converting each valid pixel into a point in the point cloud. Once all pixels have been processed, the function returns the resulting point cloud.

This RGB-D data conversion step ensures that the RGB and depth information is combined accurately into a 3D point cloud representation, enabling further processing such as segmentation and object tracking based on the combined data.

The point cloud generated by the RGB-D data conversion can be visualized with appropriate tools to verify its accuracy and its alignment with the original RGB and depth images.

### B. Segmentation using Region Growing

After converting the RGB-D data into a point cloud, we perform segmentation to identify objects and surfaces in the scene. We use the `pcl::RegionGrowingRGB` class from the Point Cloud Library (PCL) in our custom `rgb_segmentation` function for this task. The region growing algorithm considers both RGB color and spatial proximity when grouping points into segments. The `rgb_segmentation` function takes the original point cloud (`cloud`) as input and outputs the segmented point cloud (`colored_cloud`) as well as a vector of point indices representing the individual clusters (`clusters`).

To perform segmentation, the function first creates a search tree using the `pcl::search::KdTree` class, which allows efficient neighborhood searches during region growing. An instance of the `pcl::RegionGrowingRGB` class, named `reg`, is then created. The input cloud is set with `setInputCloud`, and the search method is set to the created search tree with `setSearchMethod`.

Several parameters of the region growing algorithm are
configured to control the segmentation process:

- `setDistanceThreshold` sets the maximum allowed distance between neighboring points for them to be considered part of the same region. Adjusting this value affects the smoothness of the resulting segmentation.
- `setPointColorThreshold` defines the maximum allowed color difference between neighboring points for them to be considered part of the same region. This parameter controls the color similarity criterion for region growing.
- `setRegionColorThreshold` sets the color threshold between regions. If the color difference between two adjacent regions exceeds this threshold, they are kept as separate segments.
- `setMinClusterSize` defines the minimum number of points required for a cluster to be considered valid. Smaller clusters are discarded.

Once the parameters are set, the `extract` function is called on the `reg` object to perform the region growing segmentation. The resulting clusters are stored in the `clusters` vector of `pcl::PointIndices`, where each element holds the indices of the points belonging to one cluster.

The segmented point cloud, with each cluster colored uniquely, is obtained from the `getColoredCloud` function of the `reg` object and assigned to the `colored_cloud` output parameter.
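
Putting these steps together, the segmentation routine might look roughly like the sketch below; the threshold values are illustrative placeholders rather than the values tuned for the project:

```cpp
#include <vector>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/search/kdtree.h>
#include <pcl/segmentation/region_growing_rgb.h>

// Hypothetical sketch of the rgb_segmentation routine described above.
void rgb_segmentation(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr &cloud,
                      pcl::PointCloud<pcl::PointXYZRGB>::Ptr &colored_cloud,
                      std::vector<pcl::PointIndices> &clusters)
{
    pcl::search::KdTree<pcl::PointXYZRGB>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZRGB>);

    pcl::RegionGrowingRGB<pcl::PointXYZRGB> reg;
    reg.setInputCloud(cloud);
    reg.setSearchMethod(tree);
    reg.setDistanceThreshold(0.05f);       // max spatial distance between neighbors (placeholder)
    reg.setPointColorThreshold(6.0f);      // max point-to-point color difference (placeholder)
    reg.setRegionColorThreshold(5.0f);     // max region-to-region color difference (placeholder)
    reg.setMinClusterSize(600);            // discard clusters smaller than this (placeholder)

    reg.extract(clusters);                 // run the region growing segmentation
    colored_cloud = reg.getColoredCloud(); // each cluster receives a unique color
}
```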

By performing region growing segmentation, the function
identifies distinct objects and surfaces in the scene based
on color and spatial proximity. The resulting clusters can be
further analyzed or used for object tracking and recognition
tasks.

### C. Finding Horizontally Flat Faces

In addition to segmenting the point cloud, we also want
to identify objects with horizontally flat faces in the scene.

Examples of horizontally flat faces include the ground, tables,
and beds. To achieve this, we implement a custom function
called `find_horizontal_planes`.

The `find_horizontal_planes` function aims to identify clusters that represent horizontally flat faces in the scene. It takes the original point cloud (`cloud`), the point cloud of surface normals (`cloud_normals`), and the segmented clusters (`clusters`) as inputs, and it populates the `horizontal_plane_clusters` vector with the indices of the clusters that have horizontally flat faces.

To determine if a cluster represents a horizontally flat face,
the function calculates the average surface normal for each
cluster. This is done by iterating over the indices of points
in each cluster and accumulating their corresponding surface
normals. The average normal is then computed by dividing
the accumulated normal components by the number of points
in the cluster.

Next, the function checks whether the absolute value of the x-component of the average normal is less than a threshold (`dot_product_threshold_x`). If the x-component is below the threshold, the average normal is considered close to the vertical axis, suggesting a horizontally flat face.

If a cluster is determined to have a horizontally flat face, its indices are added to the `horizontal_plane_clusters` vector. The resulting `horizontal_plane_clusters` can then be used to analyze or visualize the clusters representing flat faces separately from the other objects and surfaces in the scene.

It is important to note that the threshold value (`dot_product_threshold_x`) may need adjustment depending on the specific characteristics of the scene and the desired sensitivity in detecting flat faces. Fine-tuning this threshold may be necessary to obtain accurate and meaningful results.
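
A sketch of this criterion is given below, assuming per-point normals were already estimated (for example with `pcl::NormalEstimation`); the threshold value is an illustrative placeholder:

```cpp
#include <cmath>
#include <vector>
#include <pcl/PointIndices.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Hypothetical sketch of find_horizontal_planes: keep clusters whose average
// surface normal has a small x-component, per the criterion described above.
void find_horizontal_planes(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr &cloud,
                            const pcl::PointCloud<pcl::Normal>::Ptr &cloud_normals,
                            const std::vector<pcl::PointIndices> &clusters,
                            std::vector<pcl::PointIndices> &horizontal_plane_clusters,
                            float dot_product_threshold_x = 0.1f)
{
    (void)cloud;  // the original cloud is not needed for this particular test
    for (const auto &cluster : clusters)
    {
        // Accumulate the surface normals of the cluster's points.
        float nx = 0.0f, ny = 0.0f, nz = 0.0f;
        for (const int idx : cluster.indices)
        {
            const pcl::Normal &n = cloud_normals->points[idx];
            nx += n.normal_x;
            ny += n.normal_y;
            nz += n.normal_z;
        }
        // Average normal of the cluster.
        const float size = static_cast<float>(cluster.indices.size());
        nx /= size; ny /= size; nz /= size;

        // A small x-component of the average normal marks a horizontally flat face.
        if (std::abs(nx) < dot_product_threshold_x)
            horizontal_plane_clusters.push_back(cluster);
    }
}
```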

To visualize the segmented clusters, the `visualize_clusters` function is provided. It takes the original point cloud (`cloud`), the segmented clusters (`clusters`), the previous frame's cluster centers (`previous_centers`), and the current frame's cluster centers (`current_centers`) as inputs.

The function creates a PCLVisualizer object and adds the
original point cloud to it. It also registers a keyboard callback
to handle user interaction.

For each cluster in the `clusters` vector, the function creates a new point cloud containing only the points belonging to that cluster. The cluster is assigned a unique color using the `pcl::visualization::PointCloudColorHandlerCustom` class and added to the viewer.

Spheres are added to mark the current center of each cluster, using the corresponding coordinates from the `current_centers` vector. If a previous frame is available and `previous_centers` is not empty, spheres and a line are added to connect the previous and current centers.

Bounding boxes are computed for each cluster using the
minimum and maximum coordinates of its points. These
bounding boxes are added to the viewer as semi-transparent
cubes, outlining the extents of each cluster.

A legend is included in the viewer to explain the color coding and shapes used in the visualization. Using the `visualize_clusters` function, we can visually inspect the segmented clusters and their spatial relationships, which aids in understanding and analyzing the scene.

### D. Object Tracking

To track the movement of objects across frames, we implement a simple object tracking algorithm. In each frame, we compare the centers of the detected objects with the previously tracked objects' centers. If a match is found within a specified distance threshold, we update the tracked object's center; otherwise, we add a new object to the list of tracked objects. This process is carried out in the custom `visualize_clusters` function.

The `visualize_clusters` function is responsible for visualizing the clustered point cloud data. It uses the PCLVisualizer class from the Point Cloud Library (PCL) to create a 3D viewer window and display the point cloud.

The function takes several inputs:

- `cloud`: a pointer to the original point cloud data of type `pcl::PointCloud<pcl::PointXYZRGB>::Ptr`.
- `clusters`: a vector of `pcl::PointIndices` representing the indices of the points belonging to each cluster.
- `previous_centers`: a vector of `pcl::PointXYZ` representing the previous centers of the tracked objects.
- `current_centers`: a vector of `pcl::PointXYZ` representing the current centers of the tracked objects.

Inside the function, a new PCLVisualizer object is created,
and the original point cloud is added to it. The background
color of the viewer is set to black. The RGB color handler is
used to visualize the original point cloud.

The function then iterates over each cluster in the `clusters` vector. For each cluster, a new point cloud object is created containing only the points belonging to that cluster. This point cloud is colorized with a randomly generated color and added to the viewer. Additionally, a sphere is added to mark the current center of the tracked object.

If there are previous centers available, a sphere is added
to represent the previous center. If the current and previous
centers exist, a line (cylinder) is added to connect them,
creating a visual representation of the movement trajectory.
A bounding box is calculated for each cluster by finding the
minimum and maximum x, y, and z coordinates of the points.
A semi-transparent cube is added to the viewer to represent
the bounding box of each cluster. A legend is added to the
viewer to explain the color coding and symbols used in the
visualization.
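
The viewer setup described above can be condensed into the following sketch; the colors, radii, and shape identifiers are illustrative choices, and the keyboard callback and legend are omitted for brevity:

```cpp
#include <cstdlib>
#include <string>
#include <vector>
#include <pcl/common/common.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>
#include <pcl/visualization/pcl_visualizer.h>

// Hypothetical sketch of the visualize_clusters viewer setup described above.
void visualize_clusters(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr &cloud,
                        const std::vector<pcl::PointIndices> &clusters,
                        const std::vector<pcl::PointXYZ> &previous_centers,
                        const std::vector<pcl::PointXYZ> &current_centers)
{
    pcl::visualization::PCLVisualizer viewer("Cluster viewer");
    viewer.setBackgroundColor(0.0, 0.0, 0.0);

    // Show the original cloud with its own RGB colors.
    pcl::visualization::PointCloudColorHandlerRGBField<pcl::PointXYZRGB> rgb(cloud);
    viewer.addPointCloud<pcl::PointXYZRGB>(cloud, rgb, "scene");

    for (std::size_t i = 0; i < clusters.size(); ++i)
    {
        const std::string id = std::to_string(i);

        // Copy the cluster's points into their own cloud and give it a random color.
        pcl::PointCloud<pcl::PointXYZRGB>::Ptr cluster_cloud(new pcl::PointCloud<pcl::PointXYZRGB>);
        for (const int idx : clusters[i].indices)
            cluster_cloud->push_back(cloud->points[idx]);
        pcl::visualization::PointCloudColorHandlerCustom<pcl::PointXYZRGB>
            color(cluster_cloud, std::rand() % 256, std::rand() % 256, std::rand() % 256);
        viewer.addPointCloud<pcl::PointXYZRGB>(cluster_cloud, color, "cluster_" + id);

        // Current center as a green sphere; previous center as a red sphere,
        // connected by a grey line showing the displacement between frames.
        if (i < current_centers.size())
            viewer.addSphere(current_centers[i], 0.02, 0.0, 1.0, 0.0, "current_" + id);
        if (i < previous_centers.size() && i < current_centers.size())
        {
            viewer.addSphere(previous_centers[i], 0.02, 1.0, 0.0, 0.0, "previous_" + id);
            viewer.addLine(previous_centers[i], current_centers[i], 0.7, 0.7, 0.7, "move_" + id);
        }

        // Semi-transparent bounding box from the cluster's min/max coordinates.
        pcl::PointXYZRGB min_pt, max_pt;
        pcl::getMinMax3D(*cluster_cloud, min_pt, max_pt);
        viewer.addCube(min_pt.x, max_pt.x, min_pt.y, max_pt.y, min_pt.z, max_pt.z,
                       1.0, 1.0, 1.0, "box_" + id);
        viewer.setShapeRenderingProperties(pcl::visualization::PCL_VISUALIZER_OPACITY,
                                           0.3, "box_" + id);
    }

    viewer.spin();  // block until the viewer window is closed
}
```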

The `calculateClusterCenter` function computes the center coordinates of a cluster by averaging the x, y, and z coordinates of its constituent points. The `calculateDistance` function computes the Euclidean distance between two points from their x, y, and z coordinates.
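
Both helpers are straightforward; the sketch below follows the description above, though the exact signatures in the project may differ:

```cpp
#include <cmath>
#include <pcl/PointIndices.h>
#include <pcl/point_cloud.h>
#include <pcl/point_types.h>

// Hypothetical sketch of calculateClusterCenter: average the coordinates of the cluster's points.
pcl::PointXYZ calculateClusterCenter(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr &cloud,
                                     const pcl::PointIndices &cluster)
{
    pcl::PointXYZ center;
    center.x = center.y = center.z = 0.0f;
    for (const int idx : cluster.indices)
    {
        center.x += cloud->points[idx].x;
        center.y += cloud->points[idx].y;
        center.z += cloud->points[idx].z;
    }
    const float n = static_cast<float>(cluster.indices.size());
    center.x /= n;
    center.y /= n;
    center.z /= n;
    return center;
}

// Hypothetical sketch of calculateDistance: Euclidean distance between two 3D points.
float calculateDistance(const pcl::PointXYZ &a, const pcl::PointXYZ &b)
{
    return std::sqrt((a.x - b.x) * (a.x - b.x) +
                     (a.y - b.y) * (a.y - b.y) +
                     (a.z - b.z) * (a.z - b.z));
}
```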

The object tracking algorithm tracks the movement of
objects across frames by comparing the centers of the detected
objects with the previously tracked objects’ centers. If the
distance between a detected object’s center and a tracked
object’s center is within a specified threshold, the tracked
object’s center is updated. Otherwise, a new object is added
to the tracked objects list.

The tracked objects are represented by their center coordinates (`pcl::PointXYZ`) and an occurrence count, which records how many times the object has been detected across frames. This occurrence count can be used to analyze the frequency of object appearances.
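
The matching step can be summarized with the sketch below; `TrackedObject`, `update_tracks`, and the distance threshold are assumptions for illustration rather than the project's exact types and values:

```cpp
#include <cmath>
#include <vector>
#include <pcl/point_types.h>

// Hypothetical representation of a tracked object: its last known center and
// the number of frames in which it has been detected.
struct TrackedObject
{
    pcl::PointXYZ center;
    int occurrences;
};

// Hypothetical sketch of the per-frame tracking update described above.
void update_tracks(std::vector<TrackedObject> &tracked,
                   const std::vector<pcl::PointXYZ> &current_centers,
                   float distance_threshold = 0.2f)
{
    for (const auto &center : current_centers)
    {
        bool matched = false;
        for (auto &object : tracked)
        {
            const float dx = object.center.x - center.x;
            const float dy = object.center.y - center.y;
            const float dz = object.center.z - center.z;
            if (std::sqrt(dx * dx + dy * dy + dz * dz) < distance_threshold)
            {
                object.center = center;   // update the matched object's position
                ++object.occurrences;     // one more frame in which it was seen
                matched = true;
                break;
            }
        }
        if (!matched)
            tracked.push_back({center, 1});  // start tracking a newly detected object
    }
}
```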

By tracking the objects’ centers over time and visualizing
their trajectories, you can study their movement patterns and
analyze their behavior in the scene.

## III. RESULTS

The implemented code successfully performs point cloud segmentation, identifies horizontally flat faces, and tracks objects across frames. The visualization provides a clear understanding of the segmentation results, flat face detection, and object tracking.

### A. Segmentation Results

The region growing segmentation algorithm effectively divides the point cloud into distinct segments based on color and
spatial proximity. The resulting segmented clusters represent
different objects or surfaces in the scene. The colored cloud
visualization enables a visual assessment of the segmentation
quality, with each segment assigned a unique color.

Figure 1 shows the resulting segmented point cloud data. Each color represents a different segment, indicating the identified objects or surfaces. The selected images make the segmentation task somewhat difficult: the scene cannot easily be separated using RGB values alone, and although the distance threshold is a useful tool, it is insufficient on its own for separating horizontally flat surfaces. Oversegmenting the image is one way to capture every flat face. The segmentation algorithm successfully separates the objects from the background and distinguishes the different objects within the scene.

### B. Horizontally Flat Face Detection

The custom `find_horizontal_planes` function successfully identifies objects with horizontally flat faces in the segmented clusters.
By analyzing the surface normals of the points within each
cluster, the function determines if a cluster has a horizontally
flat face. The resulting point cloud of flat faces provides
insights into the layout and structure of the scene.

Figure 2 shows the resulting point cloud with horizontally flat faces highlighted. These flat faces correspond to objects such as the ground, tables, and stools. Detecting and visualizing the flat faces separately makes it easy to identify and analyze the structures and objects with horizontal surfaces. Because the normal-based test for flat faces is sensitive, finding a balance between precision and recall is critical; Figure 2 shows a high-precision but low-recall example.

### C. Object Tracking Performance

The implemented object tracking algorithm accurately
tracks the movement of objects across frames. By comparing
the centers of the detected objects in each frame with the
previously tracked objects’ centers, the algorithm updates the
object positions and keeps track of their occurrences. The
tracked objects’ trajectories can be visualized, allowing for
the analysis of object movements and patterns.

Figure 3 shows the object tracking results. The green spheres represent the tracked objects' centers in each frame, while the red circles represent the previous centers of the objects. The tracking algorithm follows each object based on its center and the distance between the segment centers of two consecutive frames. Bounding boxes also cover the selected clusters; the size of these boxes could be an even better object-tracking feature than the centers, but we did not have enough time to pursue this. The program also draws a grey line between the old and new centers to show their movement.

## IV. CONCLUSION

In this project, we have presented an implementation of
point cloud segmentation and object tracking using RGB-D
data. The code utilizes the Point Cloud Library (PCL) and
OpenCV libraries to perform efficient processing and visualization. The implemented algorithms successfully segment the point cloud, identify objects with horizontally flat faces, and track object movements across frames.

The segmentation results provide a clear understanding of
the objects and surfaces in the scene. The detection of horizontally flat faces helps in identifying specific structures and
objects of interest. The object tracking algorithm accurately
tracks the movement of objects, allowing for analysis of their
trajectories and behavior.

In conclusion, this project demonstrates the effectiveness of
point cloud segmentation and object tracking using RGB-D
data. The implemented code can be used in various applications, such as robotics, augmented reality, and autonomous
navigation.

## V. FUTURE WORK

Several aspects can be further improved in this project.
First, more sophisticated segmentation algorithms can be explored to enhance the segmentation accuracy. For example,
deep learning-based approaches or advanced region growing
techniques can be investigated. These methods can potentially
handle more complex scenes and improve the quality of the
segmented clusters.

Second, robust object tracking methods can be employed to
improve the tracking performance. Techniques such as Kalman
filters, particle filters, or deep learning-based trackers can be
implemented to handle occlusions, scale changes, and other
challenges in object tracking. These methods can enhance the
accuracy and robustness of the tracking system, enabling better
object tracking across frames.

Furthermore, the code can be optimized for real-time processing and applied to larger datasets for more extensive
evaluations. Efforts can be made to improve the efficiency
of the algorithms, such as implementing parallel processing
or GPU acceleration. Additionally, the code can be tested on
larger datasets with more diverse scenes to assess its scalability
and generalization capabilities.

In conclusion, there is ample room for future enhancements
and research in the field of point cloud segmentation and
object tracking. The combination of RGB-D data and advanced
algorithms opens up new possibilities for understanding and
interacting with the 3D world.


## Authors

👤 **Aras Güngöre**

* LinkedIn: [@arasgungore](https://www.linkedin.com/in/arasgungore)
* GitHub: [@arasgungore](https://github.com/arasgungore)

👤 **Arif Yılmaz**

* LinkedIn: [@arif1yilmaz](https://www.linkedin.com/in/arif1yilmaz)
* GitHub: [@arfylmaz](https://github.com/arfylmaz)