{"id":13497613,"url":"https://github.com/kcg2015/Vehicle-Detection-and-Tracking","last_synced_at":"2025-03-28T22:31:54.000Z","repository":{"id":45705501,"uuid":"107626347","full_name":"kcg2015/Vehicle-Detection-and-Tracking","owner":"kcg2015","description":"Computer vision based vehicle detection and tracking using Tensorflow Object Detection API and Kalman-filtering","archived":false,"fork":false,"pushed_at":"2020-05-23T16:36:20.000Z","size":76214,"stargazers_count":537,"open_issues_count":15,"forks_count":191,"subscribers_count":19,"default_branch":"master","last_synced_at":"2024-10-31T14:36:33.473Z","etag":null,"topics":["bayesian-filter","bounding-boxes","computer-vision","detection","hungarian-algorithm","kalman-filtering","keras","linear-assignment-problem","mobilenet-ssd","object-detection","occlusion","single-shot-multibox-detector","tensorflow-object-detection-api","tracking"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kcg2015.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-10-20T03:11:27.000Z","updated_at":"2024-10-17T15:32:49.000Z","dependencies_parsed_at":"2022-08-27T19:41:54.981Z","dependency_job_id":null,"html_url":"https://github.com/kcg2015/Vehicle-Detection-and-Tracking","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kcg2015%2FVehicle-Detection-and-Tracking","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kcg2015%2FVehicle-Detection-and-Tracking/tags","releases_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/kcg2015%2FVehicle-Detection-and-Tracking/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kcg2015%2FVehicle-Detection-and-Tracking/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kcg2015","download_url":"https://codeload.github.com/kcg2015/Vehicle-Detection-and-Tracking/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246110266,"owners_count":20725022,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["bayesian-filter","bounding-boxes","computer-vision","detection","hungarian-algorithm","kalman-filtering","keras","linear-assignment-problem","mobilenet-ssd","object-detection","occlusion","single-shot-multibox-detector","tensorflow-object-detection-api","tracking"],"created_at":"2024-07-31T20:00:34.656Z","updated_at":"2025-03-28T22:31:48.981Z","avatar_url":"https://github.com/kcg2015.png","language":"Python","funding_links":[],"categories":["Libraries"],"sub_categories":["Object Detection"],"readme":"# Vehicle Detection and Tracking\n\n\n## Overview\nThis repo illustrates the detection and tracking of multiple vehicles using a camera mounted inside a self-driving car. The aim here is to provide developers, researchers, and engineers with a simple framework for quickly iterating on different detectors and tracking algorithms. In the process, I focus on simplicity and readability of the code. The detection and tracking pipeline is relatively straightforward. It first initializes a detector and a tracker. 
Next, the detector localizes the vehicles in each video frame. The tracker is then updated with the detection results. Finally, the tracking results are annotated and displayed in a video frame.\n\n## Key files in this repo\n  \n  \n  * detector.py -- implements the ```CarDetector``` class to output car detection results\n  * tracker.py  -- implements Kalman Filter-based prediction and update for tracking\n  * main.py -- implements the detection and tracking pipeline, including detection-track assignment and track management\n  * helpers.py -- helper functions\n  * ssd_mobilenet_v1_coco_11_06_2017/frozen_inference_graph.pb -- pre-trained mobilenet-coco model\n\n## Detection\nIn the pipeline, vehicle (car) detection takes a captured image as input and produces bounding boxes as output. We use the TensorFlow Object Detection API, an open source framework built on top of TensorFlow to construct, train, and deploy object detection models. The Object Detection API also comes with a collection of detection models pre-trained on the COCO dataset that are well suited for fast prototyping. Specifically, we use a lightweight model, ssd\\_mobilenet\\_v1\\_coco, that is based on the Single Shot Multibox Detection (SSD) framework with minimal modification. Though this is a general-purpose detection model (not specifically optimized for vehicle detection), we find that this model strikes a good balance between bounding box accuracy and inference time.\n\nThe detector is implemented in the ```CarDetector``` class in detector.py. The output is the coordinates of the bounding boxes (in the format [y\\_up, x\\_left, y\\_down, x\\_right]) of all the detected vehicles.\n\nThe COCO label map used by the model has class IDs ranging from 1 to 90, and most of the first 14 relate to transportation or street scenes, including bicycle, car, and bus. 
The ID for car is 3.\n\n```\ncategory_index={1: {'id': 1, 'name': u'person'},\n                        2: {'id': 2, 'name': u'bicycle'},\n                        3: {'id': 3, 'name': u'car'},\n                        4: {'id': 4, 'name': u'motorcycle'},\n                        5: {'id': 5, 'name': u'airplane'},\n                        6: {'id': 6, 'name': u'bus'},\n                        7: {'id': 7, 'name': u'train'},\n                        8: {'id': 8, 'name': u'truck'},\n                        9: {'id': 9, 'name': u'boat'},\n                        10: {'id': 10, 'name': u'traffic light'},\n                        11: {'id': 11, 'name': u'fire hydrant'},\n                        13: {'id': 13, 'name': u'stop sign'},\n                        14: {'id': 14, 'name': u'parking meter'}} \n```\nThe following code snippet implements the actual detection using the TensorFlow API.\n\n```\n(boxes, scores, classes, num_detections) = self.sess.run(\n                  [self.boxes, self.scores, self.classes, self.num_detections],\n                  feed_dict={self.image_tensor: image_expanded})\n```    \nHere ```boxes```, ```scores```, and ```classes``` represent the bounding box, confidence level, and class ID of each detection, respectively. Next, we select the detections that are cars and have a confidence greater than a threshold (0.3 in this case). 
\n```\nidx_vec = [i for i, v in enumerate(cls) if ((v==3) and (scores[i]\u003e0.3))]\n```\nTo detect all kinds of vehicles, we also include the class IDs for bus and truck.\n```\nidx_vec = [i for i, v in enumerate(cls) if (((v==3) or (v==6) or (v==8)) and (scores[i]\u003e0.3))]\n```\nTo further reduce possible false positives, we include thresholds for bounding box width, height, and height-to-width ratio.\n\n```\nif ((ratio \u003c 0.8) and (box_h\u003e20) and (box_w\u003e20)):\n    tmp_car_boxes.append(box)\n    print(box, ', confidence: ', scores[idx], 'ratio:', ratio)\nelse:\n    print('wrong ratio or wrong size, ', box, ', confidence: ', scores[idx], 'ratio:', ratio)\n```\n\n## Kalman Filter for Bounding Box Measurement\n\nWe use a Kalman filter for tracking objects. The Kalman filter has the following important features that tracking can benefit from:\n\n* Prediction of an object's future location\n* Correction of the prediction based on new measurements\n* Reduction of noise introduced by inaccurate detections\n* Easier association of multiple objects with their tracks\n\nThe Kalman filter consists of two steps: prediction and update. The first step uses previous states to predict the current state. The second step uses the current measurement, such as the detection bounding box location, to correct the state. 
The formulas are given below:\n\n### Kalman Filter Equations:\n#### Prediction phase: notations\n\u003cimg src=\"example_imgs/pred_notations.gif\" alt=\"Drawing\" style=\"width: 250px;\"/\u003e\n#### Prediction phase: equations\n\u003cimg src=\"example_imgs/KF_predict.gif\" alt=\"Drawing\" style=\"width: 125px;\"/\u003e\n#### Update phase: notations\n\u003cimg src=\"example_imgs/update_notations.gif\" alt=\"Drawing\" style=\"width: 250px;\"/\u003e\n#### Update phase: equations\n\u003cimg src=\"example_imgs/KF_update.gif\" alt=\"Drawing\" style=\"width: 200px;\"/\u003e\n\n### Kalman Filter Implementation\nIn this section, we describe the implementation of the Kalman filter in detail.\n\nThe state vector has eight elements as follows:\n```\n[up, up_dot, left, left_dot, down, down_dot, right, right_dot]\n```\nThat is, we use the coordinates of the upper-left and lower-right corners of the bounding box, together with their first-order derivatives.\n\nThe process matrix, assuming constant velocity (thus no acceleration), is:\n\n```\nself.F = np.array([[1, self.dt, 0,  0,  0,  0,  0, 0],\n                    [0, 1,  0,  0,  0,  0,  0, 0],\n                    [0, 0,  1,  self.dt, 0,  0,  0, 0],\n                    [0, 0,  0,  1,  0,  0,  0, 0],\n                    [0, 0,  0,  0,  1,  self.dt, 0, 0],\n                    [0, 0,  0,  0,  0,  1,  0, 0],\n                    [0, 0,  0,  0,  0,  0,  1, self.dt],\n                    [0, 0,  0,  0,  0,  0,  0,  1]])\n```\nThe measurement matrix, given that the detector only outputs the coordinates (not the velocities), is:\n\n```\nself.H = np.array([[1, 0, 0, 0, 0, 0, 0, 0],\n                   [0, 0, 1, 0, 0, 0, 0, 0],\n                   [0, 0, 0, 0, 1, 0, 0, 0], \n                   [0, 0, 0, 0, 0, 0, 1, 0]])\n```\nThe state, process, and measurement noise covariances are:\n\n```\n # Initialize the state covariance\n self.L = 100.0\n self.P = np.diag(self.L*np.ones(8))\n        \n        \n # Initialize the process covariance\n 
self.Q_comp_mat = np.array([[self.dt**4/2., self.dt**3/2.],\n                                    [self.dt**3/2., self.dt**2]])\n self.Q = block_diag(self.Q_comp_mat, self.Q_comp_mat, \n                            self.Q_comp_mat, self.Q_comp_mat)\n        \n # Initialize the measurement covariance\n self.R_scaler = 1.0/16.0\n self.R_diag_array = self.R_scaler * np.array([self.L, self.L, self.L, self.L])\n self.R = np.diag(self.R_diag_array)\n```\nHere ```self.R_scaler``` represents the \"magnitude\" of measurement noise relative to state noise. A low ```self.R_scaler``` indicates a more reliable measurement. The following figures visualize the impact of measurement noise on the Kalman filter update. The green bounding box represents the predicted (initial) state. The red bounding box represents the measurement.\nIf the measurement noise is low, the updated state (aqua bounding box) is very close to the measurement (the aqua bounding box completely overlaps the red bounding box).\n\n\u003cimg src=\"example_imgs/low_meas_noise.png\" alt=\"Drawing\" style=\"width: 300px;\"/\u003e\n\nIn contrast, if the measurement noise is high, the updated state is very close to the initial prediction (the aqua bounding box completely overlaps the green bounding box).\n\n\u003cimg src=\"example_imgs/high_meas_noise.png\" alt=\"Drawing\" style=\"width: 300px;\"/\u003e\n\n## Detection-to-Tracker Assignment\n\nThe function ```assign_detections_to_trackers(trackers, detections, iou_thrd = 0.3)``` takes the current list of trackers and the new detections, and outputs matched detections, unmatched trackers, and unmatched detections.\n\n\u003cimg src=\"example_imgs/vehcle_detection_tracking.png\" alt=\"Drawing\" style=\"width: 300px;\"/\u003e\n\n### Linear Assignment and Hungarian (Munkres) algorithm\n\nIf there are multiple detections, we need to match (assign) each of them to a tracker. We use the intersection over union (IOU) of a tracker bounding box and a detection bounding box as the metric. 
We solve the assignment problem of maximizing the sum of IOU using the Hungarian algorithm (also known as the Munkres algorithm). Older versions of the machine learning package scikit-learn include a built-in utility function that implements the Hungarian algorithm (newer releases removed it in favor of ```scipy.optimize.linear_sum_assignment```).\n\n```\nmatched_idx = linear_assignment(-IOU_mat)   \n```\nNote that ```linear_assignment``` by default minimizes an objective function, so we need to reverse the sign of ```IOU_mat``` for maximization.\n\n### Unmatched detections and trackers\n\nBased on the linear assignment results, we keep two lists for unmatched detections and unmatched trackers, respectively. When a car enters the frame and is first detected, it is not matched with any existing track, so this detection is referred to as an unmatched detection, as shown in the following figure. When a car leaves the frame, the previously established track has no detection to associate with; in this scenario, the track is referred to as an unmatched track. In addition, any matching with an overlap less than ```iou_thrd``` signifies the existence of an untracked object, so the tracker and the detection associated in that matching are added to the lists of unmatched trackers and unmatched detections, respectively.\n\n\u003cimg src=\"example_imgs/detection_track_match.png\" alt=\"Drawing\" style=\"width: 300px;\"/\u003e\n\n## Pipeline\n\nWe include two important design parameters, ```min_hits``` and ```max_age```, in the pipeline. The parameter ```min_hits``` is the number of consecutive matches needed to establish a track. The parameter ```max_age``` is the number of consecutive frames in which a track can go unmatched before it is deleted. Both parameters need to be tuned to improve the tracking and detection performance.\n\nThe pipeline deals with matched detections, unmatched detections, and unmatched trackers sequentially. We annotate the tracks that meet the ```min_hits``` and ```max_age``` conditions. Proper bookkeeping is also needed to delete the stale tracks. 
\n\nThe following examples show the process of the pipeline. When the car is first detected in the first video frame, running the following line of code returns an empty list, a one-element list, and an empty list for ```matched```, ```unmatched_dets```, and ```unmatched_trks```, respectively. \n\n```\nmatched, unmatched_dets, unmatched_trks \\\n    = assign_detections_to_trackers(x_box, z_box, iou_thrd = 0.3) \n```\nWe thus have a situation of unmatched detections. Unmatched detections are processed by the following code block:\n\n```\nif len(unmatched_dets)\u003e0:\n        for idx in unmatched_dets:\n            z = z_box[idx]\n            z = np.expand_dims(z, axis=0).T\n            tmp_trk = Tracker() # Create a new tracker\n            x = np.array([[z[0], 0, z[1], 0, z[2], 0, z[3], 0]]).T\n            tmp_trk.x_state = x\n            tmp_trk.predict_only()\n            xx = tmp_trk.x_state\n            xx = xx.T[0].tolist()\n            xx =[xx[0], xx[2], xx[4], xx[6]]\n            tmp_trk.box = xx\n            tmp_trk.id = track_id_list.popleft() # assign an ID for the tracker\n            tracker_list.append(tmp_trk)\n            x_box.append(xx)\n```\nThis code block carries out two important tasks: 1) creating a new tracker ```tmp_trk``` for the detection; 2) carrying out the Kalman filter's predict stage ```tmp_trk.predict_only()```. Note that this newly created track is still in its probation period, i.e., ```trk.hits = 0```, so it is not yet established at the end of the pipeline. The output image is the same as the input image: the detection bounding box is not annotated.\n\n\u003cimg src=\"example_imgs/frame_01_det_track.png\" alt=\"Drawing\" style=\"width: 150px;\"/\u003e\n\nWhen the car is detected again in the second video frame, running ```assign_detections_to_trackers``` returns a one-element list, an empty list, and an empty list for ```matched```, ```unmatched_dets```, and ```unmatched_trks```, respectively. 
As shown in the following figure, we have a matched detection, which is processed by the following code block:\n\n```\nif matched.size \u003e0:\n        for trk_idx, det_idx in matched:\n            z = z_box[det_idx]\n            z = np.expand_dims(z, axis=0).T\n            tmp_trk = tracker_list[trk_idx]\n            tmp_trk.kalman_filter(z)\n            xx = tmp_trk.x_state.T[0].tolist()\n            xx =[xx[0], xx[2], xx[4], xx[6]]\n            x_box[trk_idx] = xx\n            tmp_trk.box =xx\n            tmp_trk.hits += 1\n```\nThis code block carries out two important tasks: 1) carrying out the Kalman filter's prediction and update stages ```tmp_trk.kalman_filter()```; 2) increasing the hits of the track by one ```tmp_trk.hits += 1```. With this update, the condition ```if ((trk.hits \u003e= min_hits) and (trk.no_losses \u003c=max_age))``` is satisfied, so the track is fully established. As a result, the bounding box is annotated in the output image, as shown in the figure below.\n\n\u003cimg src=\"example_imgs/frame_02_det_track.png\" alt=\"Drawing\" style=\"width: 150px;\"/\u003e\n\n## Issues\n\nThe main issue is occlusion. For example, when one car is passing another car, the two cars can be very close to each other. This can fool the detector into outputting a single (and possibly bigger) bounding box instead of two separate bounding boxes. In addition, the tracking algorithm may treat this detection as a new detection and set up a new track. The tracking algorithm may fail again when one of the cars moves away from the other.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkcg2015%2FVehicle-Detection-and-Tracking","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkcg2015%2FVehicle-Detection-and-Tracking","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkcg2015%2FVehicle-Detection-and-Tracking/lists"}