{"id":35276311,"url":"https://github.com/nirabo/carnd-traffic-sign-classifier","last_synced_at":"2026-05-18T08:35:09.834Z","repository":{"id":180440269,"uuid":"87177768","full_name":"nirabo/carnd-traffic-sign-classifier","owner":"nirabo","description":"Traffic Sign Classification using a CNN","archived":false,"fork":false,"pushed_at":"2018-01-15T22:51:00.000Z","size":22321,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2024-06-03T12:05:47.371Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/nirabo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-04-04T11:17:23.000Z","updated_at":"2024-06-03T12:05:50.918Z","dependencies_parsed_at":null,"dependency_job_id":"e2913f7c-9636-4830-be98-44340b8c9e41","html_url":"https://github.com/nirabo/carnd-traffic-sign-classifier","commit_stats":null,"previous_names":["bubalazi/carnd-traffic-sign-classifier","nirabo/carnd-traffic-sign-classifier"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/nirabo/carnd-traffic-sign-classifier","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nirabo%2Fcarnd-traffic-sign-classifier","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nirabo%2Fcarnd-traffic-sign-classifier/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nirabo%2Fcarnd-traffic-sign-classifier/releases","manifests_url":"https://
repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nirabo%2Fcarnd-traffic-sign-classifier/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/nirabo","download_url":"https://codeload.github.com/nirabo/carnd-traffic-sign-classifier/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/nirabo%2Fcarnd-traffic-sign-classifier/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":33170904,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-18T05:43:36.989Z","status":"ssl_error","status_checked_at":"2026-05-18T05:43:19.133Z","response_time":71,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-12-30T13:58:47.555Z","updated_at":"2026-05-18T08:35:09.828Z","avatar_url":"https://github.com/nirabo.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# **CarND Project 2: Traffic Sign Recognition**\nAuthor: Lyuboslav Petrov\n\n\u003c!--Image References--\u003e\n[snow]: ./data/de_traffic_signs/1_snow.png \"General Caution in Snow\"\n[noentry]: ./data/de_traffic_signs/2_noentry.png \"No Entry\"\n[noentry2]: ./data/de_traffic_signs/3_noentry.png \"No Entry drawing\"\n[roundabout]: ./data/de_traffic_signs/4_roundabout.png \"Roundabout\"\n[limit30]: ./data/de_traffic_signs/5_limit30.png \"Speed Limit 30\"\n[softmax]: 
./doc/softmax.png \"Softmax probabilities\"\n[real-world]: ./doc/real-world.png \"Real World Accuracy\"\n[results]: ./doc/results.png \"Training Results after 50 Epochs\"\n[balanced_train]: ./doc/balanced_train.png \"Balanced out classes\"\n[balanced]: ./doc/balanced.png \"Balanced out classes\"\n[perturbations]: ./doc/perturbations.png \"Image Perturbations\"\n[classes]: ./doc/all_classes.png \"Examples of all classes\"\n[distribution]: ./doc/distribution.png \"Sample Size Distribution\"\n\n---\n## Summary\n\nThis document outlines the work performed in analyzing the [German Traffic Sign Benchmark Dataset](http://benchmark.ini.rub.de/) from the Ruhr-University, Bochum, Germany, as\npart of the [Self Driving Car Engineer](https://www.udacity.com/) nanodegree from Udacity.\nA convolutional neural network was trained to a validation accuracy of ~94% and a testing\naccuracy of ~95%. Real-world testing with images from the internet showed results\napproaching 30% accuracy.\n\n---\n\n## Introduction\n\nSigns and markings are among the main characteristics of a regulated road.\nIt is therefore of great interest to the self-driving car research domain to\nfind accurate, fast and resilient algorithms for image-based road sign detection\nand classification. This work details the ***classification*** aspect, assuming the\nsigns were already detected.\n\n---\n\n## Methods\n\n### Data Summary\n\nThe dataset consists of labeled images organized in train, test and validation sets:\n  * Number of training examples: **34799**\n  * Number of testing examples: **4410**\n  * Image data shape: **32, 32, 3**\n  * Number of classes: **43**\n\n![][classes]\n\nIt is evident from the above figure that brightness,\ncontrast and resolution vary greatly in the data. 
However, the targets/signs are brought to the\nimage foreground and occupy the centre of every sample, with the majority of pixels,\nin most cases, belonging to the signs.\n\nThe samples per class distribution of all the sets can be seen below:\n\n![][distribution]\n\nAs can be seen, the distributions across the different sets are very similar, but\nthe number of samples varies greatly among classes. It was therefore\nnecessary to balance out the classes by generating *surrogate* data, based on the\nexisting dataset.\n\n### Pre-processing Steps\n\nAlthough the images have already undergone preprocessing steps (ROI cropping),\nfurther preprocessing was seen as an efficient method for optimizing performance.\n\n#### Class Balancing with Surrogate data\n\nSince the sample distribution among classes was seen to vary greatly, it was\ndecided to augment the lower-sample-count classes with artificially created data.\n\nAnother beneficial aspect of adding perturbations to the data is that\nthe network becomes more robust and less likely to overfit.\n\nThe methods for creating this data were all based on perturbing the existing\nsamples. The perturbations chosen were:\n\n1. Image **Rotation** by +/- 6 to 9 degrees around the image centre\n2. Image **Translation** by +/- 3 pixels along the x and y axes\n3. Image **Affine** transformation\n4. 
Image **Perspective** warping\n\n![][perturbations]\n\nUsing these techniques for image generation and class balancing, several balancing\nthresholds were tested, namely the **median**, **mean** and **max** sample counts\namong the classes, and the **max** threshold was chosen as final.\n\nThe results from class balancing can be seen below.\n\n***NOTE:*** Images were converted to float32 and therefore their colorspace\nis depicted differently by matplotlib.\n\n![][balanced]\n\nThe resulting distribution for the training set is shown below. The total training\nset size changed from 34799 to 86429, hence the surrogate data represents **~60%** of\nall training data!\n\n![][balanced_train]\n\n#### Grayscale and Normalization\n\nImages were then converted to grayscale and normalized between 0 and 1.\n\n### Network Architecture\n\nSeveral network architectures were iterated through. First, the LeNet convolutional network was taken and adapted to work with the traffic sign data set by changing it to 43 categories instead of 10. In the first iterations it was\nobserved that the 3 channels of the image do not contribute towards better accuracy, so the pre-processing was extended to include not only normalization but also a colorspace conversion to grayscale. In addition, multiple filter sizes were tested with the LeNet architecture, at which point the necessity of parametrization was recognized (see below). Further, two dropout layers were added after the first two Fully-Connected layers, which brought the accuracy towards 0.8-0.9. Multiple filter depths were tested, and with filter depths of (64, 128) for the first two convolutional layers, the network reached 0.91 accuracy. A further test was made with the addition of a third convolutional layer, where final results came to ~0.95 accuracy. 
\n\nDetails of the layer dimensions can be found below.\n\nIn order to iterate through multiple network architectures, it is necessary\nto make the network models parametric, so interdependencies between variables can\nbe resolved dynamically.\n\nFirst, the layer dimensions are sequentially defined. Example:\n\n    layers = {}\n    layers.update({\n        'c1':{\n            'd': n_channels * 9,\n            'fx': 5,\n            'fy': 5\n        }\n    })\n    layers.update({\n        'c2':{\n            'd': layers['c1']['d'] * 6,\n            'fx': 5,\n            'fy': 5\n        }\n    })\n    layers.update({\n        'c3':{\n            'd': layers['c2']['d'] * 4,\n            'fx': 5,\n            'fy': 5\n        }\n    })\n    layers.update({\n        'f0': {\n            # Resulting flat size = n_channels * 9 * 6 * 4 = 1 * 9 * 6 * 4 = 216\n            'in': layers['c3']['d'],\n            'out': 480\n        }\n    })\n    layers.update({\n        'f1': {\n            'in': layers['f0']['out'],\n            'out': 240\n        }\n    })\n    layers.update({\n        'f2': {\n            'in': layers['f1']['out'],\n            'out': 43\n        }\n    })\n\nNext, the weight and bias objects (python dictionaries) are constructed:\n\n    weights = {\n        'wc1': tfhe((layers['c1']['fx'], layers['c1']['fy'], n_channels, layers['c1']['d'])),\n        'wc2': tfhe((layers['c2']['fx'], layers['c2']['fy'], layers['c1']['d'], layers['c2']['d'])),\n        'wc3': tfhe((layers['c3']['fx'], layers['c3']['fy'], layers['c2']['d'], layers['c3']['d'])),\n        'wf0': tfhe((layers['f0']['in'], layers['f0']['out'])),\n        'wf1': tfhe((layers['f1']['in'], layers['f1']['out'])),\n        'wf2': tfhe((layers['f2']['in'], layers['f2']['out']))\n    }\n\n    biases = {\n        'bc1': tf.Variable(tf.zeros(layers['c1']['d'])),\n        'bc2': tf.Variable(tf.zeros(layers['c2']['d'])),\n        'bc3': tf.Variable(tf.zeros(layers['c3']['d'])),\n        'bf0': 
tf.Variable(tf.zeros(layers['f0']['out'])),\n        'bf1': tf.Variable(tf.zeros(layers['f1']['out'])),\n        'bf2': tf.Variable(tf.zeros(layers['f2']['out']))\n    }\n\nwhere the *tfhe* function refers to the initialization routine detailed in [1].\n\nThe initial architecture chosen was LeNet's convolutional network as detailed [here](https://classroom.udacity.com/nanodegrees/nd013/parts/fbf77062-5703-404e-b60c-95b78b2f3f9e/modules/6df7ae49-c61c-4bb2-a23e-6527e69209ec/lessons/601ae704-1035-4287-8b11-e2c2716217ad/concepts/d4aca031-508f-4e0b-b493-e7b706120f81). The parameter tweaking and\nperformance testing showed that stacking another convolutional layer is of greater\nbenefit than increasing the number of parameters (i.e. depth vs width).\n\nThe final architecture chosen is three consecutive convolutional layers with average pooling,\nequal strides and equal filter widths (**w_c(0,1,2) = 5x5**), and respective filter\ndepths (**d_c(0,1,2) = 9, 54, 216**). These are followed by three consecutive fully connected layers with respective widths (**w_fc(0,1,2) = 480, 240, 43**).\n\n| # | Layer                 |     Description\t                              |  Output\n|:-:|:---------------------:|:---------------------------------------------:|:-------------------:|\n|1  | Input                 | Grayscale image                               | 32x32x1\n|2  | Convolution (5x5x9)   | 1x1 Stride, Valid Padding                     | 28x28x9\n|3  | ReLu                  |                                               | 28x28x9\n|4  | Average pooling       | 2x2 stride                                    | 14x14x9\n|5  | Convolution (5x5x54)  | 1x1 Stride, Valid Padding                     | 10x10x54\n|6  | ReLu                  |                                               | 10x10x54\n|7  | Average pooling       | 2x2 stride                                    | 5x5x54\n|8  | Convolution (5x5x216) | 1x1 Stride, Valid Padding                     | 1x1x216\n|9  
| ReLu                  |                                               | 1x1x216\n|10 | Average pooling       | 2x2 stride                                    | 1x1x216\n|11 | Fully connected\t\t| Flattened network (1x216)                     | 1x480\n|12 | Dropout       \t\t| val=0.85                                      | 1x480\n|13 | Fully connected\t\t|                                               | 1x240\n|14 | Dropout       \t\t| val=0.85                                      | 1x240\n|15 | Fully connected\t\t|                                               | 1x43\n\n### Train - Validate - Test\n\nThe network was trained and optimized for **50 Epochs** with a **Batch Size of 128** using:\n\n| # | Layer                 |     Description\t                            |  Output\n|:-:|:---------------------:|:---------------------------------------------:|:-------------------:|\n| 1 | Softmax               | Cross Entropy with Logits                     | 1x43\n| 2 | Loss Operation        | Reduce entropy with mean                      | 1x43\n| 3 | Optimizer             | Adam Optimizer (learning_rate = 0.0007)       | 1x43\n\n---\n\n## Results\n\n### Validation Accuracy\n![][results]\n\n### Testing Accuracy\nThe testing accuracy achieved was in the range of 0.950-0.960.\n\n### Real World Testing Accuracy\n\nTesting with images downloaded from a Google image search with the keywords \"German traffic signs\"\nresulted in an accuracy of **0.30**.\n\n![][real-world]\n\nThe softmax probabilities for 5 randomly chosen real-world images are as follows:\n\n![][softmax]\n\nThe individual images are shown below in full size, with supporting discussion.\n\n#### 1. General Caution in Snow\n\n![][snow]\n\nNone of the top 5 probabilities is close to the correct class.\n\nDifficulties for classification:\n1. Snow! This is an image of a General Caution sign in the winter, partially covered in snow.\n2. 
Size ratio - the sign area is much smaller than the complete image area (\u003c\u003c 0.5), whereas the training set had a sign to image size ratio of approx 0.5\n3. Multiple signs and overlaid text\n\n#### 2. No Entry under a high angle\n\n![][noentry]\n\nDifficulties for classification:\n\n1. Sign centre is shifted towards the upper edge of the image\n2. The pose of the sign relative to the camera is not favorable to the algorithm\n3. Size ratio\n\n#### 3. No Entry drawing\n\n![][noentry2]\n\nThis image is a drawing and is, as expected, classified with a probability of 1.0.\n\n#### 4. Roundabout\n\n![][roundabout]\n\nThe roundabout mandatory sign is also classified with a high probability.\n\n#### 5. Small Limit 30\n\n![][limit30]\n\nProblems with this image are:\n\n1. Size ratio (sign area to image area)\n\n## Discussion\n\nThe task at hand was successfully completed by achieving an accuracy (testing and validation)\nhigher than the desired 93%. Further testing with random images from the internet showed the weakness\nof the model, namely that it expects the sign to be centrally located within the\nimage, to be of a certain resolution and to occupy the bulk of the area within the image.\nPrior to data augmentation, real-world accuracy was \u003c 0.1. After augmentation, the results improved significantly.\n\nDuring training and testing it was noted that increasing the dropout rate above\n0.5 up to 0.9 had only beneficial effects. This can be attributed to the intentionally\nwide fully connected layers at the end, where a large dropout will still leave\nenough nodes to achieve a good result.\n\nThe conversion to grayscale also proved very valuable. 
This operation is essentially\na pre-convolution that reduces the dimensionality of the input set, which in turn reduces\ndrastically the requirements for the network size with negligable information loss.\n\nIncreasing network depth was the final step following which the desired accuracy was exceeded.\n\n\n## Visualizing the network state\nN/A\n\n## References\n\n[1] Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun, \"Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification\", CoRR, 2015\n\n\u003c!--  --\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnirabo%2Fcarnd-traffic-sign-classifier","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fnirabo%2Fcarnd-traffic-sign-classifier","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fnirabo%2Fcarnd-traffic-sign-classifier/lists"}