# **Behavioral Cloning**

---

The goals / steps of this project are the following:
* Use the simulator to collect data of good driving behavior
* Build a convolutional neural network in Keras that predicts steering angles from images
* Train and validate the model with a training and validation set
* Test that the model successfully drives around track one without leaving the road
* Summarize the results with a written report

[//]: # (Image References)

[image1]: ./examples/steering_hist.png "Distribution of Steering Angles in Training Data"
[image2]: ./examples/train_val_loss.png "Training and Validation loss"
[image3]: ./examples/left.jpg "Left camera"
[image4]: ./examples/middle.jpg "Middle camera"
[image5]: ./examples/right.jpg "Right camera"
[image6]: ./examples/flipped_middle.jpg "Flipped image"

---

### Summary of Training Data

The driving log consists of 8036 rows; each row references 3 images recorded by 3 virtual cameras on the vehicle: center, left, and right. The histogram below shows the distribution of steering angles. It is slightly imbalanced because the vehicle drives counter-clockwise on the track.

![alt text][image1]

The training data consists of three types of images, left, middle, and right, generated from the cameras on the car. Sample images are shown below.

![alt text][image3] ![alt text][image4] ![alt text][image5]

Data augmentation is used to generate more training data: images are randomly flipped horizontally, and the steering angle is negated for each flipped image. This simple technique improved the performance of the model significantly.

![alt text][image4] ![alt text][image6]

### Model Architecture and Training Strategy

#### 1. An appropriate model architecture has been employed

My initial model was LeNet with a continuous output. The initial data was collected by driving 2 laps in the simulator myself using only the keyboard. The result wasn't very good: as some fellow students pointed out, the keyboard's abrupt inputs are too jerky for the model to learn from. Steering should be as smooth as possible, so a game controller is recommended over the keyboard. Since I don't have a game controller at this time, I tried the provided sample data instead, and LeNet performed reasonably well on it.
It was not bad on straight or gently curved stretches, but at sharper curves, and on the bridge where the ground has a different texture, it drove the car onto the curb and got stuck.

My next steps were to augment the data as suggested by the lecture and to adopt a well-tested architecture from the literature: NVIDIA's ["End to End Learning for Self-Driving Cars" paper](https://images.nvidia.com/content/tegra/automotive/images/2016/solutions/pdf/end-to-end-dl-using-px.pdf) (model.py lines 19-60).

The model includes RELU layers to introduce nonlinearity, and the data is normalized inside the model using a Keras lambda layer (code line 20).

#### 2. Attempts to reduce overfitting in the model

The model was trained and validated on different data sets to ensure that it was not overfitting (model.py lines 64-70). The number of training samples per epoch is 20000, and the number of validation samples is 6400. The model was tested by running it through the simulator and verifying that the vehicle could stay on the track.

#### 3. Model parameter tuning

The model used the Adam optimizer, so the learning rate was not tuned manually (model.py line 61). It was trained for 4 epochs because further training didn't reduce the loss by a noticeable amount.

#### 4. Appropriate training data

Initially, the data I collected was via the keyboard. It was quite abrupt and jerky because it was hard to maintain a smooth input with the keyboard. As expected, the result was also jerky: the car frequently adjusted its steering angle even on straight stretches.

This is a regression task using convolutional neural networks, and the usual caveat for training such models applies: "garbage in, garbage out". With the sample data and data augmentation, I was able to improve the model output by a lot. I trained for 4 epochs since further training didn't appear to help. The final validation loss was 0.0102.
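The Lambda normalization layer mentioned above is just a fixed, element-wise rescale of the input pixels. The exact constants used in model.py are not shown in this report, so the common `x / 255.0 - 0.5` scheme below is an assumption:

```python
import numpy as np

def normalize(pixels):
    """Map raw 8-bit pixel values in [0, 255] to zero-centred floats
    in [-0.5, 0.5]. In Keras this function would be wrapped in a
    Lambda layer, e.g. Lambda(normalize, input_shape=(64, 64, 3))."""
    return pixels / 255.0 - 0.5
```

Because the rescale lives inside the network, the same preprocessing is applied automatically at training time and when drive.py feeds frames to the model.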
The following diagram shows the training and validation losses over the epochs.

![alt text][image2]

### Model Architecture and Training Strategy

#### 1. Solution Design Approach

Following the suggestion of the lecture, I tried the LeNet model first and then moved to the NVIDIA model.

To gauge how well the model was working, I implemented an image batch generator to populate the training and validation sets. I found that my first model had a low MSE loss on the training set but a high loss on the validation set, which implied that the model was overfitting. So I added data augmentation to generate more training samples. Since the car drives counter-clockwise on the track, the left and right steering data are imbalanced, so I augmented the image data by flipping images horizontally at random with probability 0.5, negating the corresponding steering angles. I also cropped and resized the images to focus on the road instead of the irrelevant parts.

The final step was to run the simulator and see how well the car drove around track one. Without data augmentation, the car tended to steer to the right because of the imbalanced left and right steering angles in the data. With random flipping, the car drove significantly better.

On the model side, the NVIDIA paper provided a powerful solution to this problem. The paper describes the architecture as follows: the network consists of 9 layers, including a normalization layer, 5 convolutional layers, and 3 fully connected layers. The first layer performs image normalization; doing the normalization in the network allows the normalization scheme to be altered with the network architecture and to be accelerated on the GPU. The convolutional layers were designed to perform feature extraction and were chosen empirically through a series of experiments that varied layer configurations.
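The batch generator and augmentation pipeline described above might look like the sketch below. The crop window, target size, and batch size used in model.py are not given in this report, so the values here are assumptions, and the resize uses a nearest-neighbour index trick to stay NumPy-only (an implementation would more likely use `cv2.resize`):

```python
import numpy as np

def crop_resize(img, top=60, bottom=140, out_hw=(64, 64)):
    """Keep only the road band img[top:bottom] (dropping sky and hood),
    then nearest-neighbour resize to out_hw. Crop rows and target size
    are assumed values, not the ones from model.py."""
    band = img[top:bottom]
    h, w = band.shape[:2]
    rows = np.arange(out_hw[0]) * h // out_hw[0]
    cols = np.arange(out_hw[1]) * w // out_hw[1]
    return band[rows][:, cols]

def batch_generator(images, angles, batch_size=64, rng=None):
    """Yield (X, y) batches forever. Each sample is flipped left-right
    with probability 0.5, and a flipped sample gets a negated angle."""
    rng = rng or np.random.default_rng(0)
    n = len(images)
    while True:
        idx = rng.integers(0, n, size=batch_size)
        X = np.stack([crop_resize(images[i]) for i in idx]).astype(np.float32)
        y = np.asarray(angles)[idx].astype(np.float32)
        flip = rng.random(batch_size) < 0.5
        X[flip] = X[flip, :, ::-1]  # reverse the width axis
        y[flip] = -y[flip]
        yield X, y
```

A generator like this would be passed to Keras's `fit_generator`, so the full augmented data set never has to sit in memory at once.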
The paper's model uses strided convolutions in the first three convolutional layers, with a 2×2 stride and 5×5 kernels, and non-strided convolutions with 3×3 kernels in the last two convolutional layers. The five convolutional layers are followed by three fully connected layers leading to a single output control value, the inverse turning radius; the fully connected layers are designed to function as a controller for steering.

The final model proved easy to train and effective. At the end of the process, the vehicle was able to drive autonomously around the track without leaving the road.

#### 2. Final Model Architecture

The architecture is shown below.

```
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
====================================================================================================
lambda_1 (Lambda)                (None, 64, 64, 3)     0           lambda_input_1[0][0]             
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 32, 32, 24)    1824        lambda_1[0][0]                   
____________________________________________________________________________________________________
activation_1 (Activation)        (None, 32, 32, 24)    0           convolution2d_1[0][0]            
____________________________________________________________________________________________________
maxpooling2d_1 (MaxPooling2D)    (None, 31, 31, 24)    0           activation_1[0][0]               
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 16, 16, 36)    21636       maxpooling2d_1[0][0]             
____________________________________________________________________________________________________
activation_2 (Activation)        (None, 16, 16, 36)    0           convolution2d_2[0][0]            
____________________________________________________________________________________________________
maxpooling2d_2 (MaxPooling2D)    (None, 15, 15, 36)    0           activation_2[0][0]               
____________________________________________________________________________________________________
convolution2d_3 (Convolution2D)  (None, 8, 8, 48)      43248       maxpooling2d_2[0][0]             
____________________________________________________________________________________________________
activation_3 (Activation)        (None, 8, 8, 48)      0           convolution2d_3[0][0]            
____________________________________________________________________________________________________
maxpooling2d_3 (MaxPooling2D)    (None, 7, 7, 48)      0           activation_3[0][0]               
____________________________________________________________________________________________________
convolution2d_4 (Convolution2D)  (None, 7, 7, 64)      27712       maxpooling2d_3[0][0]             
____________________________________________________________________________________________________
activation_4 (Activation)        (None, 7, 7, 64)      0           convolution2d_4[0][0]            
____________________________________________________________________________________________________
maxpooling2d_4 (MaxPooling2D)    (None, 6, 6, 64)      0           activation_4[0][0]               
____________________________________________________________________________________________________
convolution2d_5 (Convolution2D)  (None, 6, 6, 64)      36928       maxpooling2d_4[0][0]             
____________________________________________________________________________________________________
activation_5 (Activation)        (None, 6, 6, 64)      0           convolution2d_5[0][0]            
____________________________________________________________________________________________________
maxpooling2d_5 (MaxPooling2D)    (None, 5, 5, 64)      0           activation_5[0][0]               
____________________________________________________________________________________________________
flatten_1 (Flatten)              (None, 1600)          0           maxpooling2d_5[0][0]             
____________________________________________________________________________________________________
dense_1 (Dense)                  (None, 1164)          1863564     flatten_1[0][0]                  
____________________________________________________________________________________________________
activation_6 (Activation)        (None, 1164)          0           dense_1[0][0]                    
____________________________________________________________________________________________________
dense_2 (Dense)                  (None, 100)           116500      activation_6[0][0]               
____________________________________________________________________________________________________
activation_7 (Activation)        (None, 100)           0           dense_2[0][0]                    
____________________________________________________________________________________________________
dense_3 (Dense)                  (None, 50)            5050        activation_7[0][0]               
____________________________________________________________________________________________________
activation_8 (Activation)        (None, 50)            0           dense_3[0][0]                    
____________________________________________________________________________________________________
dense_4 (Dense)                  (None, 10)            510         activation_8[0][0]               
____________________________________________________________________________________________________
activation_9 (Activation)        (None, 10)            0           dense_4[0][0]                    
____________________________________________________________________________________________________
dense_5 (Dense)                  (None, 1)             11          activation_9[0][0]               
====================================================================================================
Total params: 2,116,983
Trainable params: 2,116,983
Non-trainable params: 0
```

#### 3. End Result

I recorded the final run in autonomous mode as an mp4 file and uploaded it [here](https://www.youtube.com/watch?v=pDdN28Bdm-o&feature=youtu.be).

#### References
- [NVIDIA: End to End Learning for Self-Driving Cars](https://arxiv.org/abs/1604.07316)
- [Must Know Tips/Tricks in Deep Neural Networks](http://lamda.nju.edu.cn/weixs/project/CNNTricks/CNNTricks.html)
- [An overview of gradient descent optimization algorithms](http://ruder.io/optimizing-gradient-descent/index.html)
- [Striving for Simplicity: The All Convolutional Net](https://arxiv.org/abs/1412.6806)
- [Spatial Dropout](https://faroit.github.io/keras-docs/1.1.1/layers/core/#spatialdropout2d)
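As a quick sanity check, the parameter counts in the model summary above can be reproduced from the standard formulas for a 2-D convolution (`kh·kw·c_in·c_out + c_out` biases) and a dense layer (`n_in·n_out + n_out` biases):

```python
def conv_params(kh, kw, cin, cout):
    # kernel weights plus one bias per output channel
    return kh * kw * cin * cout + cout

def dense_params(nin, nout):
    # weight matrix plus one bias per output unit
    return nin * nout + nout

total = (
    conv_params(5, 5, 3, 24)     # convolution2d_1 ->     1,824
    + conv_params(5, 5, 24, 36)  # convolution2d_2 ->    21,636
    + conv_params(5, 5, 36, 48)  # convolution2d_3 ->    43,248
    + conv_params(3, 3, 48, 64)  # convolution2d_4 ->    27,712
    + conv_params(3, 3, 64, 64)  # convolution2d_5 ->    36,928
    + dense_params(1600, 1164)   # dense_1         -> 1,863,564
    + dense_params(1164, 100)    # dense_2         ->   116,500
    + dense_params(100, 50)      # dense_3         ->     5,050
    + dense_params(50, 10)       # dense_4         ->       510
    + dense_params(10, 1)        # dense_5         ->        11
)
print(total)  # 2116983, matching "Total params: 2,116,983"
```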