{"id":15159296,"url":"https://github.com/gaelmoccand/roadsegmentation_cnn","last_synced_at":"2026-01-20T11:32:30.255Z","repository":{"id":212842770,"uuid":"283419698","full_name":"gaelmoccand/RoadSegmentation_CNN","owner":"gaelmoccand","description":"Road Segmentation.Image Segmentation using CNN Tensorflow with SegNet","archived":false,"fork":false,"pushed_at":"2020-11-28T17:42:43.000Z","size":25461,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-02-13T18:49:28.719Z","etag":null,"topics":["cnn","cnn-for-visual-recognition","computer-vision","convolutional-neural-network","deep-learning","f","road-segmentation","segmentation","segnet","tensorflow","tensorflow-experiments"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gaelmoccand.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-07-29T06:38:17.000Z","updated_at":"2022-01-10T23:59:34.000Z","dependencies_parsed_at":null,"dependency_job_id":"6fceb06a-331a-499f-abb2-783059f93be2","html_url":"https://github.com/gaelmoccand/RoadSegmentation_CNN","commit_stats":{"total_commits":112,"total_committers":3,"mean_commits":"37.333333333333336","dds":0.3571428571428571,"last_synced_commit":"3d98f84d7f9748b6a19da5bbf2163b849d7f5e28"},"previous_names":["gaelmoccand/roadsegmentation_cnn"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaelmoccand%2FRoadSegmentation
_CNN","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaelmoccand%2FRoadSegmentation_CNN/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaelmoccand%2FRoadSegmentation_CNN/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gaelmoccand%2FRoadSegmentation_CNN/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gaelmoccand","download_url":"https://codeload.github.com/gaelmoccand/RoadSegmentation_CNN/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247687220,"owners_count":20979422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cnn","cnn-for-visual-recognition","computer-vision","convolutional-neural-network","deep-learning","f","road-segmentation","segmentation","segnet","tensorflow","tensorflow-experiments"],"created_at":"2024-09-26T21:02:55.658Z","updated_at":"2026-01-20T11:32:30.249Z","avatar_url":"https://github.com/gaelmoccand.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Road Segmentation \n Road Segmentation. Image Segmentation using CNN TensorFlow with SegNet\n \n **Abstract: In this work we present two methods to segment roads on satellite images. We first show how we can augment an image dataset when the one at disposal is too small to properly train a machine learning algorithm. 
Then we quickly demonstrate what features can be exploited and how to handle them in order to make the best prediction with a linear logistic regression. Finally, we present a method based on a deep fully convolutional neural network architecture for semantic pixel-wise segmentation called SegNet.\n **\n \n ## Introduction\n The goal of this work is to segment roads on satellite images (Figure 1) by using machine learning techniques.\nIn other words, we want to assign a label (road or background) to each pixel of the image. Before selecting\nthe best algorithm, we look at how to augment\na small image dataset and how to extract the most relevant\nfeatures from it. We then present two different classes\nof algorithm. The first one is a linear logistic regression,\nwhereas the second one, called SegNet [1], uses a more\ncomplex scheme based on a convolutional neural\nnetwork (CNN).\n\n![Fig1. Example of satellite image ](projectRoadSegmentation/report/pics/satImage.png)\n\nFig1. Example of satellite image \n\nA set of N = 100 training images of size 400 × 400\npixels is provided. The set contains different aerial\npictures of urban areas. Together with this training set, the\ncorresponding ground truth grayscale images (Figure 2) are\nalso available. Note that the ground truth images need to\nbe converted into label images. Concretely, each pixel y_i\ncan only take one of the two possible values corresponding\nto the classes: road label (y_i = 1) or background label\n(y_i = 0). In order to binarize the ground truth images, a\nthreshold is set 25% below the maximum intensity. This means that every pixel with\nan intensity lower than 75% of the maximum possible value\nis set to 0 and the rest is set to 1. With 8-bit images, the\nmaximum value is 255, which sets the threshold to 191.\nThis pixel threshold has a direct impact on the width of the\nroad label in the computed label image.\n \n ![Fig2. 
Ground truth of satellite image example ](projectRoadSegmentation/report/pics/satImage_gt.png)\n \n Fig2. Ground truth of satellite image example \n \n The problem that arises with such a small training set\n(100 images only) is overfitting. Moreover, in order to train\nany convolutional neural network properly, it is necessary\nto augment the dataset. Analysing the training set shows\nthat it contains mainly pictures with strictly vertical\nand horizontal roads. For that reason, creating new images\nby rotating the original ones both increases the size of\nthe dataset and generates data that helps to better\ntrain the algorithm. Specifically, we rotate each image by\nangles going from 5 to 355 degrees every 5 degrees (i.e. 5,\n10, 15,..., 355). That way we generate a set of images with\nroads in every direction. In summary, for each image of\nthe original training set, 71 rotated images are generated, resulting in a new training set of 7100 images.\nThis augmented training set is then suitable for the training\nof the CNN.\n\n## Methodology\n\nIn order to gain computational efficiency, square patches\ncan be used instead of working with every pixel (see Figure\n3). This makes sense because a road is never composed of\na single pixel but is rather made of blocks of pixels. The\nsmaller the patch, the longer the computation, but the finer\nthe prediction. It is therefore important to find a trade-off.\nWe’ve found that taking patches of size 8 × 8 gives decent\nresults in a reasonable time. Within each patch, the mean and\nthe variance of the 3 channels (RGB) are computed. On top\nof these 6 features, we add the histogram\nof oriented gradients (HOG) in 8 directions. The HOG is a\ndescriptor used in many computer vision tasks for object\ndetection. It also consists of splitting the image into\npatches and gives their gradient orientation quantized by the\nangle and weighted by the magnitude. 
This makes a total\nof 14 features per patch. Since we have 50 × 50 = 2500\npatches, it makes a total of 35000 features per image.\n\n![Fig3. Example of a prediction using 16 × 16 patches ](projectRoadSegmentation/report/pics/prediction_patch.png)\n\nFig3. Example of a prediction using 16 × 16 patches. The predicted\nroads are in red. Each red square corresponds to a patch\n\nThe feature matrix is fairly sparse, as shown in Figure\n4. The histogram shows a large peak of zeros followed by a\ndecay. This decay-like shape suggests transforming the\nfeatures in order to get a distribution closer to a normal\ndistribution. This can be obtained by taking the square root\nof the features and can be observed in Figure 5. These\nfeatures are fed to a simple linear logistic regression using\nscikit-learn.\n\n\n![Fig4. ](projectRoadSegmentation/report/pics/hist_feats.png)\n\nFig4. Histogram of the features computed on one of the satellite images\n\n![Fig5. ](projectRoadSegmentation/report/pics/hist_sqrt_feats.png)\n\nFig5. Histogram of the square root of the features computed on the\nsame image\n\nAs a second step, we use the SegNet architecture, which is\na deep fully convolutional neural network. The SegNet\narchitecture consists of a sequence of non-linear processing layers\n(encoders) and a corresponding set of decoders followed by\na pixelwise classifier. Typically, each encoder consists of one\nor more convolutional layers with batch normalisation and a\nrectified linear unit (ReLU) non-linearity, followed by\nnon-overlapping maxpooling and sub-sampling [2]. Figure\n6 shows the overall SegNet architecture.\nSegmentation problems often use a spatial softmax to\nclassify each pixel. The loss L used in SegNet is essentially\na spatial multinomial cross-entropy that is evaluated at each pixel\nof the output tensor, compared to the label image.\nThe SegNet implementation in TensorFlow is taken from\nthe GitHub reference code of Leonardo Araujo [3]. 
Two\nversions of SegNet are available: “connected” and “gate\nconnected”. Both versions take the output of the previous\nconvolution layer and the output of the corresponding\nencoder layer as input for the decoder.\n\n![Fig6. ](projectRoadSegmentation/report/pics/segnet.png)\n\nFig6. SegNet architecture\n\nTo evaluate the models, the augmented training set is randomly split into 2 parts: 80% is used for training (5680\nimages) and 20% (1420 images) for testing. For SegNet,\nwe use a mini batch of 50 images. The initial learning\nrate is set to 0.001 with a decay every 10000 steps. Note\nalso that the size of the image is reduced to 224 × 224\npixels in order to speed up the training of the neural network.\n\nTo compare the methods, we compute the percentage of\ncorrectly predicted patches (first method) or pixels (second method)\non the test set.\n\n## Results\nThe first method, using the linear logistic regression, was\nput aside fairly quickly because even with the\nfeature transformation (taking the square root), the prediction accuracy\ndid not exceed 0.59 on the test set. Taking into account that\nprediction by flipping a coin would give 0.5, it is not a big\nachievement. This is probably due to the fact that the mean,\nvariance and HOG are not sufficient to differentiate the roads\nfrom the rest of the objects.\nRegarding SegNet, the results are far more encouraging.\nIndeed, the prediction accuracy almost reaches 0.9, with both\nSegNet connected (0.86) and gate connected (0.87).\n\n\n![Fig7. ](projectRoadSegmentation/report/pics/pred.png)\n\nFig7. Complex example of satellite image. There are roads in many\ndirections and trees on the road\n\n\n![Fig8. ](projectRoadSegmentation/report/pics/pred_label.png)\n\nFig8. Prediction of the complex example using SegNet\n\n## Discussion\nThe logistic regression yields rather disappointing results.\nThis can be explained by the reasons given above and by\nthe fact that using patches has a main drawback. 
With this\nmethod we lose the continuity of the image and thus the\ninformation of the neighbor pixels at the boundaries of the\npatches. For instance, since a road is continuous, there is a\nbigger chance that a pixel is a road if its neighbors are roads\nthan if they are part of the background. With more time, it\nwould be interesting to keep the same prediction method but\nuse many more features and possibly get information on the\nneighbor patches. However, finding the most relevant features\nis a tedious job. For this reason we decided to use a deep\nlearning method instead.\nIn the case of SegNet, we were expecting higher scores but\nit actually overfits a bit. This can be observed in Figure 9,\nwhich shows that after 7 iterations, the spatial loss increases\nagain. To avoid this behavior, it would be good to have a\nbroader training set, since the images of the current one do not differ\nmuch. It would also be good to tune SegNet further to get better results.\n\n![Fig9. ](projectRoadSegmentation/report/pics/overfitting.png)\n\nFig9. Spatial loss w.r.t. the epoch number. It overfits after epoch 6.\n\n## Summary\nIn this work we have shown how to augment an image\ntraining set using rotations. Furthermore, we have presented\nthe convolutional neural network SegNet, which yields a\nfairly good prediction for road segmentation on satellite images. However, one must pay careful attention to overfitting.\n\n## References\n\n[1] V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A\nDeep Convolutional Encoder-Decoder Architecture for Image\nSegmentation,” CoRR, vol. abs/1511.00561, 2015. [Online].\nAvailable: http://arxiv.org/abs/1511.00561\n\n[2] V. Badrinarayanan, A. Handa, and R. Cipolla, “SegNet: A\nDeep Convolutional Encoder-Decoder Architecture for Robust\nSemantic Pixel-Wise Labelling,” ArXiv e-prints, May 2015.\n\n[3] L. 
Araujosantos, “Learn Segmentation,” https://github.com/leonardoaraujosantos/LearnSegmentation, 2017.\n\n\n \n [The full report is available as a PDF](projectRoadSegmentation/bazinga-submission.pdf)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaelmoccand%2Froadsegmentation_cnn","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgaelmoccand%2Froadsegmentation_cnn","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgaelmoccand%2Froadsegmentation_cnn/lists"}