{"id":15159320,"url":"https://github.com/zafarrehan/tensorflow_transfer_learning","last_synced_at":"2026-01-19T08:31:55.765Z","repository":{"id":231157076,"uuid":"469372034","full_name":"zafarRehan/tensorflow_transfer_learning","owner":"zafarRehan","description":"This repository explains how to train any pre-trained tensorflow model and use the existing weights to build your custom model in no time.","archived":false,"fork":false,"pushed_at":"2022-03-17T05:25:17.000Z","size":7340,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T18:49:30.880Z","etag":null,"topics":["object-detection","opencv-python","tensorflow-models","tensorflow2"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zafarRehan.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"license_plate_detection.ipynb","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2022-03-13T12:58:40.000Z","updated_at":"2022-03-13T18:43:09.000Z","dependencies_parsed_at":null,"dependency_job_id":"a60073e0-3ea0-4aa5-bc84-2d9926d31ff5","html_url":"https://github.com/zafarRehan/tensorflow_transfer_learning","commit_stats":{"total_commits":74,"total_committers":1,"mean_commits":74.0,"dds":0.0,"last_synced_commit":"783a44483587fc0e015db58ce6ba36db80538af3"},"previous_names":["zafarrehan/tensorflow_transfer_learning"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zafarRehan%2Ftensorflow_transfer_learning","tags_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zafarRehan%2Ftensorflow_transfer_learning/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zafarRehan%2Ftensorflow_transfer_learning/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zafarRehan%2Ftensorflow_transfer_learning/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zafarRehan","download_url":"https://codeload.github.com/zafarRehan/tensorflow_transfer_learning/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247687220,"owners_count":20979422,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["object-detection","opencv-python","tensorflow-models","tensorflow2"],"created_at":"2024-09-26T21:04:35.451Z","updated_at":"2026-01-19T08:31:55.760Z","avatar_url":"https://github.com/zafarRehan.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tensorflow Transfer Learning\nThis repository explains how to perform transfer learning on any tensorflow pre-trained object-detection model.\u003c/br\u003e\nAny model listed in \u003ca href=\"https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md\"\u003eModel Zoo\u003c/a\u003e can be re-trained using this tutorial.\n\n## Why Transfer Learning?\nTraining a model to solve real world object detection problems is no easy task. It needs a lot of computing resources and time to train such models from scratch. 
\u003c/br\u003e\nUsing transfer learning, we can reuse the existing weights of a pre-trained model and change just the last few layers to fit our own problem domain. \u003c/br\u003e\nThese \u003ca href=\"https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md\"\u003emodels\u003c/a\u003e were probably trained on supercomputers, which many small and medium-sized organizations cannot access or afford. \u003c/br\u003e\n\nI trained my licence detection model in less than 3 hours on Google Colab and used the output model to detect licence plates on images here: https://github.com/zafarRehan/licence_plate_detection\n\nNow let's jump into using the code.\n\n## Running the default code in Colab\nThe repository contains the Notebook \u003ca href=\"/license_plate_detection.ipynb\"\u003elicense_plate_detection.ipynb\u003c/a\u003e which can be downloaded and executed directly on Google Colab.\nEverything is pre-loaded in the Notebook, from the dataset to the configuration files. \u003c/br\u003e\nJust click on \u003cb\u003eRuntime -\u003e Run all\u003c/b\u003e, then sit back and watch your custom model being built.\u003c/br\u003e\n\nThe dataset I used here is from Kaggle https://www.kaggle.com/andrewmvd/car-plate-detection which contains 432 annotated images of cars with licence plates.\u003c/br\u003e\n\nThe code is well-commented, so each step is explained where it happens.\n\n\u003c/br\u003e\n\u003ch3\u003eOutput\u003c/h3\u003e\n\u003cimg src=\"/images/out1.png\" width=600/\u003e\n\n\u003c/br\u003e\n\u003c/br\u003e\n\n## Training your own Model\nOur main goal here is to train our own Object Detection model that performs well, in as little time as possible.\u003c/br\u003e\n\nFirst and foremost we need data to train our model on. 
You can download any annotated dataset from Kaggle, from \u003ca href=\"https://towardsai.net/p/computer-vision/50-object-detection-datasets-from-different-industry-domains\"\u003ethis list\u003c/a\u003e, or from anywhere else on the Internet.\u003c/br\u003e\n\nYou can also create your own dataset for object detection, for which you need: \u003c/br\u003e\n1. At least 300 to 400 images containing the object(s)\n2. An annotation tool to draw the bounding boxes of the object(s), for example: https://www.youtube.com/watch?v=Tlvy-eM8YO4 (Recommended)\u003c/br\u003e\u003c/br\u003e\n\n## Changes to be made for Custom Training\nAs the problem changes, so do various other parameters.\u003c/br\u003e\n\nTo demonstrate, I will take another example and walk you through the changes to be made and the challenges you may face while making them.\n\n\u003cb\u003eDataset Used : \u003c/b\u003e https://www.kaggle.com/kbhartiya83/swimming-pool-and-car-detection\n\nThis dataset consists of 2 classes: \u003c/br\u003e \n1. Car \u003cbr\u003e\n2. Swimming Pool\n\nunlike the licence_plate_detection dataset, which has only one class, \u003cb\u003e Licence \u003c/b\u003e \u003c/br\u003e\n\nTo handle this change in the number of classes, the following changes must be made: \u003c/br\u003e\n### custom.pbtxt\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=400\u003e\nBefore:\n\n    item\n    {\n        id :1\n        name :'licence'\n    }\n    \n    \n    \n    \n\u003c/td\u003e\n\u003ctd width=400\u003e\nAfter:\n\n    item\n    {\n        id :1\n        name :'car'\n    }\n    item\n    {\n        id: 2\n        name: 'pool'\n    }\n    \n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\nNote: The number of item entries must match the number of classes in your dataset, each with the correct name. 
\u003c/br\u003e\n\u003ch3\u003epipeline.config \u003c/h3\u003e\nAt line 3:\u003c/br\u003e\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=400\u003e\nchange\n\n    num_classes: 1\n\u003c/td\u003e\n\u003ctd width=400\u003e\nto\n\n    num_classes: 2\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n    \nNote: The value of num_classes must equal the number of classes (distinct objects) to be detected in your dataset (here: 'Car' and 'Pool').\u003c/br\u003e\n\n\u003c/br\u003e\n\n### Annotation Changes\nThe annotation format can differ between datasets, including ones you annotate yourself.\nLet's compare the annotation files of the 2 datasets: \u003c/br\u003e\n\n\u003ctable\u003e\n\u003ctr\u003e\n\u003ctd width=400\u003e\n\u003cb\u003e1. Licence Plate Annotation File\u003c/b\u003e\n\n    \u003cannotation\u003e\n        \u003cfolder\u003eimages\u003c/folder\u003e\n        \u003cfilename\u003eCars2.png\u003c/filename\u003e\n        \u003csize\u003e\n            \u003cwidth\u003e400\u003c/width\u003e\n            \u003cheight\u003e400\u003c/height\u003e\n            \u003cdepth\u003e3\u003c/depth\u003e\n        \u003c/size\u003e\n        \u003csegmented\u003e0\u003c/segmented\u003e\n        \u003cobject\u003e\n            \u003cname\u003elicence\u003c/name\u003e\n            \u003cpose\u003eUnspecified\u003c/pose\u003e\n            \u003ctruncated\u003e0\u003c/truncated\u003e\n            \u003coccluded\u003e0\u003c/occluded\u003e\n            \u003cdifficult\u003e0\u003c/difficult\u003e\n            \u003cbndbox\u003e\n                \u003cxmin\u003e229\u003c/xmin\u003e\n                \u003cymin\u003e176\u003c/ymin\u003e\n                \u003cxmax\u003e270\u003c/xmax\u003e\n                \u003cymax\u003e193\u003c/ymax\u003e\n            \u003c/bndbox\u003e\n        \u003c/object\u003e\n    \u003c/annotation\u003e      \n\u003c/td\u003e   \n\u003ctd width=400\u003e\n\u003cb\u003e2. 
Satellite Car Pool Annotation File\u003c/b\u003e\u003c/br\u003e\n    \n    \u003c?xml version=\"1.0\"?\u003e\n    \u003cannotation\u003e\n        \u003cfilename\u003e000000001.jpg\u003c/filename\u003e\n        \u003csource\u003e\n            \u003cannotation\u003eArcGIS Pro 2.1\u003c/annotation\u003e\n        \u003c/source\u003e\n        \u003csize\u003e\n            \u003cwidth\u003e224\u003c/width\u003e\n            \u003cheight\u003e224\u003c/height\u003e\n            \u003cdepth\u003e3\u003c/depth\u003e\n        \u003c/size\u003e\n        \u003cobject\u003e\n            \u003cname\u003e1\u003c/name\u003e\n            \u003cbndbox\u003e\n                \u003cxmin\u003e58.47\u003c/xmin\u003e\n                \u003cymin\u003e40.31\u003c/ymin\u003e\n                \u003cxmax\u003e69.58\u003c/xmax\u003e\n                \u003cymax\u003e51.43\u003c/ymax\u003e\n            \u003c/bndbox\u003e\n        \u003c/object\u003e\n        \u003cobject\u003e\n            \u003cname\u003e1\u003c/name\u003e\n            \u003cbndbox\u003e\n                \u003cxmin\u003e10.32\u003c/xmin\u003e\n                \u003cymin\u003e93.68\u003c/ymin\u003e\n                \u003cxmax\u003e21.43\u003c/xmax\u003e\n                \u003cymax\u003e104.80\u003c/ymax\u003e\n            \u003c/bndbox\u003e\n        \u003c/object\u003e\n    \u003c/annotation\u003e\n\u003c/td\u003e\n\u003c/tr\u003e\n\u003c/table\u003e\n\nThere is a difference, right? \u003c/br\u003e\nTo handle these differences, the following code was changed: \u003c/br\u003e\n#### 1. 
In the notebook this code block was changed\n\n\u003cb\u003eFrom\u003c/b\u003e\n```python\n    import os\n    import glob\n    import pandas as pd\n    import xml.etree.ElementTree as ET\n\n    def xml_to_csv(path):\n        xml_list = []\n        for xml_file in glob.glob(path + '/*.xml'):\n            tree = ET.parse(xml_file)\n            root = tree.getroot()\n            for member in root.findall('object'):\n                value = (root.find('filename').text,\n                         int(root.find('size')[0].text),\n                         int(root.find('size')[1].text),\n                         member[0].text,\n                         int(member[5][0].text),\n                         int(member[5][1].text),\n                         int(member[5][2].text),\n                         int(member[5][3].text)\n                         )\n                xml_list.append(value)\n        column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']\n        xml_df = pd.DataFrame(xml_list, columns=column_name)\n        return xml_df  \n ```\n\n\u003cb\u003eTo\u003c/b\u003e\n```python\n    import os\n    import glob\n    import pandas as pd\n    import xml.etree.ElementTree as ET\n\n    def xml_to_csv(path):\n        xml_list = []\n        for xml_file in glob.glob(path + '/*.xml'):\n            tree = ET.parse(xml_file)\n            root = tree.getroot()\n            for member in root.findall('object'):\n                value = (root.find('filename').text,\n                         int(root.find('size')[0].text),\n                         int(root.find('size')[1].text),\n                         member[0].text,\n                         int(float(member[1][0].text)),\n                         int(float(member[1][1].text)),\n                         int(float(member[1][2].text)),\n                         int(float(member[1][3].text))\n                         )\n                xml_list.append(value)\n        column_name = ['filename', 
'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']\n        xml_df = pd.DataFrame(xml_list, columns=column_name)\n        return xml_df\n```\n\nBasically ```   int(member[5][0].text)  ``` is changed to ```   int(float(member[1][0].text))   ``` \u003c/br\u003e\nThe reason is: \u003c/br\u003e\u003c/br\u003e\n\n\u003cul\u003e\n\u003cli\u003eIn the licence_detection annotation file, the [bndbox] element is at the \u003cb\u003e\u003ci\u003esixth\u003c/i\u003e\u003c/b\u003e position inside the [object] element (index 5), whereas in the other annotation file it is at the \u003cb\u003e\u003ci\u003esecond\u003c/i\u003e\u003c/b\u003e position (index 1).\u003c/li\u003e\n    \n\u003cli\u003eIn the licence_detection annotation file, the contents of the [bndbox] element are of \u003cb\u003e\u003ci\u003eint\u003c/i\u003e\u003c/b\u003e type, whereas in the other one they are of \u003cb\u003e\u003ci\u003efloat\u003c/i\u003e\u003c/b\u003e type, so each value is parsed with float() before being converted to int.\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/br\u003e\u003c/br\u003e\n    \n#### 2. In create_tfrecords.py \n\n\u003cul\u003e\u003cli\u003eAdded a dict at line 32: \u003c/br\u003e\n    \n    index_to_label = {1: 'car', 2:'pool'}\nbecause, unlike the licence_detect annotation file, this one stores the class as a number rather than a name, so we need to map it from int to text.\u003c/li\u003e\n\n\u003cli\u003eChanged line 66\u003c/br\u003e \nfrom \n    \n    classes_text.append(row['class'].encode('utf8'))\nto\n\n    classes_text.append(index_to_label[row['class']].encode('utf8'))\n\nfor the same reason.\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/br\u003e\n\u003c/br\u003e\n\n### Directory Changes\nSome source directories may need to be changed depending on the folder structure of your training images. The errors shown in Colab usually make it clear which paths to fix.\n\u003c/br\u003e\u003c/br\u003e\n\n### Running the Code\nAfter making all these changes, we are good to go and can run the code. 
\u003c/br\u003e\n\nI made the exact same changes and prepared another Colab notebook for the above dataset here: \u003ca href=\"/satellite_car_pool.ipynb\"\u003esatellite_car_pool.ipynb\u003c/a\u003e\n\n### Result of training\n\u003cp\u003e\u003cimg src=\"/images/out2.png\" width=300/\u003e        \u003cimg src=\"/images/out3.png\" width=300/\u003e\u003c/p\u003e\n\n\nThe detection is not that good, but remember that this is the result of just an hour of training. You can also see that cars are detected pretty well but pools aren't. The reason is that the training data contains far fewer Pools than Cars:\n\n    train['class'].value_counts()\n    \n    Output:\n    1    11069\n    2     2677\n    Name: class, dtype: int64\nHere 1 = Car, 2 = Pool\nAs can be seen above, there are 11069 marked Cars in the training dataset but only 2677 Pools. This is called an \u003ca href=\"https://www.analyticsvidhya.com/blog/2021/06/5-techniques-to-handle-imbalanced-data-for-a-classification-problem/\"\u003eImbalanced Dataset\u003c/a\u003e. Though it is not a severe case of imbalance, here is an article on \u003ca href=\"https://towardsdatascience.com/having-an-imbalanced-dataset-here-is-how-you-can-solve-it-1640568947eb\"\u003ehow you can fix it.\u003c/a\u003e \n\u003c/br\u003e\u003c/br\u003e\n\n## Additional Changes / Tuning\nRemember \u003ca href=\"/pipeline.config\"\u003epipeline.config\u003c/a\u003e? This is the file that defines a model's configuration. Every model downloaded from \u003ca href=\"https://github.com/tensorflow/models/blob/master/research/object_detection/g3doc/tf2_detection_zoo.md\"\u003eModel Zoo\u003c/a\u003e comes with this file, which can be edited to re-train the model as required.\n\n\u003cul\u003e\n    \n    model {\n    ssd {\n        num_classes: 1\n\u003cli\u003e\u003ch4\u003enum_classes :\u003c/h4\u003e The number of classes to detect. 
It appears near the top of the config file.\u003c/li\u003e\n \n    train_config: {\n        batch_size: 32\n        num_steps: 5000\n        optimizer {\n            momentum_optimizer: {\n                learning_rate: {\n                    cosine_decay_learning_rate {\n                        total_steps: 5000\n                        warmup_steps: 1000\n    }\n\u003cli\u003e\u003ch4\u003ebatch_size :\u003c/h4\u003e  This value is usually a power of 2, as is customary in machine learning. The larger the value, the heavier the load during training and the more RAM it consumes. Depending on the environment, the process may be killed before it trains at all. \u003c/li\u003e\n    \n\u003cli\u003e\u003ch4\u003enum_steps :\u003c/h4\u003e The number of training steps. A higher value generally trains the model better but also takes longer.\u003c/li\u003e\n    \n\u003cli\u003e\u003ch4\u003etotal_steps and warmup_steps :\u003c/h4\u003e I am still investigating these, because they did not appear in the configs of other models. total_steps must be greater than or equal to warmup_steps. (If this condition is not met, an error will occur and training will not start.)\u003c/li\u003e\n    \n\u003c/br\u003e\u003c/br\u003eFor in-depth coverage of each setting in pipeline.config, see \u003ca href=\"https://neptune.ai/blog/tensorflow-object-detection-api-best-practices-to-training-evaluation-deployment\"\u003e this guide \u003c/a\u003e\n    \n\u003c/br\u003e\u003c/br\u003e\n## Choosing your model\nThe choice of base model for transfer learning is key to getting the best results. \u003c/br\u003e\nOur images averaged 400px X 400px in the licence dataset, whereas they were 224px X 224px in the satellite_car_pool dataset. 
The base model I chose here was trained on images resized to 320px X 320px, so it was a good fit for both.\u003c/br\u003e\n\nNow suppose you want to train on a dataset of high-resolution images, say 1920px X 1080px. Training them on a model trained with 320X320 inputs won't give excellent results.\n\u003c/br\u003eIn the Model Zoo, every model has its input size written in its name. Choose the one nearest to your dataset's average image size.\n\n\n\u003ch3 align=center\u003eThat's all folks, go ahead and train your first Model!!\u003c/h3\u003e\n\n    \n    \n    \n    \n    \n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzafarrehan%2Ftensorflow_transfer_learning","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzafarrehan%2Ftensorflow_transfer_learning","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzafarrehan%2Ftensorflow_transfer_learning/lists"}