https://github.com/zafarrehan/tensorflow_transfer_learning
This repository explains how to re-train any pre-trained TensorFlow model, using its existing weights to build your custom model in no time.
- Host: GitHub
- URL: https://github.com/zafarrehan/tensorflow_transfer_learning
- Owner: zafarRehan
- License: other
- Created: 2022-03-13T12:58:40.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-17T05:25:17.000Z (over 3 years ago)
- Last Synced: 2025-02-13T18:49:30.880Z (4 months ago)
- Topics: object-detection, opencv-python, tensorflow-models, tensorflow2
- Language: Jupyter Notebook
- Size: 7 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Tensorflow Transfer Learning
This repository explains how to perform transfer learning on any TensorFlow pre-trained object-detection model.
Any model listed in the Model Zoo can be re-trained using this tutorial.

## Why Transfer Learning?
Training a model to solve real-world object detection problems is no easy task. It needs a lot of computing resources and time to train such models from scratch.
Using transfer learning, we can reuse the existing weights of a pre-trained model and change just the last few layers to fit our own problem domain.
These models are typically trained on supercomputer-class hardware that many low- to medium-scale organizations cannot access or afford. I trained my licence detection model in less than 3 hours on Google Colab and used the output model to detect licence plates in images here: https://github.com/zafarRehan/licence_plate_detection
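To make the idea concrete, here is a minimal Keras classification sketch of the same principle (this repo itself works through the TF Object Detection API instead, as shown below): the pre-trained backbone keeps its weights frozen, and only a small new head is trained.

```python
import tensorflow as tf

# Pre-trained backbone with its original classification head removed;
# the ImageNet weights stay frozen during training.
base = tf.keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights='imagenet')
base.trainable = False

# Only this small new head is trained for our own problem domain.
model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(2, activation='softmax'),  # e.g. 2 custom classes
])
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.summary()
```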
Now let's jump into using the code.
## Running the default code in Colab
The repository contains the notebook license_plate_detection.ipynb, which can be downloaded and executed directly on Google Colab.
Everything is pre-loaded in the notebook, from the dataset to the configuration files.
Just click Runtime -> Run all, then sit back, relax, and watch your custom model being built.
The dataset I used here is from Kaggle: https://www.kaggle.com/andrewmvd/car-plate-detection. It contains 432 annotated images of cars with licence plates.
The code is well-commented, so each step is explained inline.
### Output
## Training your own Model
Our main goal here is to train our own object detection model with excellent performance, in no time.
First and foremost, we need data to train our model on. You can download any annotated dataset from Kaggle or anywhere else on the Internet.
Alternatively, you can create your own dataset for object detection, for which you must have:
1. At least 300 to 400 images containing the object(s)
2. An annotation tool to draw the bounding boxes of the object(s), for example: https://www.youtube.com/watch?v=Tlvy-eM8YO4 (Recommended)

## Changes to be made for Custom Training
As the problem changes, so do various other parameters. To demonstrate, I will take another example and walk you through the changes to be made and the challenges that may be faced while making them.
Dataset Used : https://www.kaggle.com/kbhartiya83/swimming-pool-and-car-detection
This dataset consists of 2 classes, unlike licence_plate_detection, which has only one class (Licence):
1. Car
2. Swimming Pool

To handle this change in the number of classes, the following changes must be made:
### custom.pbtxt
Before:
```
item
{
    id: 1
    name: 'licence'
}
```
After:
```
item
{
    id: 1
    name: 'car'
}
item
{
    id: 2
    name: 'pool'
}
```
Note: The number of item entries should match the number of classes in your dataset, each with the proper name.
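If you want to verify the label map before training, the Object Detection API (which the notebook installs) can parse it; a small sanity-check sketch, assuming custom.pbtxt sits in the working directory:

```python
from object_detection.utils import label_map_util

# Parses custom.pbtxt into {id: {'id': ..., 'name': ...}}.
category_index = label_map_util.create_category_index_from_labelmap(
    'custom.pbtxt', use_display_name=True)
print(category_index)
# Expected: {1: {'id': 1, 'name': 'car'}, 2: {'id': 2, 'name': 'pool'}}
```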
### pipeline.config
At line 3, change
```
num_classes: 1
```
to
```
num_classes: 2
```
Note: The value of num_classes must equal the number of classes / different objects to be detected in your dataset (here: 'Car', 'Pool').
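You can make this edit by hand, or programmatically through the Object Detection API's config protos; a sketch, assuming pipeline.config is in the working directory and the model is an SSD (as in this tutorial):

```python
from google.protobuf import text_format
from object_detection.protos import pipeline_pb2

# Read the existing config.
pipeline = pipeline_pb2.TrainEvalPipelineConfig()
with open('pipeline.config', 'r') as f:
    text_format.Merge(f.read(), pipeline)

# Change the class count (was 1 for the licence model).
pipeline.model.ssd.num_classes = 2

# Write the edited config back.
with open('pipeline.config', 'w') as f:
    f.write(text_format.MessageToString(pipeline))
```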
### Annotation Changes
Annotation files can differ between datasets, and in annotations you create yourself.
Let's compare the annotation files for the 2 datasets:

1. Licence Plate Annotation File
```xml
<annotation>
    <folder>images</folder>
    <filename>Cars2.png</filename>
    <size>
        <width>400</width>
        <height>400</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>licence</name>
        <pose>Unspecified</pose>
        <truncated>0</truncated>
        <occluded>0</occluded>
        <difficult>0</difficult>
        <bndbox>
            <xmin>229</xmin>
            <ymin>176</ymin>
            <xmax>270</xmax>
            <ymax>193</ymax>
        </bndbox>
    </object>
</annotation>
```
2. Satellite Car Pool Annotation File
```xml
<annotation>
    <filename>000000001.jpg</filename>
    <source>ArcGIS Pro 2.1</source>
    <size>
        <width>224</width>
        <height>224</height>
        <depth>3</depth>
    </size>
    <object>
        <name>1</name>
        <bndbox>
            <xmin>58.47</xmin>
            <ymin>40.31</ymin>
            <xmax>69.58</xmax>
            <ymax>51.43</ymax>
        </bndbox>
    </object>
    <object>
        <name>1</name>
        <bndbox>
            <xmin>10.32</xmin>
            <ymin>93.68</ymin>
            <xmax>21.43</xmax>
            <ymax>104.80</ymax>
        </bndbox>
    </object>
</annotation>
```
Quite a difference, right?
To handle these differences, the following code was changed:
#### 1. In the notebook, this code block was changed
From:
```python
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(member[5][0].text),
                     int(member[5][1].text),
                     int(member[5][2].text),
                     int(member[5][3].text)
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df
```
To:
```python
import os
import glob
import pandas as pd
import xml.etree.ElementTree as ET

def xml_to_csv(path):
    xml_list = []
    for xml_file in glob.glob(path + '/*.xml'):
        tree = ET.parse(xml_file)
        root = tree.getroot()
        for member in root.findall('object'):
            value = (root.find('filename').text,
                     int(root.find('size')[0].text),
                     int(root.find('size')[1].text),
                     member[0].text,
                     int(float(member[1][0].text)),
                     int(float(member[1][1].text)),
                     int(float(member[1][2].text)),
                     int(float(member[1][3].text))
                     )
            xml_list.append(value)
    column_name = ['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']
    xml_df = pd.DataFrame(xml_list, columns=column_name)
    return xml_df
```
Basically, `int(member[5][0].text)` is changed to `int(float(member[1][0].text))`, and likewise for the other three coordinates.
The reason is:
- In the licence_detection annotation file, the `<bndbox>` element is the sixth child (index 5) of the `<object>` element, whereas in the other annotation file it is the second child (index 1).
- In the licence_detection annotation file, the contents of `<bndbox>` are of int type, whereas in the other one they are floats.
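As a quick check that the converter handles the new annotations, run it on the annotation folder and inspect the DataFrame (the folder name here is hypothetical):

```python
# 'annotations' is a placeholder for wherever your .xml files live.
train = xml_to_csv('annotations')
print(train.head())
train.to_csv('train_labels.csv', index=False)  # CSV later consumed by create_tfrecords.py
```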
#### 2. In create_tfrecords.py
- Added a dict at line 32:
```python
index_to_label = {1: 'car', 2: 'pool'}
```
because, unlike in the licence_detection annotation file, the class name is not stored as text here, so the int class id has to be mapped to its text name.
- Changed line 66 from
```python
classes_text.append(row['class'].encode('utf8'))
```
to
```python
classes_text.append(index_to_label[row['class']].encode('utf8'))
```
for the same reason (a self-contained sketch of the combined change follows).
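Here is a self-contained sketch of the combined change; the helper name `encode_classes` is made up for illustration and does not appear in the actual script:

```python
# Added near line 32 of create_tfrecords.py: maps this dataset's
# integer class ids to human-readable label names.
index_to_label = {1: 'car', 2: 'pool'}

def encode_classes(rows):
    """Build the label-name and label-id lists for one image's rows."""
    classes_text, classes = [], []
    for row in rows:
        # Near line 66: the class is an int id here, so look up its text
        # name (the licence dataset stored the name directly as text).
        classes_text.append(index_to_label[row['class']].encode('utf8'))
        classes.append(row['class'])
    return classes_text, classes

print(encode_classes([{'class': 1}, {'class': 2}]))
# ([b'car', b'pool'], [1, 2])
```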
### Directory Changes
Some source directories need to be changed depending on the folder structure of your training images; these are easy to figure out from the errors Colab will show.
### Running the Code
After making all these changes, we are good to go and can run the code without issues.
I made the exact same changes and prepared another Colab notebook for the above dataset here: satellite_car_pool.ipynb
### Result of training
The detection is not that good, but remember that this is the result of just an hour of training. Also, you can see that cars are detected pretty well while pools aren't. The reason is that the number of images with pools is far lower than the number with cars:
```python
train['class'].value_counts()
```
Output:
```
1    11069
2     2677
Name: class, dtype: int64
```
Here 1 = Car, 2 = Pool
As can be seen above, there are 11069 marked cars in the training dataset but only 2677 pools; this is called an imbalanced dataset. It is not a severe case of imbalance, and there are standard ways to mitigate it.
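One simple way to soften the imbalance is to duplicate the images that contain pools (together with their annotation rows) before generating TFRecords; a sketch, assuming `train` is the DataFrame produced by xml_to_csv:

```python
import pandas as pd

# Filenames that contain at least one pool (class 2) annotation.
pool_files = train.loc[train['class'] == 2, 'filename'].unique()

# Duplicate every annotation row of those images under a new filename.
# The image files themselves must also be copied under the same new
# names (e.g. with shutil.copy) before the TFRecords are generated.
dup = train[train['filename'].isin(pool_files)].copy()
dup['filename'] = dup['filename'].str.replace('.jpg', '_dup.jpg', regex=False)
train_balanced = pd.concat([train, dup], ignore_index=True)
print(train_balanced['class'].value_counts())
```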
## Additional Changes / Tuning
Remember pipeline.config? This is the file that decides a model's configuration. Every model downloaded from the Model Zoo comes with this file, and it can be edited to re-train the model as required.
- num_classes:
  The number of classes to detect. It appears near the top of the config file.
- batch_size:
  Customarily a power of 2, as is common in machine learning. The larger the value, the more RAM training consumes, and depending on the environment the process may die and fail to learn.
- num_steps:
  The number of training steps. The higher the value, the better the model trains, and the longer training takes.
- total_steps and warmup_steps:
  These do not appear in every model's config; total_steps must be greater than or equal to warmup_steps (if this condition is not met, an error occurs and training will not start).
```
model {
  ssd {
    num_classes: 1
    ...
  }
}
train_config: {
  batch_size: 32
  num_steps: 5000
  optimizer {
    momentum_optimizer: {
      learning_rate: {
        cosine_decay_learning_rate {
          total_steps: 5000
          warmup_steps: 1000
        }
      }
    }
  }
  ...
}
```
If you want in-depth knowledge of each configuration option in pipeline.config, the proto definitions in the TensorFlow Object Detection API repository (object_detection/protos) document every field.
## Choosing your model
Choosing the model to perform transfer learning on is key to getting the best results.
Our images averaged about 400px x 400px in the licence dataset, whereas they were 224px x 224px in the satellite_car_pool dataset. The base model I chose was trained on images resized to 320px x 320px, so it was a good fit for both.
Now suppose you want to train on a dataset of high-res images, say 1920px x 1080px. Training them on a model trained with 320x320 inputs won't give excellent results.
In the Model Zoo, every model has its input size written next to its name. Choose the one nearest to your dataset's average image size.
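A quick way to measure your dataset's average image size (a sketch using Pillow, which Colab ships with; point `image_dir` at your training images):

```python
import glob
from PIL import Image

image_dir = 'images'  # hypothetical path to your training images
widths, heights = [], []
for path in glob.glob(image_dir + '/*.jpg'):
    with Image.open(path) as im:
        widths.append(im.width)
        heights.append(im.height)

print('average size: %.0f x %.0f px' % (sum(widths) / len(widths),
                                        sum(heights) / len(heights)))
```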
That's all folks, go ahead and train your first model!