https://github.com/sarvesh30112002/bilinear-pooling-with-hierarchical-structure-for-precise-visual-recognition
The project aims to enhance fine-grained visual recognition accuracy. Integrate Hierarchical Bilinear Pooling (HBP) with ResNet architecture. Leverage ResNet's capabilities and pre-trained parameters. Provide accessibility to the codebase for broader research and adaptation.
bilinear-pooling fine-grained-visual-categorization hbp hierarchical-bilinear-pooling pytorch resnet-50 spatial-relation-recognition two-step-training-strategy
- Host: GitHub
- URL: https://github.com/sarvesh30112002/bilinear-pooling-with-hierarchical-structure-for-precise-visual-recognition
- Owner: Sarvesh30112002
- License: mit
- Created: 2024-06-05T20:43:42.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2024-06-05T21:11:26.000Z (about 1 year ago)
- Last Synced: 2025-01-20T00:53:42.826Z (5 months ago)
- Topics: bilinear-pooling, fine-grained-visual-categorization, hbp, hierarchical-bilinear-pooling, pytorch, resnet-50, spatial-relation-recognition, two-step-training-strategy
- Language: Python
- Homepage:
- Size: 1.62 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Bilinear-Pooling-with-Hierarchical-Structure-for-Precise-Visual-Recognition
Accurate visual representation remains an active area of computer vision, and deep learning architectures such as ResNet50, together with pooling algorithms, have been central to its progress. This study introduces Bilinear Pooling with Hierarchical Structure to improve visual recognition accuracy. Built on ResNet50, the method applies hierarchical pooling with the ReLU activation function. The combination improves both the feature representation and the accuracy with which complex visual details are captured. Organizing the pooling layers hierarchically extracts multi-scale contextual information efficiently. Training results show that the proposed framework yields notable performance improvements, which highlights the potential benefits of the method.
### Screenshots
![]()
![]()
### Introduction
The project aims to enhance fine-grained visual recognition accuracy by integrating Hierarchical Bilinear Pooling (HBP) with the ResNet architecture.
It leverages ResNet's capabilities and pre-trained parameters.
The codebase is made available for broader research and adaptation.
### Objectives
• Improving the efficiency of the model using Two-Step Training Strategy
• Increasing accuracy through Spatial Relationship Learning
• Improved Fine-Grained Recognition using hierarchical bilinear pooling
• Hierarchical pooling to capture complex relationships
• Enhancing the feature representation of the model
• Efficiently extracting multi-scale contextual information
• Improving the performance of the model through training
### Architecture diagram
![]()
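As a rough illustration of the hierarchical bilinear pooling idea described in this README, the sketch below combines three backbone feature maps pairwise. It is not the repository's code: the class name `HBPHead` and the parameters `in_channels`, `proj_dim`, and `num_classes` are hypothetical, the choice of three feature maps is an assumption, and it targets a current PyTorch release rather than the pinned 0.4.1. Each feature map is projected with a 1x1 convolution, pairs are multiplied element-wise, summed over spatial positions, then passed through a signed square root and L2 normalization before the classifier.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class HBPHead(nn.Module):
    """Illustrative hierarchical bilinear pooling head (hypothetical, not the repo's code)."""

    def __init__(self, in_channels=2048, proj_dim=512, num_classes=200):
        super(HBPHead, self).__init__()
        # One 1x1 projection per feature map taken from the backbone.
        self.proj = nn.ModuleList(
            [nn.Conv2d(in_channels, proj_dim, kernel_size=1) for _ in range(3)]
        )
        # Three pairwise interactions are concatenated before classification.
        self.fc = nn.Linear(3 * proj_dim, num_classes)

    @staticmethod
    def _interact(a, b):
        # Element-wise product, then sum pooling over all spatial positions.
        z = (a * b).view(a.size(0), a.size(1), -1).sum(dim=2)
        # Signed square root followed by L2 normalization.
        z = torch.sign(z) * torch.sqrt(torch.abs(z) + 1e-10)
        return F.normalize(z, dim=1)

    def forward(self, x1, x2, x3):
        p1, p2, p3 = [proj(x) for proj, x in zip(self.proj, (x1, x2, x3))]
        z = torch.cat(
            [self._interact(p1, p2), self._interact(p1, p3), self._interact(p2, p3)],
            dim=1,
        )
        return self.fc(z)


# Small random inputs stand in for feature maps from late ResNet blocks.
head = HBPHead(in_channels=64, proj_dim=32, num_classes=200)
feature_maps = [torch.randn(2, 64, 7, 7) for _ in range(3)]
logits = head(*feature_maps)
print(logits.shape)  # torch.Size([2, 200])
```

The three feature maps must share the same spatial size for the element-wise product to be defined, which is why features from blocks of the same ResNet stage are a natural choice.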
## Requirements
- python 2.7
- pytorch 0.4.1

## Train
Step 1.
- Download the ResNet pre-trained parameters.
- Download the CUB-200-2011 dataset.
[CUB-download](http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz)

Step 2.
- Set the path to the dataset and ResNet parameters in the code.

Step 3. Train the fc layer only.
- python train_firststep.py

Step 4. Fine-tune all layers. This reaches an accuracy of around 86% on CUB-200-2011 when using ResNet-50.
- python train_finetune.py
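
The two-step strategy in Steps 3 and 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the contents of `train_firststep.py` or `train_finetune.py`: `TinyNet` is a hypothetical stand-in for the pre-trained ResNet-50 plus classifier, the helper names are invented for the example, the learning rates are placeholders, and the code targets a current PyTorch release rather than 0.4.1.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyNet(nn.Module):
    """Hypothetical stand-in for a pre-trained backbone followed by an fc layer."""

    def __init__(self, num_classes=200):
        super(TinyNet, self).__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1), nn.ReLU()
        )
        self.fc = nn.Linear(8, num_classes)

    def forward(self, x):
        feats = self.backbone(x).mean(dim=(2, 3))  # global average pooling
        return self.fc(feats)


def set_trainable(module, flag):
    # Enable or disable gradient updates for every parameter in the module.
    for p in module.parameters():
        p.requires_grad = flag


def trainable_params(model):
    return [p for p in model.parameters() if p.requires_grad]


def train_step(model, optimizer, images, labels):
    # One optimization step with cross-entropy loss.
    optimizer.zero_grad()
    loss = F.cross_entropy(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()


model = TinyNet(num_classes=200)
images = torch.randn(4, 3, 32, 32)        # random batch in place of CUB images
labels = torch.randint(0, 200, (4,))

# First step: freeze the backbone and optimize only the fc layer.
set_trainable(model.backbone, False)
fc_optimizer = torch.optim.SGD(trainable_params(model), lr=0.1, momentum=0.9)
train_step(model, fc_optimizer, images, labels)

# Second step: unfreeze everything and fine-tune with a smaller learning rate.
set_trainable(model, True)
full_optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
train_step(model, full_optimizer, images, labels)
```

Training the classifier first gives the randomly initialized fc layer a reasonable starting point before gradients are allowed to change the pre-trained weights.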