{"id":26879817,"url":"https://github.com/ituvtu/deep_learning-classification-swin-v2-b","last_synced_at":"2025-03-31T13:33:22.917Z","repository":{"id":283519517,"uuid":"952035721","full_name":"ituvtu/Deep_Learning-Classification-Swin-v2-B","owner":"ituvtu","description":null,"archived":false,"fork":false,"pushed_at":"2025-03-20T16:41:02.000Z","size":20,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-20T17:33:47.619Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/ituvtu.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-03-20T16:21:22.000Z","updated_at":"2025-03-20T16:41:06.000Z","dependencies_parsed_at":"2025-03-20T17:43:57.609Z","dependency_job_id":null,"html_url":"https://github.com/ituvtu/Deep_Learning-Classification-Swin-v2-B","commit_stats":null,"previous_names":["ituvtu/deep_learning-classification-swin-v2-b"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ituvtu%2FDeep_Learning-Classification-Swin-v2-B","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ituvtu%2FDeep_Learning-Classification-Swin-v2-B/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ituvtu%2FDeep_Learning-Classification-Swin-v2-B/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/ituvtu%2FDeep_Learning-Classificati
on-Swin-v2-B/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/ituvtu","download_url":"https://codeload.github.com/ituvtu/Deep_Learning-Classification-Swin-v2-B/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246474306,"owners_count":20783455,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-03-31T13:33:22.128Z","updated_at":"2025-03-31T13:33:22.884Z","avatar_url":"https://github.com/ituvtu.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# README: Swin-V2-Base for AquaMonitor JYU Dataset\n\n## Overview\n\nThis project focuses on the automatic classification of aquatic macroinvertebrates using deep learning to enhance environmental biomonitoring. Two neural network architectures were used: **Swin-V2-Base** and **ResNet18**. **Swin-V2-Base** achieved a **77% weighted F1-score**. This is an academic project aimed at evaluating the effectiveness of modern models in macroinvertebrate classification.\n\n## Project Goal\n\nAccurate monitoring of aquatic macroinvertebrates is crucial for assessing water quality and biodiversity. Manual identification requires significant time and specialized knowledge, making it difficult to scale. 
**Deep learning** can automate this process, improving both speed and accuracy.\n\n## Importance and Relevance\n\n- **Ecological Context**: Insect biomass has declined by **75%** over the past 30 years, making monitoring more crucial than ever.\n- **Challenges of Manual Identification**: High costs, a limited number of experts, and scalability issues.\n- **Deep Learning Advancements**: Modern models achieve expert-level accuracy under laboratory conditions.\n\n## Dataset Description\n\n**AquaMonitor JYU** is a subset of the large AquaMonitor dataset, which contains images of aquatic macroinvertebrates.\n\n- **Number of Classes**: 31\n- **Training Set**: 40,880 images (from 1,049 individuals)\n- **Validation Set**: 6,394 images (from 157 individuals)\n- **Test Set**: Hidden\n- **Image Format**: 256x256\n\n### Class Examples\nTo better understand the task's complexity, below are sample images from all 31 classes:\n![Class examples](https://github.com/user-attachments/assets/8c8a13dc-c154-4f0f-b195-13c596fcdb39)\n\n### Class Imbalance\nThe dataset exhibits **significant class imbalance**, with the number of images per class ranging from **400 to 3,500**. This poses challenges during model training, as rare classes may lack sufficient representation for effective generalization.\n\nBelow is a histogram showing the distribution of classes in the training set:\n![The distribution of classes](https://github.com/user-attachments/assets/31caed39-1a8b-4117-bcb0-9a2bcee67960)\n\n## Swin-V2-Base Architecture\n\nSwin-V2-Base is a transformer-based architecture that utilizes **hierarchical representation** and **local windows**, allowing it to efficiently process high-resolution images. 
However, the model showed **signs of overfitting**, emphasizing the importance of **pretraining** and regularization techniques.\n\n### Model Configuration\n- **Image Size**: 256x256\n- **Number of Classes**: 31\n- **Optimizer**: AdamW\n- **Loss Function**: CrossEntropyLoss with label smoothing = 0.1\n- **Epochs**: 6\n- **Batch Size**: 64\n- **Max Learning Rate**: 5e-5\n- **Regularization**: Drop Path Rate = 0.2\n\n### Weight Freezing\nTo improve model training:\n- **Patch embedding layer weights were frozen**\n- **Parameters of the first two layers were frozen**\n\n### Training Setup\n- **OneCycleLR** was used for learning rate scheduling.\n- The model was trained on an **A100 GPU in Google Colab** with **FP16 mixed precision**.\n- Gradient scaling was performed using **torch.amp.GradScaler**.\n\n### Model Download\nThe Swin-V2-Base model can be downloaded at the following link:\n[Download model.pt](https://www.dropbox.com/scl/fi/imlg8647aogsg0qzvwsv4/model.pt?rlkey=6t8y91cs6727ec4zsb935kit9\u0026st=u89yv675\u0026dl=0)\n\n\n### Data Augmentation\nTwo augmentation strategies were applied:\n1. **Standard augmentation for all images**:\n   - Random rotation (10°)\n   - Color jitter (brightness, contrast, saturation, hue)\n   - Random resized cropping\n   - Gaussian blur\n   - Random affine transformations\n2. 
**Stronger augmentation for rare classes**:\n   - Random horizontal flip\n   - Random perspective distortion\n   - Stronger brightness and contrast adjustments\n\n### Model Files\nThe repository contains:\n- **model.pt** – The trained model checkpoint\n- **model.py** – The model class for loading the trained model\n\n## Results\n\n| Architecture  | Weighted F1-score |\n| ------------ | ----------------- |\n| Swin-V2-Base | **77%**           |\n| ResNet18     | 74%               |\n\nThe **Swin-V2-Base** model showed **signs of overfitting**, whereas ResNet18 scored lower overall.\n\n\n## Conclusions\n\n- **Swin-V2-Base achieved a 77% weighted F1-score**, but overfitting was observed.\n- **Transfer learning is crucial**, as using pretrained models significantly improves results.\n- **Possible improvements**: Improving generalization through **data augmentation**, **regularization**, and alternative training strategies.\n\n## License\n\nThis project is an academic study and is intended for research purposes only.\n\n# Deep_Learning-Classification-Swin-v2-B\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fituvtu%2Fdeep_learning-classification-swin-v2-b","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fituvtu%2Fdeep_learning-classification-swin-v2-b","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fituvtu%2Fdeep_learning-classification-swin-v2-b/lists"}