https://github.com/rishit-dagli/astroformer
This repository contains the official implementation of Astroformer, an ICLR Workshop 2023 paper.
- Host: GitHub
- URL: https://github.com/rishit-dagli/astroformer
- Owner: Rishit-dagli
- License: apache-2.0
- Created: 2023-11-05T01:25:14.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2023-11-05T06:52:52.000Z (about 1 year ago)
- Last Synced: 2024-11-02T06:53:39.521Z (11 days ago)
- Topics: computer-vision, convolutional-neural-networks, deep-learning, machine-learning, transformer, vision-transformer
- Language: Python
- Homepage: https://arxiv.org/abs/2304.05350
- Size: 2.34 MB
- Stars: 20
- Watchers: 2
- Forks: 3
- Open Issues: 6
- Metadata Files:
  - Readme: README.md
  - License: LICENSE
# Astroformer
This repository contains the official implementation of Astroformer, an ICLR Workshop 2023 paper. The model is aimed at classification tasks in low-data regimes and achieves SoTA results on CIFAR-100, Tiny ImageNet, and science tasks like Galaxy10 DECals, along with competitive performance on CIFAR-10 _without any additional labelled or unlabelled data_.
_**Accompanying paper: [Astroformer: More Data Might not be all you need for Classification](https://arxiv.org/abs/2304.05350)**_ [![arXiv](https://img.shields.io/badge/paper-arXiv:2304.05350-b31b1b.svg?logo=arxiv)](https://arxiv.org/abs/2304.05350)
## Code Overview
The most important code is in `astroformer.py`. We trained Astroformer models using the `timm` framework, a copy of which (from [here](https://github.com/huggingface/pytorch-image-models)) is included in this repository.
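Because the model is registered with `timm` (via the modifications listed below), it can be instantiated through the standard `timm.create_model` factory. A minimal sketch, assuming the bundled `pytorch-image-models` copy is importable and the variant name `astroformer_5` is registered by `timm/models/astroformer.py`:

```python
import timm
import torch

# Instantiate an Astroformer-5 configured for 10 classes
# (e.g. Galaxy10 DECals). The "astroformer_5" name assumes the
# registration added by timm/models/astroformer.py in this repo.
model = timm.create_model("astroformer_5", num_classes=10)
model.eval()

# Forward a dummy batch at the 256x256 resolution used in training.
with torch.no_grad():
    logits = model(torch.randn(1, 3, 256, 256))
print(logits.shape)  # expected: torch.Size([1, 10])
```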
Inside `pytorch-image-models`, we have made the following modifications. (Though one could look at the diff, we think it is convenient to summarize them here.)
- added `timm/models/astroformer.py`
- modified `timm/models/__init__.py`

## Training
If you had a node with 8 GPUs, you could train an Astroformer-5 as follows (these are exactly the settings we used for Galaxy10 DECals):
```sh
sh distributed_train.sh 8 [/path/to/dataset] \
  --train-split [your_train_dir] \
  --val-split [your_val_dir] \
  --model astroformer_5 \
  --num-classes 10 \
  --img-size 256 \
  --in-chans 3 \
  --input-size 3 256 256 \
  --batch-size 256 \
  --grad-accum-steps 1 \
  --opt adamw \
  --sched cosine \
  --lr-base 2e-5 \
  --lr-cycle-decay 1e-2 \
  --lr-k-decay 1 \
  --warmup-lr 1e-5 \
  --epochs 300 \
  --warmup-epochs 5 \
  --mixup 0.8 \
  --smoothing 0.1 \
  --drop 0.1 \
  --save-images \
  --amp \
  --amp-impl apex \
  --output result_ours/astroformer_5_galaxy10 \
  --log-wandb
```

You could simply use the same script with the other Astroformer models (`astroformer_0`, `astroformer_1`, `astroformer_2`, `astroformer_3`, and `astroformer_4`) to train those variants as well.
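Once training finishes, the checkpoint can be loaded back for inference through the same `timm` factory. A hedged sketch, assuming `timm`'s usual checkpoint layout (the training script writes files such as `model_best.pth.tar` under the `--output` directory) and an illustrative input image `galaxy.jpg`:

```python
import timm
import torch
from PIL import Image
from timm.data import create_transform, resolve_data_config

# Illustrative checkpoint path; adjust to your own run's output directory.
CKPT = "result_ours/astroformer_5_galaxy10/model_best.pth.tar"

# Rebuild the architecture and load the trained weights.
model = timm.create_model("astroformer_5", num_classes=10, checkpoint_path=CKPT)
model.eval()

# Derive the matching eval-time preprocessing from the model's data config.
config = resolve_data_config({"input_size": (3, 256, 256)}, model=model)
transform = create_transform(**config)

# Classify a single image.
img = Image.open("galaxy.jpg").convert("RGB")
with torch.no_grad():
    probs = model(transform(img).unsqueeze(0)).softmax(dim=-1)
print(probs.argmax(dim=-1).item())
```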
## Main Results
### CIFAR-100
| Model Name    | Top-1 Accuracy (%) | FLOPs (G) | Params (M) |
|---------------|--------------------|-----------|------------|
| Astroformer-3 | 87.65              | 31.36     | 161.95     |
| Astroformer-4 | 93.36              | 60.54     | 271.68     |
| Astroformer-5 | 89.38              | 115.97    | 655.34     |

### CIFAR-10
| Model Name    | Top-1 Accuracy (%) | FLOPs (G) | Params (M) |
|---------------|--------------------|-----------|------------|
| Astroformer-3 | 99.12              | 31.36     | 161.75     |
| Astroformer-4 | 98.93              | 60.54     | 271.54     |
| Astroformer-5 | 93.23              | 115.97    | 655.04     |

### Tiny ImageNet
| Model Name    | Top-1 Accuracy (%) | FLOPs (G) | Params (M) |
|---------------|--------------------|-----------|------------|
| Astroformer-3 | 86.86              | 24.84     | 150.39     |
| Astroformer-4 | 91.12              | 40.38     | 242.58     |
| Astroformer-5 | 92.98              | 89.88     | 595.55     |

### Galaxy10 DECals
| Model Name    | Top-1 Accuracy (%) | FLOPs (G) | Params (M) |
|---------------|--------------------|-----------|------------|
| Astroformer-3 | 92.39              | 31.36     | 161.75     |
| Astroformer-4 | 94.86              | 60.54     | 271.54     |
| Astroformer-5 | 94.81              | 105.9     | 681.25     |

## Citation
If you use this work, please cite the following paper:
BibTeX:
```bibtex
@article{dagli2023astroformer,
  title   = {Astroformer: More Data Might Not be All You Need for Classification},
  author  = {Dagli, Rishit},
  journal = {arXiv preprint arXiv:2304.05350},
  year    = {2023}
}
```

MLA:
```
Dagli, Rishit. "Astroformer: More Data Might Not be All You Need for Classification." arXiv preprint arXiv:2304.05350 (2023).
```

## Credits
The code is heavily adapted from [timm](https://github.com/huggingface/pytorch-image-models).