https://github.com/sayannath/convmixer-tensorflow

Implementation of ConvMixer-Patches Are All You Need? in TensorFlow and Keras

attention cifar100 convmixer convolution deep-learning iclr iclr2022 machine-learning mlp patches tensorflow vision-transformer

Host: GitHub
URL: https://github.com/sayannath/convmixer-tensorflow
Owner: sayannath
License: mit
Created: 2021-10-15T19:45:26.000Z (over 4 years ago)
Default Branch: main
Last Pushed: 2021-10-31T18:04:10.000Z (over 4 years ago)
Last Synced: 2025-02-26T12:22:52.646Z (12 months ago)
Topics: attention, cifar100, convmixer, convolution, deep-learning, iclr, iclr2022, machine-learning, mlp, patches, tensorflow, vision-transformer
Language: Python
Homepage: https://openreview.net/pdf?id=TVHS5Y4dNvM
Size: 1.75 MB
Stars: 12
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

# Patches Are All You Need? - ConvMixer

ConvMixer is an extremely simple model that is similar in spirit to the ViT and the even-more-basic MLP-Mixer: it operates directly on patches as input, separates the mixing of spatial and channel dimensions, and maintains equal size and resolution throughout the network. In contrast, however, the ConvMixer uses only standard convolutions to achieve the mixing steps. Despite its simplicity, the paper reports that the ConvMixer outperforms the ViT, MLP-Mixer, and some of their variants for similar parameter counts and data set sizes, in addition to outperforming classical vision models such as the ResNet.
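The size bookkeeping implied by this description can be sketched in plain Python. This is a minimal illustration, not code from this repository: the function name and all hyperparameter values below are hypothetical, and the count assumes every convolution has a bias and is followed by a BatchNorm layer with a trainable scale and shift.

```python
def convmixer_grid_and_params(image_size, patch_size, hidden_dim, depth,
                              kernel_size, num_classes, in_channels=3):
    """Return (feature-map side length, trainable parameter count).

    Hypothetical helper; assumes biased convolutions, each followed by BatchNorm.
    """
    # Patch embedding: a convolution with kernel size == stride == patch_size,
    # so every non-overlapping patch becomes one position in a grid x grid map.
    # With "same" padding in the later layers, this resolution never changes.
    grid = image_size // patch_size

    bn = 2 * hidden_dim  # trainable scale and shift of one BatchNorm layer
    stem = in_channels * patch_size ** 2 * hidden_dim + hidden_dim + bn

    # Spatial mixing: depthwise convolution, one k x k filter per channel.
    depthwise = kernel_size ** 2 * hidden_dim + hidden_dim + bn
    # Channel mixing: pointwise (1 x 1) convolution across all channels.
    pointwise = hidden_dim ** 2 + hidden_dim + bn

    # Linear classifier applied after global average pooling.
    head = hidden_dim * num_classes + num_classes
    return grid, stem + depth * (depthwise + pointwise) + head


# Illustrative values only (assumed, not taken from this repository).
grid, params = convmixer_grid_and_params(
    image_size=32, patch_size=2, hidden_dim=256, depth=8,
    kernel_size=5, num_classes=100)
print(f"feature map: {grid}x{grid}x256, trainable parameters: {params:,}")
```

Note that the pointwise term grows with the square of the hidden dimension, so under these assumptions channel mixing accounts for most of the parameters.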

**Official GitHub Link:** https://github.com/tmp-iclr/convmixer

**Paper Link:** https://openreview.net/pdf?id=TVHS5Y4dNvM

Note: The paper is under review for ICLR 2022.

## Model Architecture

![](https://i.imgur.com/Yd7gpMP.png)
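As a rough guide to the diagram, the architecture can be sketched with the Keras functional API. This is a minimal sketch following the general structure described in the paper, not necessarily the code in this repository. The function names and default hyperparameters are assumptions, and it uses the built-in Keras `"gelu"` activation instead of TensorFlow Addons to keep the example dependency-free.

```python
import tensorflow as tf
from tensorflow.keras import layers


def gelu_bn(x):
    """GELU activation followed by BatchNorm, applied after each convolution."""
    x = layers.Activation("gelu")(x)
    return layers.BatchNormalization()(x)


def build_convmixer(image_size=32, patch_size=2, hidden_dim=256, depth=8,
                    kernel_size=5, num_classes=100):
    """Hypothetical builder; all default values are illustrative assumptions."""
    inputs = tf.keras.Input((image_size, image_size, 3))

    # Patch embedding: kernel size == stride == patch_size.
    x = layers.Conv2D(hidden_dim, patch_size, strides=patch_size)(inputs)
    x = gelu_bn(x)

    for _ in range(depth):
        # Spatial mixing: depthwise convolution with a residual connection.
        residual = x
        x = layers.DepthwiseConv2D(kernel_size, padding="same")(x)
        x = gelu_bn(x)
        x = layers.Add()([x, residual])

        # Channel mixing: pointwise (1 x 1) convolution.
        x = layers.Conv2D(hidden_dim, 1)(x)
        x = gelu_bn(x)

    # Classification head: global average pooling, then a linear classifier.
    x = layers.GlobalAveragePooling2D()(x)
    outputs = layers.Dense(num_classes, activation="softmax")(x)
    return tf.keras.Model(inputs, outputs)


model = build_convmixer()
model.summary()
```

Because the depthwise convolution uses `padding="same"` and the pointwise convolution keeps `hidden_dim` channels, the feature map keeps the same shape from the patch embedding to the pooling layer.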

## Installation

```
pip install -q tensorflow-addons
```

Note: TensorFlow Addons is used for the `AdamW` optimizer and the `GELU` activation function.

## Results

![Unknown-2](https://user-images.githubusercontent.com/41967348/137559060-96c6c84a-7055-4f3d-ade1-415e5a756880.png) ![Unknown](https://user-images.githubusercontent.com/41967348/137559078-0f095bd4-e119-457c-ac79-7caa5e9a076e.png)

> TensorBoard Link: https://tensorboard.dev/experiment/bkhqOz0RQ1Cv5dwrDQySMQ/

Note: The model was trained for 25 epochs and reached a top-5 accuracy of 64.41%.
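For reference, top-5 accuracy counts a prediction as correct when the true label is among the five highest-scoring classes. The snippet below is a minimal pure-Python sketch of that definition. The function name and the toy scores are made up for illustration, and this is not the metric code used to produce the number above.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores."""
    hits = 0
    for row, label in zip(scores, labels):
        # Class indices sorted by descending score, truncated to the top k.
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 3 samples and 3 classes (values are illustrative only).
toy_scores = [[0.1, 0.5, 0.4], [0.7, 0.2, 0.1], [0.2, 0.3, 0.5]]
toy_labels = [2, 1, 0]
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```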

## Future Work

* Train for 150 epochs
* Train the model on the ImageNet dataset

## Citation
```
@inproceedings{
anonymous2022patches,
title={Patches Are All You Need?},
author={Anonymous},
booktitle={Submitted to The Tenth International Conference on Learning Representations},
year={2022},
url={https://openreview.net/forum?id=TVHS5Y4dNvM},
note={under review}
}
```

## License
```
MIT License

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
```
