Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter
fast.ai starter kit for Google Landmark Retrieval 2019 challenge
- Host: GitHub
- URL: https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter
- Owner: ducha-aiki
- Created: 2019-04-15T13:58:18.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2019-04-19T11:22:08.000Z (over 5 years ago)
- Last Synced: 2024-10-04T11:58:08.143Z (about 1 month ago)
- Topics: cnn, fastai, kaggle, pytorch, retrieval
- Language: Jupyter Notebook
- Size: 2.74 MB
- Stars: 61
- Watchers: 4
- Forks: 19
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-fastai - Google Retrieval Challenge 2019
README
# Google Landmark Retrieval 2019 Competition fast.ai Starter Pack
The code here is all you need to make a first submission to the [Google Landmark Retrieval 2019 Competition](https://www.kaggle.com/c/landmark-retrieval-2019). It is based on the [fastai library](https://github.com/fastai/fastai), release 1.0.47, and borrows helper code from the great [cnnimageretrieval-pytorch](https://github.com/filipradenovic/cnnimageretrieval-pytorch) library. The latter gives much better results than the code in this repo, but it is not ready for making a submission and takes 3 days to converge, compared to 45 minutes here.
## Making first submission
1. Install the [fastai library](https://github.com/fastai/fastai), specifically version 1.0.47.
2. Install the [faiss library](https://github.com/facebookresearch/faiss): `conda install faiss-gpu cudatoolkit=9.0 -c pytorch -y`
3. Clone this repository.
4. Start the download process for the data. It will take a while, so you can run the code in the meantime: the code here does not depend on the competition data for training, only for the submission.
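Before opening the notebooks, it can save time to verify the environment. This is a small sanity-check sketch (not part of the repo) that reports whether the required packages are importable, without actually importing them:

```python
import importlib.util

def is_installed(pkg):
    """Return True if the top-level package can be found by the import system."""
    return importlib.util.find_spec(pkg) is not None

# Packages the notebooks rely on; faiss is the conda-installed one above.
for pkg in ("fastai", "faiss", "torch"):
    print(pkg, "OK" if is_installed(pkg) else "MISSING")
```

If `faiss` shows MISSING after the conda install, the usual culprit is running the notebooks from a different conda environment than the one faiss was installed into.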
## Notebooks
1. [download-and-create-microtrain](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/download-and-create-microtrain.ipynb) - download all the aux data for training and validation
2. [validation-no-training](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/validation-no-training.ipynb) - playing with pretrained networks and setting up validation procedure
3. [training-validate-bad](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/training-validate-bad.ipynb) - training DenseNet121 on the created micro-train set in 45 minutes and experimenting with post-processing. It works as described, but only by pure luck: many of the different "subclusters" (== labels) depict the same landmark. So do not use this setup for training on all 19k subclusters.
4. [training-validate-good-full](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/training-validate-good-full.ipynb) - instead, use "clusters" as labels, which gives much better results.
5. [submission-trained](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/submission-trained.ipynb) - creating the first submission. Warning: this can take a long time (~4-12 hours) because of the dataset size.
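At its core, the submission notebook performs nearest-neighbour search over global image descriptors (with faiss, at the full index scale, which is why it is slow). The toy sketch below illustrates the idea with NumPy brute-force search and made-up ids and random descriptors; the CSV layout (query id plus a space-separated ranked list of index ids) is the assumed submission format, so check the competition page for the exact spec:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-ins for CNN global descriptors (assumed L2-normalized), not real data.
index_ids = [f"index_{i:03d}" for i in range(50)]
index_desc = rng.normal(size=(50, 128)).astype(np.float32)
index_desc /= np.linalg.norm(index_desc, axis=1, keepdims=True)

query_ids = [f"query_{i:03d}" for i in range(3)]
query_desc = rng.normal(size=(3, 128)).astype(np.float32)
query_desc /= np.linalg.norm(query_desc, axis=1, keepdims=True)

# Cosine similarity of normalized descriptors is a dot product; rank index
# images per query and keep the top 100 (only 50 exist in this toy example).
sims = query_desc @ index_desc.T
order = np.argsort(-sims, axis=1)[:, :100]

rows = ["id,images"]
for qid, ranked in zip(query_ids, order):
    rows.append(qid + "," + " ".join(index_ids[j] for j in ranked))
submission_csv = "\n".join(rows)
print(submission_csv.splitlines()[0])  # → id,images
```

In the real pipeline the descriptors come from the trained network and the index holds on the order of a million images, which is where faiss replaces the brute-force matrix product.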