https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter

fast.ai starter kit for Google Landmark Retrieval 2019 challenge
cnn fastai kaggle pytorch retrieval

# Google Landmark Retrieval 2019 Competition fast.ai Starter Pack

The code here is all you need to make your first submission to the [Google Landmark Retrieval 2019 Competition](https://www.kaggle.com/c/landmark-retrieval-2019). It is based on the [fastai library](https://github.com/fastai/fastai), release 1.0.47, and borrows helper code from the great [cnnimageretrieval-pytorch](https://github.com/filipradenovic/cnnimageretrieval-pytorch) library. The latter gives much better results than the code in this repo, but it does not produce a ready-to-submit file and takes 3 days to converge, compared to 45 min here.

## Making first submission
1. Install the [fastai library](https://github.com/fastai/fastai), specifically version 1.0.47.

2. Install the [faiss library](https://github.com/facebookresearch/faiss):

   `conda install faiss-gpu cudatoolkit=9.0 -c pytorch -y`

3. Clone this repository.

4. Start downloading the competition data. The download takes a long time, but you can run the code in the meantime, because the code here depends on the competition data only for the submission, not for training.
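
Before opening the notebooks, it can help to confirm that the environment matches the steps above. The following is a minimal sketch, not part of the repository: the helper names `installed_version`, `module_available`, and `check_environment` are hypothetical, and it assumes that fastai was installed in a way that exposes package metadata and that faiss is importable under the module name `faiss`.

```python
from importlib import metadata, util

# Version pinned by this README; adjust if you deliberately use another one.
REQUIRED_VERSIONS = {"fastai": "1.0.47"}
# Modules that only need to be importable (assumed module name for faiss).
REQUIRED_MODULES = ["faiss"]


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def module_available(name):
    """Return True if a top-level module called `name` can be found."""
    return util.find_spec(name) is not None


def check_environment(versions=REQUIRED_VERSIONS, modules=REQUIRED_MODULES,
                      lookup=installed_version, has_module=module_available):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    for package, wanted in versions.items():
        found = lookup(package)
        if found is None:
            problems.append(f"{package} is not installed (need {wanted})")
        elif found != wanted:
            problems.append(f"{package} {found} found, but {wanted} is expected")
    for name in modules:
        if not has_module(name):
            problems.append(f"module {name} cannot be imported")
    return problems


if __name__ == "__main__":
    for line in check_environment() or ["environment looks fine"]:
        print(line)
```

An empty result means the pinned fastai version and an importable faiss module were both found; any other result lists what to fix first.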

## Notebooks

1. [download-and-create-microtrain](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/download-and-create-microtrain.ipynb) - downloads all the auxiliary data for training and validation.
2. [validation-no-training](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/validation-no-training.ipynb) - plays with pretrained networks and sets up the validation procedure.
3. [training-validate-bad](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/training-validate-bad.ipynb) - trains DenseNet121 on the created micro-train set in 45 min and plays with post-processing. It works as described, but only by pure luck: many different "subclusters" (used as labels here) depict the same landmark. So do not use it for training on all 19k subclusters.
4. [training-validate-good-full](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/training-validate-good-full.ipynb) - instead, uses "clusters" as labels, which gives much better results.
5. [submission-trained](https://github.com/ducha-aiki/google-retrieval-challenge-2019-fastai-starter/blob/master/submission-trained.ipynb) - creates a first submission. Warning: this can take a long time (~4-12 hours) because of the dataset size.
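
To make the last step more concrete, here is a minimal sketch of what a retrieval submission boils down to: rank the index images for each query by descriptor similarity and write one row per query. It is not the notebook's code. It uses brute-force cosine similarity in pure Python so that it runs anywhere, whereas the repository relies on faiss for the search at full dataset scale. The function names are hypothetical, and the `id,images` layout with space-separated index ids is an assumption about the competition's submission format that should be checked against the Kaggle page.

```python
import math


def l2_normalize(vec):
    """Scale a descriptor to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_index(query_vec, index_vecs, index_ids, top_k=100):
    """Return the ids of the top_k index images by cosine similarity."""
    q = l2_normalize(query_vec)
    scored = []
    for image_id, vec in zip(index_ids, index_vecs):
        v = l2_normalize(vec)
        similarity = sum(a * b for a, b in zip(q, v))
        scored.append((similarity, image_id))
    # Highest similarity first; ties keep the original index order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [image_id for _, image_id in scored[:top_k]]


def submission_rows(query_ids, query_vecs, index_ids, index_vecs, top_k=100):
    """Build CSV lines in the assumed `id,images` submission layout."""
    rows = ["id,images"]
    for query_id, query_vec in zip(query_ids, query_vecs):
        ranked = rank_index(query_vec, index_vecs, index_ids, top_k)
        rows.append(query_id + "," + " ".join(ranked))
    return rows


if __name__ == "__main__":
    # Toy 2-D descriptors standing in for real CNN features.
    ids = ["a", "b", "c"]
    vecs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print("\n".join(submission_rows(["q1"], [[2.0, 0.1]], ids, vecs, top_k=2)))
```

The brute-force loop costs one dot product per query and index image, which is why an approximate or GPU-backed index becomes necessary for the real dataset and why the submission notebook takes hours.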