# Landmark Recognition
#### CS 230 Project

Main Challenge:
https://www.kaggle.com/c/landmark-retrieval-2019/overview

Baseline Model:
https://www.kaggle.com/c/landmark-recognition-challenge/discussion/57919

## Step 1: Install the Conda Environment
Run ```conda env create -f environment.yml```.

## Step 2: Download the Dataset CSVs
https://www.kaggle.com/c/landmark-retrieval-2019/data

The link above contains CSV files with links to all of the images in the train and test sets. Unzip the folder into ```data/images/```, then specify the number of examples you want to download in ```const.py```. You can also manually choose whether to download from the train, dev, or test set.
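The exact variable names live in ```const.py```; the snippet below is only a hypothetical illustration of the kind of settings involved (names and values here are assumptions, not the repo's actual file):

```python
# Hypothetical illustration of the settings const.py holds; the real
# variable names are defined in the repo's const.py.
NUM_EXAMPLES = 100000      # how many images to download
SPLIT = "train"            # which CSV to pull URLs from: train, dev, or test
CSV_DIR = "data/images/"   # where the unzipped Kaggle CSVs live
```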

## Step 3: Get a Subset of the Data
Run ```python preprocessing/subset-data.py```.

(Note: all commands should be run from the top-level ```landmark/``` directory.)

This script writes a ```train-subset.csv``` file to fetch images from. You can specify how many unique landmarks you want and how many images of each by changing variables in ```const.py```. For our project, we use 100,000 random images sampled from the full ```train.csv``` dataset.
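Conceptually, the subsetting amounts to sampling rows from the Kaggle CSV. A minimal sketch of the idea using pandas (the column names ```id```, ```url```, and ```landmark_id``` are assumed from the Kaggle CSV format; the script's actual logic may differ):

```python
import pandas as pd

# Read the full Kaggle training CSV (columns assumed: id, url, landmark_id).
df = pd.read_csv("data/images/train.csv")

# Hypothetical values; in the repo these come from const.py.
NUM_LANDMARKS = 100
IMAGES_PER_LANDMARK = 50

# Pick a random set of landmarks, then cap the number of images per landmark.
landmarks = df["landmark_id"].drop_duplicates().sample(NUM_LANDMARKS, random_state=0)
subset = (
    df[df["landmark_id"].isin(landmarks)]
    .groupby("landmark_id")
    .head(IMAGES_PER_LANDMARK)
)
subset.to_csv("data/images/train-subset.csv", index=False)
```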

## Step 4: Download Images
Run ```python download-images.py```.

This may take a while. If you simply want all of the images, use the provided ```.sh``` script or download them from a link on the Kaggle page.
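For reference, the download step boils down to fetching each URL in the subset CSV and writing it to disk. A rough sketch (the output directory is a hypothetical choice, and the actual ```download-images.py``` may handle retries and parallelism differently):

```python
import os
import pandas as pd
import requests

OUT_DIR = "data/images/train"  # hypothetical output directory
os.makedirs(OUT_DIR, exist_ok=True)

df = pd.read_csv("data/images/train-subset.csv")
for row in df.itertuples():
    path = os.path.join(OUT_DIR, f"{row.id}.jpg")
    if os.path.exists(path):
        continue  # skip images that were already fetched
    try:
        resp = requests.get(row.url, timeout=10)
        resp.raise_for_status()
        with open(path, "wb") as f:
            f.write(resp.content)
    except requests.RequestException:
        pass  # dead links are common in this dataset; skip them
```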

## Workflow
Run ```train.py```, which currently relies on three modules: ```dataset```, ```const```, and ```layers```. The key files are:

- ```dataset.py```
- ```const.py```
- ```train.py```
- ```test.py```
- ```util.py```
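For orientation, here is a minimal, self-contained sketch of the kind of CNN classifier a baseline like the one linked above builds. The framework (Keras), architecture, and hyperparameters below are illustrative assumptions, not the project's actual model:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

def build_model(num_classes: int, input_shape=(96, 96, 3)) -> tf.keras.Model:
    """Tiny illustrative CNN; the project's real model lives in train.py."""
    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.GlobalAveragePooling2D(),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(
        optimizer="adam",
        loss="sparse_categorical_crossentropy",  # integer landmark_id labels
        metrics=["accuracy"],
    )
    return model
```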

## To ask the TAs:
- Do we still need data augmentation when we already have so much training data?