{"id":13626865,"url":"https://github.com/layumi/University1652-Baseline","last_synced_at":"2025-04-16T19:30:55.750Z","repository":{"id":38410267,"uuid":"219441774","full_name":"layumi/University1652-Baseline","owner":"layumi","description":"ACM Multimedia2020 University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization :helicopter: annotates 1652 buildings in 72 universities around the world.","archived":false,"fork":false,"pushed_at":"2025-04-08T06:05:25.000Z","size":131971,"stargazers_count":527,"open_issues_count":30,"forks_count":80,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-04-12T16:12:29.602Z","etag":null,"topics":["awesome-list","cross-view","cvact","cvusa","dataset","drone","gem-pooling","geo-localization","image-retrieval","multi-source-benchmark","place-recognition","pytorch","remote-sensing","satellite","uav"],"latest_commit_sha":null,"homepage":"https://arxiv.org/abs/2002.12186","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/layumi.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2019-11-04T07:27:05.000Z","updated_at":"2025-04-12T09:53:09.000Z","dependencies_parsed_at":"2024-04-06T13:40:46.088Z","dependency_job_id":null,"html_url":"https://github.com/layumi/University1652-Baseline","commit_stats":null,"previous_names":[],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/layumi%2FUniversity1652-Baseline","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/layumi%2FUniversity1652-
Baseline/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/layumi%2FUniversity1652-Baseline/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/layumi%2FUniversity1652-Baseline/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/layumi","download_url":"https://codeload.github.com/layumi/University1652-Baseline/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":249268547,"owners_count":21240940,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["awesome-list","cross-view","cvact","cvusa","dataset","drone","gem-pooling","geo-localization","image-retrieval","multi-source-benchmark","place-recognition","pytorch","remote-sensing","satellite","uav"],"created_at":"2024-08-01T22:00:23.764Z","updated_at":"2025-04-16T19:30:50.741Z","avatar_url":"https://github.com/layumi.png","language":"Python","funding_links":[],"categories":["Visual Localization","产品和项目","Code","Datasets"],"sub_categories":["Unmanned Underwater Vehicles","Drone Frames","视觉定位"],"readme":"\u003ch1 align=\"center\"\u003e University1652-Baseline \u003c/h1\u003e\n\n![Python 3.6+](https://img.shields.io/badge/python-3.6+-green.svg)\n[![License: MIT](https://img.shields.io/badge/License-MIT-green.svg)](https://opensource.org/licenses/MIT)\n\n[![VideoDemo](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/youtube1.png)](https://www.youtube.com/embed/dzxXPp8tVn4?vq=hd1080)\n\n[[Paper]](https://arxiv.org/abs/2002.12186) 
\n[[Slide]](http://zdzheng.xyz/files/ACM-MM-Talk.pdf)\n[[Explore Drone-view Data]](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/sample_drone.jpg?raw=true)\n[[Explore Satellite-view Data]](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/sample_satellite.jpg?raw=true)\n[[Explore Street-view Data]](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/sample_street.jpg?raw=true)\n[[Video Sample]](https://www.youtube.com/embed/dzxXPp8tVn4?vq=hd1080)\n[[Introduction in Chinese]](https://zhuanlan.zhihu.com/p/110987552)\n[[Building Name List]](https://github.com/layumi/University1652-Baseline/blob/master/new_name_list.txt)\n[[Latitude and Longitude]](https://drive.google.com/file/d/1PL8fVky9KZg7XESsuS5NCsYRyYAwui3S/view?usp=sharing)\n[[Flight Path]](https://drive.google.com/file/d/1EW5Esi72tPcfL3zmoHYpufKj_SXrY-xE/view?usp=sharing)\n\n![](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/Data.jpg)\n\n![](https://github.com/layumi/University1652-Baseline/blob/master/docs/index_files/Motivation.png)\n\n\n### Download [University-1652] upon request (I usually reply within 5 minutes). You may use the request [template](https://github.com/layumi/University1652-Baseline/blob/master/Request.md).\n\nThis repository contains the dataset link and the code for our paper [University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization](https://arxiv.org/abs/2002.12186), ACM Multimedia 2020. The official paper link is https://dl.acm.org/doi/10.1145/3394171.3413896. We collect 1652 buildings from 72 universities around the world. Thank you for your kind attention.\n\n**Task 1: Drone-view target localization.** (Drone -\u003e Satellite) Given one drone-view image or video, the task aims to find the most similar satellite-view image to localize the target building in the satellite view. 
\n\n**Task 2: Drone navigation.** (Satellite -\u003e Drone) Given one satellite-view image, the drone intends to find the most relevant place (drone-view images) that it has passed by. According to its flight history, the drone could be navigated back to the target place.\n\n\n### 1. ACM MM UAVM Workshop\n**23 Apr 2024** We will hold the 2nd workshop at ACM MM 2024! Please see [https://www.zdzheng.xyz/ACMMM2024Workshop-UAV/](https://www.zdzheng.xyz/ACMMM2024Workshop-UAV/) for reference.\n\n### 2. ACM ICMR Workshop\n\n**12 Jan 2024** We are holding a workshop at ACM ICMR 2024 on Multimedia Object Re-ID. You are welcome to share your insights. See you in Phuket, Thailand!😃 The workshop link is https://www.zdzheng.xyz/MORE2024/ . The submission deadline is **15 April 2024**.\n\n\u003cdetails\u003e\n \u003csummary\u003e\u003cb\u003e\n  2023 Workshop and Special Session\n\u003c/b\u003e\u003c/summary\u003e\n\n### 1. IEEE ITSC Special Session\nWe host a special session at the IEEE Intelligent Transportation Systems Conference (ITSC), covering the object re-identification \u0026 point cloud topic. The paper deadline is **May 15, 2023** and the paper notification is on June 30, 2023. Please select the session code `w7r4a` during submission. More details can be found at [Special Session Website](https://2023.ieee-itsc.org/wp-content/uploads/2023/03/IEEE-ITSC-2023-Special-Session-Proposal-Safe-Critical-Scenario-Understanding-in-Intelligent-Transportation-Systems-SCSU-ITS.pdf).  \n\n### 2. Remote Sensing Special Issue\nWe organize a special issue of Remote Sensing (IF=5.3), open from now to ~~**16 June 2023**~~ **16 Dec 2023**. You are welcome to submit your manuscript at (https://www.mdpi.com/journal/remotesensing/special_issues/EMPK490239), but please keep the open-access fee in mind.\n\n### 3. ACM Multimedia Workshop\nWe are holding a workshop at ACM Multimedia 2023 on Aerial-view Imaging. 
[Call for papers](https://www.zdzheng.xyz/ACMMM2023Workshop/) [Introduction in Chinese](https://zhuanlan.zhihu.com/p/620180604)\n\n### 4. CodaLab Challenge\nWe also provide a challenging cross-view geo-localization dataset, called University160k, and workshop attendees may consider participating in the competition. The motivation is to simulate the real-world geo-localization scenario, in which we usually face an extremely large satellite-view pool. In particular, University160k extends the current University-1652 dataset with an extra 167,486 satellite-view gallery distractors. We have released University160k on the challenge page and set up a public leaderboard.\n(More details are at https://codalab.lisn.upsaclay.fr/competitions/12672)\n\n\u003c/details\u003e\n\n## Table of contents\n* [About Dataset](#about-dataset)\n* [News](#news)\n* [Code Features](#code-features)\n* [Prerequisites](#prerequisites)\n* [Getting Started](#getting-started)\n    * [Installation](#installation)\n    * [Dataset \u0026 Preparation](#dataset--preparation)\n    * [Train \u0026 Evaluation](#train--evaluation)\n    * [Trained Model](#trained-model)\n* [Citation](#citation)\n\n## About Dataset\nThe dataset split is as follows:\n\n| Split | #imgs | #buildings | #universities|\n| --------   | -----  | ----| ----|\n|Training | 50,218 | 701 | 33 |\n| Query_drone | 37,855 | 701 |  39 |\n| Query_satellite | 701 | 701 | 39|\n| Query_ground | 2,579 | 701 | 39|\n| Gallery_drone | 51,355 | 951 | 39|\n| Gallery_satellite |  951 | 951 | 39|\n| Gallery_ground | 2,921 | 793  | 39|\n\nMore detailed file structure:\n```\n├── University-1652/\n│   ├── readme.txt\n│   ├── train/\n│       ├── drone/                   /* drone-view training images \n│           ├── 0001\n│           ├── 0002\n│           ...\n│       ├── street/                  /* street-view training images \n│       ├── satellite/               /* satellite-view training images       \n│       ├── google/                  /* noisy street-view training images (collected from 
Google Image)\n│   ├── test/\n│       ├── query_drone/  \n│       ├── gallery_drone/  \n│       ├── query_street/  \n│       ├── gallery_street/ \n│       ├── query_satellite/  \n│       ├── gallery_satellite/ \n│       ├── 4K_drone/\n```\n\nNote that there is no overlap between the 33 universities of the training set and the 39 universities of the test set.\n\n## News\n\n**2 Jul 2024** Text-guided Geo-localization is accepted by ECCV 2024 (https://arxiv.org/pdf/2311.12751).\n\n**26 Jan 2023** The 1652 Building Name List is at [Here](https://github.com/layumi/University1652-Baseline/blob/master/new_name_list.txt).\n\n**10 Jul 2022** Rainy? Night? Foggy? Snow? You may check our new paper \"Multiple-environment Self-adaptive Network for Aerial-view Geo-localization\" at https://github.com/wtyhub/MuseNet (accepted by Pattern Recognition'24)  \n\n**1 Dec 2021** Fixed the issue caused by the latest torchvision, which does not allow empty subfolders. Note that some buildings do not have Google images.  \n\n**3 March 2021** [GeM Pooling](https://cmp.felk.cvut.cz/~radenfil/publications/Radenovic-arXiv17a.pdf) is added. You may enable it with `--pool gem`.\n\n**21 January 2021** The GPU-Re-Ranking, a GNN-based real-time post-processing code, is at [Here](GPU-Re-Ranking/).\n\n**21 August 2020** The transfer learning code for Oxford and Paris is at [Here](https://github.com/layumi/cnnimageretrieval-pytorch/blob/master/cirtorch/examples/test_My1652model.py).\n\n**27 July 2020** The metadata of the 1652 buildings, such as latitude and longitude, is now available at [Google Drive](https://drive.google.com/file/d/1PL8fVky9KZg7XESsuS5NCsYRyYAwui3S/view?usp=sharing). (You could use Google Earth Pro to open the kml file or use vim to check the value).  \nWe also provide the spiral flight tour file at [Google Drive](https://drive.google.com/file/d/1EW5Esi72tPcfL3zmoHYpufKj_SXrY-xE/view?usp=sharing). (You could open the kml file via Google Earth Pro to enable the flight camera).  
\n\n**26 July 2020** The paper is accepted by ACM Multimedia 2020.\n\n**12 July 2020** I made the triplet loss baseline (with soft margin) on University-1652 publicly available at [Here](https://github.com/layumi/University1652-triplet-loss).\n\n**12 March 2020** I added the [state-of-the-art](https://github.com/layumi/University1652-Baseline/tree/master/State-of-the-art) page for geo-localization and the [tutorial](https://github.com/layumi/University1652-Baseline/tree/master/tutorial), which will be updated soon.\n\n## Code Features\nWe now support:\n- Float16 to save GPU memory based on [apex](https://github.com/NVIDIA/apex)\n- Multiple Query Evaluation\n- Re-Ranking\n- Random Erasing\n- ResNet/VGG-16\n- Visualize Training Curves\n- Visualize Ranking Result\n- Linear Warm-up \n\n## Prerequisites\n\n- Python 3.6+\n- GPU Memory \u003e= 8G\n- Numpy \u003e 1.12.1\n- Pytorch 0.3+ \n- [Optional] apex (for float16) \n\n## Getting started\n### Installation\n- Install Pytorch from http://pytorch.org/\n- Install required packages\n```bash\npip install -r requirement.txt\n```\n- [Optional] You may skip this step. Install apex from the source\n```bash\ngit clone https://github.com/NVIDIA/apex.git\ncd apex\npython setup.py install --cuda_ext --cpp_ext\n```\n- [Optional] Torchvision usually comes with Pytorch. Otherwise, install it from the source (please check its README), or install it directly via anaconda.\n```bash\ngit clone https://github.com/pytorch/vision # Please check the version to match Pytorch.\ncd vision\npython setup.py install\n```\n\n## Dataset \u0026 Preparation\nDownload [University-1652] upon request. You may use the request [template](https://github.com/layumi/University1652-Baseline/blob/master/Request.md).\n\nOr download [CVUSA](http://cs.uky.edu/~jacobs/datasets/cvusa/) / [CVACT](https://github.com/Liumouliu/OriCNN). \n\nFor CVUSA, I follow the training/test split in (https://github.com/Liumouliu/OriCNN). 
\n\n## Train \u0026 Evaluation \n### Train \u0026 Evaluation University-1652\n```\npython train.py --name three_view_long_share_d0.75_256_s1_google  --extra --views 3  --droprate 0.75  --share  --stride 1 --h 256  --w 256 --fp16; \npython test.py --name three_view_long_share_d0.75_256_s1_google\n```\n\nDefault setting: Drone -\u003e Satellite\nIf you want to try other evaluation settings, you may change these lines: https://github.com/layumi/University1652-Baseline/blob/master/test.py#L217-L225 \n\n### Ablation Study: Satellite \u0026 Drone Only\n```\npython train_no_street.py --name two_view_long_no_street_share_d0.75_256_s1  --share --views 3  --droprate 0.75  --stride 1 --h 256  --w 256  --fp16; \npython test.py --name two_view_long_no_street_share_d0.75_256_s1\n```\nThis keeps three views but sets the loss weight on street images to zero.\n\n### Train \u0026 Evaluation CVUSA\n```\npython prepare_cvusa.py\npython train_cvusa.py --name usa_vgg_noshare_warm5_lr2 --warm 5 --lr 0.02 --use_vgg16 --h 256 --w 256  --fp16 --batchsize 16;\npython test_cvusa.py  --name usa_vgg_noshare_warm5_lr2 \n```\n\n### Show the retrieved Top-10 result \n```\npython test.py --name three_view_long_share_d0.75_256_s1_google # after test\npython demo.py --query_index 0 # which image you want to query in the query set \n```\nIt will save an image named `show.png` containing the top-10 retrieval results in the folder. \n\n## Trained Model\n\nYou can download the trained model at [GoogleDrive](https://drive.google.com/open?id=1iES210erZWXptIttY5EBouqgcF5JOBYO) or [OneDrive](https://studentutsedu-my.sharepoint.com/:u:/g/personal/12639605_student_uts_edu_au/EW19pLps66RCuJcMAOtWg5kB6Ux_O-9YKjyg5hP24-yWVQ?e=BZXcdM). After downloading, please put the model folders under `./model/`.\n\n## Citation\nThe following paper uses and reports the result of the baseline model. 
You may cite it in your paper.\n```bibtex\n@article{zheng2020university,\n  title={University-1652: A Multi-view Multi-source Benchmark for Drone-based Geo-localization},\n  author={Zheng, Zhedong and Wei, Yunchao and Yang, Yi},\n  journal={ACM Multimedia},\n  year={2020}\n}\n@inproceedings{zheng2023uavm,\n  title={UAVM'23: 2023 Workshop on UAVs in Multimedia: Capturing the World from a New Perspective},\n  author={Zheng, Zhedong and Shi, Yujiao and Wang, Tingyu and Liu, Jun and Fang, Jianwu and Wei, Yunchao and Chua, Tat-seng},\n  booktitle={Proceedings of the 31st ACM International Conference on Multimedia},\n  pages={9715--9717},\n  year={2023}\n}\n```\nInstance loss is defined in \n```bibtex\n@article{zheng2017dual,\n  title={Dual-Path Convolutional Image-Text Embeddings with Instance Loss},\n  author={Zheng, Zhedong and Zheng, Liang and Garrett, Michael and Yang, Yi and Xu, Mingliang and Shen, Yi-Dong},\n  journal={ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM)},\n  doi={10.1145/3383184},\n  volume={16},\n  number={2},\n  pages={1--23},\n  year={2020},\n  publisher={ACM New York, NY, USA}\n}\n```\n## Related Work\n- Instance Loss [Code](https://github.com/layumi/Image-Text-Embedding)\n- Person re-ID from Different Viewpoints [Code](https://github.com/layumi/Person_reID_baseline_pytorch)\n- Lending Orientation to Neural Networks for Cross-view Geo-localization [Code](https://github.com/Liumouliu/OriCNN)\n- Predicting Ground-Level Scene Layout from Aerial Imagery [Code](https://github.com/viibridges/crossnet)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flayumi%2FUniversity1652-Baseline","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flayumi%2FUniversity1652-Baseline","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flayumi%2FUniversity1652-Baseline/lists"}