{"id":18852596,"url":"https://github.com/quva-lab/lang-tracker","last_synced_at":"2025-04-14T10:11:28.799Z","repository":{"id":69447564,"uuid":"102493533","full_name":"QUVA-Lab/lang-tracker","owner":"QUVA-Lab","description":null,"archived":false,"fork":false,"pushed_at":"2021-03-30T08:34:29.000Z","size":34240,"stargazers_count":38,"open_issues_count":6,"forks_count":9,"subscribers_count":11,"default_branch":"master","last_synced_at":"2025-03-27T23:23:50.996Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/QUVA-Lab.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2017-09-05T14:44:26.000Z","updated_at":"2025-02-15T13:31:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"8e1153f0-c36b-4fb7-8415-c53400e7a732","html_url":"https://github.com/QUVA-Lab/lang-tracker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QUVA-Lab%2Flang-tracker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QUVA-Lab%2Flang-tracker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QUVA-Lab%2Flang-tracker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/QUVA-Lab%2Flang-tracker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/QUVA-Lab","download_url":"https://codeload.github.com/QUVA-Lab/lang-tracker/tar.gz/refs/heads/master","ho
st":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248860209,"owners_count":21173342,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-08T03:40:41.475Z","updated_at":"2025-04-14T10:11:28.787Z","avatar_url":"https://github.com/QUVA-Lab.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tracking by Natural Language Specification\n![Image](http://isis-data.science.uva.nl/zhenyang/cvpr17-langtracker/images/model.png)\n\nThis repository contains the code for the following paper:\n\n* Z. Li, R. Tao, E. Gavves, C. G. M. Snoek, A. W. M. Smeulders, *Tracking by Natural Language Specification*, in Computer Vision and Pattern Recognition (CVPR), 2017 ([PDF](http://openaccess.thecvf.com/content_cvpr_2017/papers/Li_Tracking_by_Natural_CVPR_2017_paper.pdf))\n```\n@article{li2017cvpr,\n  title={Tracking by Natural Language Specification},\n  author={Li, Zhenyang and Tao, Ran and Gavves, Efstratios and Snoek, Cees G. M. and Smeulders, Arnold W. 
M.},\n  journal={Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition},\n  year={2017}\n}\n```\n\n# Download Dataset\n### Lingual OTB99 [Sentences](http://isis-data.science.uva.nl/zhenyang/cvpr17-langtracker/data/OTB_sentences.zip)\n\n### Lingual ImageNet [Sentences](http://isis-data.science.uva.nl/zhenyang/cvpr17-langtracker/data/ImageNet_sentences.zip)\n\nPlease note that we use all the frames from the original [OTB100](http://cvlab.hanyang.ac.kr/tracker_benchmark/datasets.html) dataset in our OTB99 videos, while for [ImageNet](http://vision.cs.unc.edu/ilsvrc2015/ui/vid) videos we may select only a subsequence (see the start/end frames we selected for each video in `train.txt` or `test.txt`).\n\n# How to use the demo code\n\n## Download and set up Caffe (our own branch)\n\n1. Caffe branch [here](https://github.com/zhenyangli/caffe-lang-track/tree/langtrackV3) (Note: use the `langtrackV3` branch, not the `master` branch)\n2. Compile Caffe with the option \n```\nWITH_PYTHON_LAYER = 1\n```\n\n## Download pre-trained models\n\n1. Download the natural language segmentation model [caffemodel](http://isis-data.science.uva.nl/zhenyang/cvpr17-langtracker/code/pretrain-models/snapshots/lang_high_res_seg/_iter_25000.caffemodel)\nand copy it to `MAIN_PATH/snapshots/lang_high_res_seg/_iter_25000.caffemodel`\n\n2. Download the tracking model [caffemodel](http://isis-data.science.uva.nl/zhenyang/cvpr17-langtracker/code/pretrain-models/VGG16.v2.caffemodel)\nand copy it to `MAIN_PATH/VGG16.v2.caffemodel`\n\n## Run demo code\n\n### ipython notebook code\n\nHere we first demonstrate how Model II in the paper works with example videos:\n\n1. Given an image and a natural language query, how to identify a target (applied on the first query frame of a video only)\n```\ndemo/lang_seg_demo.ipynb\n```\n\n2. 
Given a visual target (a box identified from step 1) and a sequence of frames, how to track the object in all the frames\n```\ndemo/lang_track_demo.ipynb\n```\n\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquva-lab%2Flang-tracker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fquva-lab%2Flang-tracker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fquva-lab%2Flang-tracker/lists"}