{"id":28720126,"url":"https://github.com/bkj/tagless","last_synced_at":"2025-10-26T10:35:08.893Z","repository":{"id":53256157,"uuid":"99046626","full_name":"bkj/tagless","owner":"bkj","description":"Interface for building image classifiers, via transfer learning, active search and uncertainty sampling","archived":false,"fork":false,"pushed_at":"2023-03-08T20:10:18.000Z","size":2165,"stargazers_count":2,"open_issues_count":5,"forks_count":1,"subscribers_count":3,"default_branch":"master","last_synced_at":"2023-08-01T10:38:07.180Z","etag":null,"topics":["active-learning","active-search","human-computer-interaction","labeling-tool","python","rest-api"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bkj.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-08-01T21:59:33.000Z","updated_at":"2023-08-01T10:38:07.180Z","dependencies_parsed_at":"2023-01-21T17:15:12.448Z","dependency_job_id":null,"html_url":"https://github.com/bkj/tagless","commit_stats":null,"previous_names":[],"tags_count":0,"template":null,"template_full_name":null,"purl":"pkg:github/bkj/tagless","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkj%2Ftagless","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkj%2Ftagless/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkj%2Ftagless/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkj%2Ftagless/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bkj","download_url":"https://codeload.github.com/bkj/tagless/tar.gz/
refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bkj%2Ftagless/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":259929997,"owners_count":22933537,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning","active-search","human-computer-interaction","labeling-tool","python","rest-api"],"created_at":"2025-06-15T06:07:05.395Z","updated_at":"2025-10-26T10:35:08.798Z","avatar_url":"https://github.com/bkj.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"## tagless\n\nTagging interface w/ transfer learning, linearized active search and uncertainty sampling:\n\n|                   |                              | \n| ----------------- | ---------------------------- |\n| Transfer learning | https://github.com/bkj/tdesc |\n| Linearized active search (LAS) |  https://github.com/bkj/simple_las | \n| Uncertainty sampling | https://github.com/bkj/libact | \n\nUnder active development -- some things are broken or don't have sensible APIs exposed.\n\n### Usage\n\n```\n\n    cd $TARGET_DIR\n    mkdir -p ./{data,results}\n    # expect set of images to be in `imgs` directory\n    \n    # Featurize images\n    find ./imgs/ -type f | python -m tdesc --model vgg16 --crow \u003e .crow\n    \n    # Prep + reformat images\n    python $TAGLESS_ROOT/tagless/prep.py --inpath .crow ./data/crow\n    \n    # Run server\n    python -m tagless --outpath ./results/my-labels --crow ./data/crow\n    \n    # Connect to localhost:5000 + start tagging\n```\n\n### 
Notes\n\nUncertainty sampling computes the score for each unlabeled image at each iteration.  ATM we're using a linear SVM, so the runtime of this step increases linearly w/ the size of the corpus.  On my machine, predicting on ~350K images takes ~2.5s, which is unacceptably slow.  Thus, for big corpora, we may want to fall back to some kind of approximate matrix-vector product. That'll take a little bit of thought.  For now I'll recommend running on a subset of the data.\n\n__Idea__: Feature vectors are normalized relus -- so norm=1 and all positive entries.  Could maybe do uncertainty sampling via `faiss` by using vector orthogonal to SVM feature vector and take the largest/smallest entries.  Have to check my work on that one though.\n\n### Dependencies\n\nThis has been tested on Ubuntu 16.04 w/ Python 2.7 (via Anaconda)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbkj%2Ftagless","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbkj%2Ftagless","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbkj%2Ftagless/lists"}