{"id":15441434,"url":"https://github.com/stsievert/salmon","last_synced_at":"2025-09-02T04:38:13.831Z","repository":{"id":38069329,"uuid":"221306375","full_name":"stsievert/salmon","owner":"stsievert","description":"A tool to collect triplet queries","archived":false,"fork":false,"pushed_at":"2024-06-07T03:48:32.000Z","size":124453,"stargazers_count":8,"open_issues_count":0,"forks_count":2,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-04-11T11:50:13.307Z","etag":null,"topics":["active-learning","crowdsourcing","embedding","machine-learning","triplet-loss","triplets"],"latest_commit_sha":null,"homepage":"https://docs.stsievert.com/salmon/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"bsd-3-clause","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/stsievert.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":"CONTRIBUTING.md","funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-11-12T20:32:16.000Z","updated_at":"2025-04-01T08:47:13.000Z","dependencies_parsed_at":"2024-05-29T20:48:55.851Z","dependency_job_id":"66d80100-4028-4559-8380-69bb6b87c701","html_url":"https://github.com/stsievert/salmon","commit_stats":null,"previous_names":[],"tags_count":87,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stsievert%2Fsalmon","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stsievert%2Fsalmon/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stsievert%2Fsalmon/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/stsievert%2Fsalmon/manife
sts","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/stsievert","download_url":"https://codeload.github.com/stsievert/salmon/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248457268,"owners_count":21106883,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["active-learning","crowdsourcing","embedding","machine-learning","triplet-loss","triplets"],"created_at":"2024-10-01T19:20:30.646Z","updated_at":"2025-04-16T12:30:44.290Z","avatar_url":"https://github.com/stsievert.png","language":"Python","readme":"## Salmon\n\u003ca style=\"border-width:0\" href=\"https://doi.org/10.21105/joss.04517\"\u003e\n  \u003cimg src=\"https://joss.theoj.org/papers/10.21105/joss.04517/status.svg\" alt=\"DOI badge\" \u003e\n\u003c/a\u003e\n\u003cbr\u003e\n\n\u003ca href=\"https://github.com/stsievert/salmon/actions/workflows/test.yml\"\u003e\n \u003cimg src=\"https://github.com/stsievert/salmon/actions/workflows/test.yml/badge.svg?branch=master\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/stsievert/salmon/actions/workflows/test_pip.yml\"\u003e\n \u003cimg src=\"https://github.com/stsievert/salmon/actions/workflows/test_pip.yml/badge.svg?branch=master\" /\u003e\n\u003c/a\u003e\n\u003ca href=\"https://github.com/stsievert/salmon/actions/workflows/test_offline.yml\"\u003e\n  \u003cimg src=\"https://github.com/stsievert/salmon/actions/workflows/test_offline.yml/badge.svg?branch=master\" /\u003e\n\u003c/a\u003e\n\nSalmon is a tool for efficiently generating ordinal embeddings. 
It relies on\n\"active\" machine learning algorithms to choose the most informative queries for\nhumans to answer.\n\n### Documentation\n\nThe documentation is available at these locations:\n\n- **Primary source**: https://docs.stsievert.com/salmon/\n- Secondary source: as [a raw PDF][pdf] (and as a [slower loading PDF][blobpdf]).\n- Secondary source: as [zipped HTML directory][ziphtml], which requires unzipping the directory\n  then opening `index.html`.\n\n[pdf]:https://github.com/stsievert/salmon/raw/gh-pages/salmon.pdf\n[blobpdf]:https://github.com/stsievert/salmon/blob/gh-pages/salmon.pdf\n[ziphtml]:https://github.com/stsievert/salmon/archive/refs/heads/gh-pages.zip\n\nPlease [file an issue][issue] if you cannot access the documentation.\n\n[issue]:https://github.com/stsievert/salmon/issues/new\n\n### Running Salmon offline\nVisit the documentation at https://docs.stsievert.com/salmon/offline.html.\nBriefly, this should work:\n\n``` shell\n$ cd path/to/salmon\n$ conda env create -f salmon.lock.yml\n$ conda activate salmon\n(salmon) $ pip install -e .\n```\n\nThe online documentation explains how to generate an embedding\noffline: https://docs.stsievert.com/salmon/offline.html#generate-embeddings\n\nWith this, it's also possible to create a script that imports and uses Salmon:\n\n``` python\nfrom salmon.triplets.samplers import TSTE\nimport numpy as np\n\n# Number of items (n) and embedding dimension (d)\nn, d = 85, 2\nsampler = TSTE(n=n, d=d)\n\n# Initial embedding: one row per item, shape (n, d)\nem_init = np.array([[i, -i] for i in range(n)])\nsampler.opt.initialize(embedding=em_init)\n\nqueries, scores, meta = sampler.get_queries(num=10_000)\n```\n\nThis script allows the data scientist to score queries for an embedding 
they\nspecify.\n\n[semver]:https://semver.org\n","funding_links":[],"categories":[],"sub_categories":[],"project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstsievert%2Fsalmon","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fstsievert%2Fsalmon","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fstsievert%2Fsalmon/lists"}