{"id":20068290,"url":"https://github.com/pliang279/sparse_discrete","last_synced_at":"2025-10-14T11:13:11.993Z","repository":{"id":77809135,"uuid":"260585594","full_name":"pliang279/sparse_discrete","owner":"pliang279","description":"[ICLR 2021] Anchor \u0026 Transform: Learning Sparse Embeddings for Large Vocabularies","archived":false,"fork":false,"pushed_at":"2021-05-04T07:20:41.000Z","size":6117,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-08-11T02:36:26.506Z","etag":null,"topics":["deep-learning","efficient-neural-networks","machine-learning","natural-language-processing","recommender-system","sparse-representations"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/pliang279.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-05-02T00:31:04.000Z","updated_at":"2022-03-18T14:34:59.000Z","dependencies_parsed_at":"2023-03-12T02:09:19.837Z","dependency_job_id":null,"html_url":"https://github.com/pliang279/sparse_discrete","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/pliang279/sparse_discrete","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsparse_discrete","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsparse_discrete/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsparse_discrete/releases","manifests_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsparse_discrete/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/pliang279","download_url":"https://codeload.github.com/pliang279/sparse_discrete/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/pliang279%2Fsparse_discrete/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279019061,"owners_count":26086512,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-14T02:00:06.444Z","response_time":60,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","efficient-neural-networks","machine-learning","natural-language-processing","recommender-system","sparse-representations"],"created_at":"2024-11-13T14:06:00.741Z","updated_at":"2025-10-14T11:13:11.972Z","avatar_url":"https://github.com/pliang279.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Anchor \u0026 Transform: Learning Sparse Embeddings for Large Vocabularies\n\n\u003e Pytorch implementation for Anchor \u0026 Transform: Learning Sparse Embeddings for Large Vocabularies\n\nCorrespondence to: \n  - Paul Liang (pliang@cs.cmu.edu)\n  - Manzil Zaheer (manzilzaheer@google.com)\n\n## Paper\n\n[**Anchor \u0026 Transform: Learning Sparse Embeddings for Large 
Vocabularies**](https://arxiv.org/abs/2003.08197)\u003cbr\u003e\n[Paul Pu Liang](http://www.cs.cmu.edu/~pliang/), [Manzil Zaheer](http://www.manzil.ml/), [Yuan Wang](https://ai.google/research/people/YuanWang), [Amr Ahmed](https://ai.google/research/people/AmrAhmed)\u003cbr\u003e\nICLR 2021\n\nIf you find this repository useful, please cite our paper:\n```\n@inproceedings{liang2021anchor,\n  author    = {Paul Pu Liang and\n               Manzil Zaheer and\n               Yuan Wang and\n               Amr Ahmed},\n  title     = {Anchor \u0026 Transform: Learning Sparse Embeddings for Large Vocabularies},\n  booktitle = {9th International Conference on Learning Representations, {ICLR} 2021},\n  publisher = {OpenReview.net},\n  year      = {2021},\n  url       = {https://openreview.net/forum?id=Vd7lCMvtLqg}\n}\n```\n\n## Installation\n\nFirst check that the requirements are satisfied:\u003cbr\u003e\nPython 3.6\u003cbr\u003e\ntorch 1.2.0\u003cbr\u003e\nnumpy 1.18.1\u003cbr\u003e\nmatplotlib 3.1.2\u003cbr\u003e\ntqdm 4.45.0\u003cbr\u003e\n\nThe next step is to clone the repository:\n```bash\ngit clone https://github.com/pliang279/sparse_discrete.git\n```\n\n## Data\n\n### Movielens data\n\nDownload the Movielens 25m data from http://files.grouplens.org/datasets/movielens/ml-25m.zip and unzip it into a folder ml-25m/\n\nDownload the Movielens 1m data from http://files.grouplens.org/datasets/movielens/ml-1m.zip and unzip it into a folder ml-1m/\n\nRun ```python3 movielens_data.py```, which extracts the .dat files in ml-1m/ and generates ml-1m/ml1m_ratings.csv\n\nAt this point, make sure you have the files ```ml-25m/ratings.csv``` and ```ml-1m/ml1m_ratings.csv```\n\n### Amazon review data\n\nDownload the Amazon data from http://deepyeti.ucsd.edu/jianmo/amazon/categoryFilesSmall/all_csv_files.csv into a folder called amazon_data/\n\nRun ```python3 movielens_data.py```, which parses the .csv files in amazon_data/ and generates the file ```amazon_data/saved_amazon_data_filtered5.h5```\n\n## 
Instructions\n\n### Movielens data\n\nMF baseline: ```python3 movielens.py --model_path MF --latent_dim 16 --dataset 25m```\n\nMixDim embeddings: ```python3 movielens.py --model_path mdMF --base_dim 16 --temperature 0.4 --k 8 --dataset 25m```\n\nANT: ```python3 movielens.py --model_path sparseMF --latent_dim 16 --user_anchors 10 --item_anchors 15 --lda2s 2e-6 --lda2e 2e-6 --dataset 25m```\n\nNBANT: ```python3 movielens.py --model_path sparseMF --latent_dim 16 --lda1 0.1 --lda2s 2e-6 --lda2e 2e-6 --dataset 25m --dynamic```\n\n### Amazon review data\n\nMF: ```python3 amazon.py --model_path MF --latent_dim 16 --dataset amazon```\n\nANT: ```python3 amazon.py --model_path sparseMF --latent_dim 16 --user_anchors 8 --item_anchors 8 --lda2s 1e-7 --lda2e 1e-7 --dataset amazon```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpliang279%2Fsparse_discrete","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpliang279%2Fsparse_discrete","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpliang279%2Fsparse_discrete/lists"}