{"id":19611931,"url":"https://github.com/princeton-nlp/datamux","last_synced_at":"2026-03-06T03:32:17.107Z","repository":{"id":54677088,"uuid":"458319066","full_name":"princeton-nlp/DataMUX","owner":"princeton-nlp","description":"[NeurIPS 2022] DataMUX: Data Multiplexing for Neural Networks","archived":false,"fork":false,"pushed_at":"2022-11-24T07:47:39.000Z","size":3117,"stargazers_count":60,"open_issues_count":0,"forks_count":9,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-04-27T22:33:46.664Z","etag":null,"topics":["deep-learning","pytorch"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/princeton-nlp.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2022-02-11T19:42:01.000Z","updated_at":"2024-12-18T04:34:31.000Z","dependencies_parsed_at":"2023-01-22T07:00:49.506Z","dependency_job_id":null,"html_url":"https://github.com/princeton-nlp/DataMUX","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/princeton-nlp/DataMUX","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/princeton-nlp%2FDataMUX","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/princeton-nlp%2FDataMUX/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/princeton-nlp%2FDataMUX/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/princeton-nlp%2FDataMUX/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/princeton-nlp","download_url":"https://codeload.github.com/princeto
n-nlp/DataMUX/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/princeton-nlp%2FDataMUX/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30160873,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T22:39:40.138Z","status":"online","status_checked_at":"2026-03-06T02:00:08.268Z","response_time":250,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["deep-learning","pytorch"],"created_at":"2024-11-11T10:45:07.388Z","updated_at":"2026-03-06T03:32:17.071Z","avatar_url":"https://github.com/princeton-nlp.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"## DataMUX ##\n\nPyTorch implementation for the paper:\n\n**[DataMUX: Data Multiplexing for Neural Networks](https://princeton-nlp.github.io/DataMUX/)**  \n[Vishvak Murahari](https://vishvakmurahari.com/), [Carlos E. Jimenez](https://www.carlosejimenez.com/), [Runzhe Yang](https://runzhe-yang.science/), [Karthik Narasimhan](https://www.cs.princeton.edu/~karthikn/)\n\n![models](images/multiplexing.gif)\n\nThis repository contains code for reproducing results. We provide pretrained model weights and associated configs to run inference or train these models from scratch. 
If you find this work useful in your research, please cite:\n\n```\n@inproceedings{\nmurahari2022datamux,\ntitle={Data{MUX}: Data Multiplexing for Neural Networks},\nauthor={Vishvak Murahari and Carlos E Jimenez and Runzhe Yang and Karthik R Narasimhan},\nbooktitle={Thirty-Sixth Conference on Neural Information Processing Systems},\nyear={2022},\nurl={https://openreview.net/forum?id=UdgtTVTdswg}\n}\n```\n\n### Table of Contents\n\n   * [Setup and Dependencies](#setup-and-dependencies)\n   * [Usage](#usage)\n      * [Overview](#Overview)\n      * [Pre-trained checkpoints](#pre-trained-checkpoints)\n      * [Training settings](#settings)\n      * [Vision Tasks](#vision)\n   * [Reference](#reference)\n   * [License](#license)\n\n### Setup and Dependencies\n\nOur code is implemented in PyTorch. To set up, do the following:\n\n1. Install [Python 3.6](https://www.python.org/downloads/release/python-365/)\n2. Get the source:\n```\ngit clone https://github.com/princeton-nlp/DataMUX.git datamux\n```\n3. Install requirements into the `datamux` virtual environment, using [Anaconda](https://anaconda.org/anaconda/python):\n```\nconda env create -f env.yaml\n```\n\n### Usage\n\n#### Overview\nFor sentence-level classification tasks, refer to `run_glue.py` and `run_glue.sh`. For token-level classification tasks, refer to `run_ner.py` and `run_ner.sh`.\n#### Pre-trained checkpoints\nWe release all the pretrained checkpoints on the Hugging Face [model hub](https://huggingface.co/princeton-nlp). We list the checkpoints below. 
For number of instances, use 2, 5, 10, 20 or 40.\n\n| Task            | Model name on hub | Full path |\n| ----------------|:-------------------|---------:\n| Retrieval Warmup| datamux-retrieval-\u003cnum_instances\u003e | princeton-nlp/datamux-retrieval-\u003cnum_instances\u003e|\n| MNLI            | datamux-mnli-\u003cnum_instances\u003e      | princeton-nlp/datamux-mnli-\u003cnum_instances\u003e|\n| QNLI            | datamux-qnli-\u003cnum_instances\u003e      | princeton-nlp/datamux-qnli-\u003cnum_instances\u003e|\n| QQP             | datamux-qqp-\u003cnum_instances\u003e       | princeton-nlp/datamux-qqp-\u003cnum_instances\u003e|\n| SST2            | datamux-sst2-\u003cnum_instances\u003e      | princeton-nlp/datamux-sst2-\u003cnum_instances\u003e|\n| NER             | datamux-ner-\u003cnum_instances\u003e      | princeton-nlp/datamux-ner-\u003cnum_instances\u003e|\n\n#### Settings\nThe bash scripts `run_ner.sh` and `run_glue.sh` take the following arguments:\n\n\n| Argument      | Flag | Explanation                  |Argument Choices |\n| ------------- |:-----|-----------------------------:|-----------------|\n| NUM_INSTANCES | -N --num_instances | Number of multiplexing instances | 2,5,10,20,40 |\n| DEMUXING      | -d --demuxing      | Demultiplexing architecture| \"index\", \"mlp\" \n| MUXING        | -m --muxing        | Multiplexing architecture | \"gaussian_hadamard\", \"binary_hadamard\", \"random_ortho\"|\n| SETTING       | -s --setting       | Training setting | \"baseline\", \"finetuning\", \"retrieval_pretraining\"|\n| TASK_NAME     | --task             | Task name during finetuning | \"mnli\", \"qnli\", \"sst2\", \"qqp\" for `run_glue.py` or \"ner\" for `run_ner.py` \n| LEARNING_RATE | --lr               | Learning rate for optimization| Any float but we use either 2e-5 or 5e-5|\n| BATCH_SIZE    | --batch_size       | Batch size (after multiplexing); note that the *effective* batch size is BATCH_SIZE * NUM_INSTANCES | Any integer. 
If left unset, it is set automatically based on the value of N|\n| CONFIG_NAME   | --config_name      | Config path for backbone Transformer Model| Any config file in `configs` directory|\n| MODEL_PATH    | --model_path       | Model path, used either to continue training from a checkpoint or to initialize from a retrieval-pretrained checkpoint| Path to local checkpoint or path to model on the [hub](https://huggingface.co/princeton-nlp)|\n| LEARN_MUXING  | --learn_muxing | Whether to learn instance embeddings in multiplexing| |\n| DO_TRAIN      | --do_train | Pass flag to do training | |\n| DO_EVAL       | --do_eval  | Pass flag to do eval | |\n\nBelow we list example commands for different training settings:\n\n#### Retrieval pretraining\nThis command runs retrieval pretraining for N=2:\n```\nsh run_glue.sh \\\n   -N 2 \\\n   -d index \\\n   -m gaussian_hadamard \\\n   -s retrieval_pretraining \\\n   --config_name configs/ablations/base_model/roberta.json \\\n   --lr 5e-5 \\\n   --do_train \\\n   --do_eval\n```\n\n#### Finetuning\nThis command finetunes from a retrieval-pretrained checkpoint with N=2:\n```\nsh run_glue.sh \\\n   -N 2 \\\n   -d index \\\n   -m gaussian_hadamard \\\n   -s finetuning \\\n   --config_name configs/ablations/base_model/roberta.json \\\n   --lr 5e-5 \\\n   --task mnli \\\n   --model_path princeton-nlp/datamux-retrieval-2 \\\n   --do_train \\\n   --do_eval\n```\n\nSimilarly, to run token-level classification tasks like NER, change `run_glue.sh` to `run_ner.sh`:\n```\nsh run_ner.sh \\\n   -N 2 \\\n   -d index \\\n   -m gaussian_hadamard \\\n   -s finetuning \\\n   --config_name configs/ablations/base_model/roberta.json \\\n   --lr 5e-5 \\\n   --task ner \\\n   --model_path princeton-nlp/datamux-retrieval-2 \\\n   --do_train \\\n   --do_eval \n```\n\n#### Baselines\nFor the non-multiplexed baselines, run the following command:\n```\nsh run_glue.sh \\\n-N 1 \\\n-s baseline \\\n--config_name configs/ablations/base_model/roberta.json \\\n--lr 2e-5 
\\\n--task mnli\n```\n\n#### Vision\nFor reproducing results on the vision tasks for MLPs and CNNs, please use this [notebook](https://github.com/princeton-nlp/DataMUX/blob/main/vision/vision_multiplexing.ipynb)\n\n### Reference\n```\n@inproceedings{\nmurahari2022datamux,\ntitle={Data{MUX}: Data Multiplexing for Neural Networks},\nauthor={Vishvak Murahari and Carlos E Jimenez and Runzhe Yang and Karthik R Narasimhan},\nbooktitle={Thirty-Sixth Conference on Neural Information Processing Systems},\nyear={2022},\nurl={https://openreview.net/forum?id=UdgtTVTdswg}\n}\n```\n### License\nCheck `LICENSE.md`\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprinceton-nlp%2Fdatamux","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fprinceton-nlp%2Fdatamux","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fprinceton-nlp%2Fdatamux/lists"}