{"id":23335863,"url":"https://github.com/shuhao02/RouterDC","last_synced_at":"2025-08-23T04:32:53.636Z","repository":{"id":264132631,"uuid":"844418917","full_name":"shuhao02/RouterDC","owner":"shuhao02","description":"The code of RouterDC","archived":false,"fork":false,"pushed_at":"2024-12-02T14:21:14.000Z","size":11971,"stargazers_count":36,"open_issues_count":1,"forks_count":2,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-02T15:29:24.983Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/shuhao02.png","metadata":{"files":{"readme":"readme.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-08-19T08:11:45.000Z","updated_at":"2024-12-02T14:21:19.000Z","dependencies_parsed_at":"2024-11-22T07:40:54.834Z","dependency_job_id":null,"html_url":"https://github.com/shuhao02/RouterDC","commit_stats":null,"previous_names":["shuhao02/routerdc"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shuhao02%2FRouterDC","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shuhao02%2FRouterDC/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shuhao02%2FRouterDC/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/shuhao02%2FRouterDC/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/shuhao02","download_url":"https://codeload.github.com/shuhao02/RouterDC/tar.gz/refs/heads/main","host":{"nam
e":"GitHub","url":"https://github.com","kind":"github","repositories_count":230665554,"owners_count":18261516,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-21T02:01:33.528Z","updated_at":"2024-12-21T02:01:37.359Z","avatar_url":"https://github.com/shuhao02.png","language":"Jupyter Notebook","funding_links":[],"categories":["A01_文本生成_文本对话","2.1 Ensemble Before Inference"],"sub_categories":["大语言对话模型及数据","2.1.1 (a,1) Pre-Trained Router"],"readme":"# (NeurIPS 2024) RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models\n\nShuhao Chen, Weisen Jiang, Baijiong Lin, James T. Kwok, and Yu Zhang\n\n---\n\nOfficial Implementation of NeurIPS 2024 paper \"[RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models](https://arxiv.org/abs/2409.19886)\".\n\n# Quick Start\n\n## Datasets\nWe have provided the necessary training datasets in the [datasets](./datasets) folder.\n\nTo create your own training datasets from scratch, follow these steps:\n\n- **Evaluate LLM Outputs:** Use [EleutherAI/lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) and [bigcode-evaluation-harness\n](https://github.com/bigcode-project/bigcode-evaluation-harness?tab=readme-ov-file#features) to evaluate each language model (LLM). 
To log the output of each sample, we slightly modified the [bigcode-evaluation-harness](https://github.com/bigcode-project/bigcode-evaluation-harness?tab=readme-ov-file#features) as mentioned in this [issue](https://github.com/bigcode-project/bigcode-evaluation-harness/issues/215#issuecomment-2044445209). The commands to generate the answers for each dataset subset can be found in the [eval_scripts](./eval_scripts) folder.\n- **Prepare the Dataset:** Allocate the scores for each LLM, then merge the scores with the queries to create the training and testing datasets. Detailed instructions can be found in [convert_dataset_7_model.ipynb](convert_dataset_7_model.ipynb).\n- **Assign Cluster IDs:** Assign cluster IDs to the training dataset by following the process outlined in [cluster_generate.ipynb](src/cluster_generate.ipynb).\n\n## Training\nRefer to the [train_scripts](train_scripts) folder for detailed training instructions.\n\n## Testing\nDuring training, the model is evaluated automatically at predefined evaluation steps.\nYou can also manually evaluate a specific checkpoint using [evaluation_router.py](evaluation_router.py).\n\n## Citation\nIf you find RouterDC useful for your research and applications, please cite it using this BibTeX entry:\n\n```\n@inproceedings{chen2024RouterDC,\n  title={{RouterDC}: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models},\n  author={Shuhao Chen and Weisen Jiang and Baijiong Lin and James T. Kwok and Yu Zhang},\n  booktitle={Neural Information Processing Systems},\n  year={2024}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshuhao02%2FRouterDC","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fshuhao02%2FRouterDC","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fshuhao02%2FRouterDC/lists"}