{"id":13754222,"url":"https://github.com/Alibaba-NLP/SeqGPT","last_synced_at":"2025-05-09T22:31:30.698Z","repository":{"id":190148469,"uuid":"681001697","full_name":"Alibaba-NLP/SeqGPT","owner":"Alibaba-NLP","description":"SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding","archived":false,"fork":false,"pushed_at":"2023-12-13T13:37:21.000Z","size":732,"stargazers_count":215,"open_issues_count":3,"forks_count":10,"subscribers_count":4,"default_branch":"main","last_synced_at":"2024-11-16T07:33:13.150Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Alibaba-NLP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-21T03:46:42.000Z","updated_at":"2024-11-12T08:59:05.000Z","dependencies_parsed_at":"2023-12-13T14:40:37.755Z","dependency_job_id":null,"html_url":"https://github.com/Alibaba-NLP/SeqGPT","commit_stats":null,"previous_names":["alibaba-nlp/seqgpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FSeqGPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FSeqGPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FSeqGPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FSeqGPT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Alibaba-NLP","download_url":
"https://codeload.github.com/Alibaba-NLP/SeqGPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":253335686,"owners_count":21892713,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-03T09:01:50.614Z","updated_at":"2025-05-09T22:31:30.001Z","avatar_url":"https://github.com/Alibaba-NLP.png","language":"Python","funding_links":[],"categories":["A01_文本生成_文本对话"],"sub_categories":["大语言对话模型及数据"],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"assets/logo.jpg\" width=\"55%\"\u003e\n\n## An Out-of-the-box Large Language Model for Open Domain Sequence Understanding\n\n\u003cdiv\u003e\nTianyu Yu*, Chengyue Jiang*, Chao Lou*, Shen Huang*, Xiaobin Wang, Wei Liu, Jiong Cai, Yangning Li, Yinghui Li, Kewei Tu, Hai-Tao Zheng, Ningyu Zhang, Pengjun Xie, Fei Huang, Yong Jiang†\n\u003c/div\u003e\n\u003cdiv\u003e\n\u003cstrong\u003eDAMO Academy, Alibaba Group\u003c/strong\u003e\n\u003c/div\u003e\n\u003cdiv\u003e\n*Equal Contribution; † Corresponding Author\n\u003c/div\u003e\n\u003c/div\u003e\n\n\u003cdiv align=\"center\"\u003e\n\n[![license](https://img.shields.io/github/license/Alibaba-NLP/SeqGPT)](./LICENSE)\n[![paper](https://img.shields.io/badge/arXiv-2308.10529-red)](https://arxiv.org/abs/2308.10529)\n\n\u003c/div\u003e\n\n## Spotlights\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"assets/overview.jpg\" width=\"85%\"\u003e\n\u003c/div\u003e\n\u003cbr/\u003e\n\n* A bilingual model (English and Chinese) specially enhanced for open-domain NLU.\n* Trained with diverse synthesized data and 
high-quality NLU datasets.\n* Handles all NLU tasks that can be transformed into a combination of two atomic tasks: classification and extraction.\n\n## 📰  Update News\n\n`SeqGPT` is continuously updated. Online demos are available, and we plan to release new model versions with upgraded capabilities. Stay tuned!\n\n- **[2023/10/09]** 💪 We provide an [API](https://help.aliyun.com/zh/dashscope/developer-reference/opennlu-api-details) for SeqGPT-3B for users who want access to a **larger** SeqGPT model.\n- **[2023/09/20]** 🎛️ A sample script for fine-tuning on a custom dataset is available [here](https://github.com/modelscope/swift/blob/main/examples/pytorch/llm/scripts/seqgpt_560m/full/sft.sh).\n- **[2023/08/23]** 🛠️ We release the weights of SeqGPT-560M on both [Modelscope](https://www.modelscope.cn/models/damo/nlp_seqgpt-560m) and [Hugging Face](https://huggingface.co/DAMO-NLP/SeqGPT-560M). You can download the model and run inference by following the [usage example](#inference).\n- **[2023/08/23]** 🔥 We provide an online demo of SeqGPT on [Modelscope](https://www.modelscope.cn/studios/TTCoding/open_ner/summary)! Try it now!\n- **[2023/08/21]** 📑 We release the SeqGPT paper: [SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding](https://arxiv.org/abs/2308.10529). More implementation details and experimental results are presented in the paper.\n\n\n## Performance\n\nWe perform a human evaluation of SeqGPT-7B1 and ChatGPT on the held-out datasets. Ten annotators were asked to decide which model gives the better answer, or whether the two models are tied. 
SeqGPT-7B1 outperforms ChatGPT on 7 of 10 NLU tasks but lags behind on sentiment analysis (SA), slot filling (SF) and natural language inference (NLI).\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"assets/human_eval_7b_vs_chatgpt.jpg\" width=\"300\"\u003e\n\u003c/div\u003e\n\n## Usage\n\n### Install\n\n```sh\nconda create -n seqgpt python==3.8.16\n\nconda activate seqgpt\npip install -r requirements.txt\n```\n\n### Inference\n\n```python\nfrom transformers import AutoTokenizer, AutoModelForCausalLM\nimport torch\n\nmodel_name_or_path = 'DAMO-NLP/SeqGPT-560M'\ntokenizer = AutoTokenizer.from_pretrained(model_name_or_path)\nmodel = AutoModelForCausalLM.from_pretrained(model_name_or_path)\n# Pad and truncate on the left so the prompt always ends with the generation token\ntokenizer.padding_side = 'left'\ntokenizer.truncation_side = 'left'\n\n# Use half precision on the GPU when one is available\nif torch.cuda.is_available():\n    model = model.half().cuda()\nmodel.eval()\nGEN_TOK = '[GEN]'\n\nwhile True:\n    sent = input('输入/Input: ').strip()\n    task = input('分类/classify press 1, 抽取/extract press 2: ').strip()\n    # Labels are joined with the full-width comma used in the prompt format\n    labels = input('标签集/Label-Set (e.g., labelA,LabelB,LabelC): ').strip().replace(',', '，')\n    task = '分类' if task == '1' else '抽取'\n\n    # Changing the instruction template can harm performance\n    p = '输入: {}\\n{}: {}\\n输出: {}'.format(sent, task, labels, GEN_TOK)\n    inputs = tokenizer(p, return_tensors=\"pt\", padding=True, truncation=True, max_length=1024)\n    inputs = inputs.to(model.device)\n    outputs = model.generate(**inputs, num_beams=4, do_sample=False, max_new_tokens=256)\n    # Drop the prompt tokens so that only the generated continuation is decoded\n    prompt_len = inputs['input_ids'].shape[1]\n    outputs = outputs[0][prompt_len:]\n    response = tokenizer.decode(outputs, skip_special_tokens=True)\n    print('BOT: ========== \\n{}'.format(response))\n```\n\n\n## Citation\n\nIf you find this work useful, please consider giving this repository a star and citing our paper as follows:\n\n```\n@misc{yu2023seqgpt,\n      title={SeqGPT: An Out-of-the-box Large Language Model for Open Domain Sequence Understanding}, \n      
author={Tianyu Yu and Chengyue Jiang and Chao Lou and Shen Huang and Xiaobin Wang and Wei Liu and Jiong Cai and Yangning Li and Yinghui Li and Kewei Tu and Hai-Tao Zheng and Ningyu Zhang and Pengjun Xie and Fei Huang and Yong Jiang},\n      year={2023},\n      eprint={2308.10529},\n      archivePrefix={arXiv},\n      primaryClass={cs.CL}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAlibaba-NLP%2FSeqGPT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAlibaba-NLP%2FSeqGPT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAlibaba-NLP%2FSeqGPT/lists"}