{"id":13525898,"url":"https://github.com/Alibaba-NLP/EcomGPT","last_synced_at":"2025-04-01T06:30:43.096Z","repository":{"id":190085297,"uuid":"681002347","full_name":"Alibaba-NLP/EcomGPT","owner":"Alibaba-NLP","description":"An Instruction-tuned Large Language Model for E-commerce","archived":false,"fork":false,"pushed_at":"2023-09-26T23:04:03.000Z","size":5129,"stargazers_count":223,"open_issues_count":10,"forks_count":14,"subscribers_count":5,"default_branch":"main","last_synced_at":"2024-11-02T10:34:04.834Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Alibaba-NLP.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-08-21T03:49:35.000Z","updated_at":"2024-10-20T22:58:18.000Z","dependencies_parsed_at":null,"dependency_job_id":"6ad82b8d-5f53-47c2-9cab-7a76dafa76b6","html_url":"https://github.com/Alibaba-NLP/EcomGPT","commit_stats":null,"previous_names":["alibaba-nlp/ecomgpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FEcomGPT","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FEcomGPT/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FEcomGPT/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Alibaba-NLP%2FEcomGPT/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Alibaba-NLP","download_url":"https://codeload.
github.com/Alibaba-NLP/EcomGPT/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246596567,"owners_count":20802845,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-08-01T06:01:23.301Z","updated_at":"2025-04-01T06:30:41.776Z","avatar_url":"https://github.com/Alibaba-NLP.png","language":"Python","funding_links":[],"categories":["🤖 模型","A01_文本生成_文本对话","Data Resources"],"sub_categories":["🧩 领域模型","大语言对话模型及数据","2. Multilingual SFT Data"],"readme":"\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"IMG/logo.jpg\" width=\"55%\"\u003e\n\u003c/div\u003e\n\n# An Instruction-Following Large Language Model For E-commerce\n\n![](https://img.shields.io/badge/version-1.0.0-blue)[![Pytorch](https://img.shields.io/badge/PyTorch-%23EE4C2C.svg?e\u0026logo=PyTorch\u0026logoColor=white)](https://pytorch.org/)[![arxiv badge](https://img.shields.io/badge/arxiv-2308.06966-red)](https://arxiv.org/pdf/2308.06966.pdf)\n\nRepo for [*EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce*](https://arxiv.org/pdf/2308.06966)\n\n- **We propose EcomInstruct, the first E-commerce instruction dataset, with a total of 2.5 million instruction examples**.\n- EcomInstruct scales up the data size and task diversity by constructing **atomic tasks with E-commerce basic data types**, such as product information and user reviews. Atomic tasks are defined as intermediate tasks implicitly involved in solving a final task, which we also call Chain-of-Task tasks. 
\n- We developed EcomGPT by training the backbone model BLOOMZ on EcomInstruct. **Benefiting from the fundamental semantic understanding capabilities acquired from the Chain-of-Task tasks, EcomGPT exhibits excellent zero-shot generalization capabilities.**\n\n\u003cdiv align=\"center\"\u003e\n    \u003cimg src=\"IMG/method.jpg\" width=\"60%\" height=\"auto\" /\u003e\n\u003c/div\u003e\n\n## 💡 Performance\n\nWe perform a human evaluation of EcomGPT and ChatGPT on 12 held-out E-commerce datasets. EcomGPT outperforms or ties ChatGPT on all 12 datasets.\n\n\u003cdiv align=\"center\"\u003e\n\u003cimg src=\"IMG/performance.jpg\" width=\"300\"\u003e\n\u003c/div\u003e\n\n## 🛠 Dependencies\n```bash\npip install -r requirement.txt\n```\n#### Details\n- Python (\u003e= 3.7)\n- [PyTorch](http://pytorch.org/) (\u003e= 2.0.0)\n- numpy\n- [Transformers](http://huggingface.co/transformers/) (\u003e= 4.27.4)\n- seqeval\n- rouge\n\n\n\n\n## 💻 Model\nEcomGPT (7b1) is available on [*ModelScope*](https://www.modelscope.cn/models/damo/nlp_ecomgpt_multilingual-7B-ecom/summary).\n\n## 📚 Dataset (EcomInstruct)\n\nAs a first step, we have open-sourced 12 evaluation datasets. To keep evaluation efficient, each evaluation dataset is downsampled to 500 instances.\n\n| Dataset   | Lang. 
| Task                          | Metric    |\n| :-------- | :---- | :---------------------------- | :-------- |\n| Lenove    | EN    | Named Entity Recognition      | F1, Rouge |\n| Lenove    | EN    | Entity Span Detection         | Rouge     |\n| Reddit    | EN    | Extractive QA                 | Rouge     |\n| ABSA      | EN    | Review Topic Classification   | F1, Rouge |\n| MEPAVE    | ZH    | Attribute Value Recognition   | F1, Rouge |\n| MEPAVE    | ZH    | Attribute Value Detection     | Rouge     |\n| Multi-CPR | ZH    | Product Select                | Rouge     |\n| Multi-CPR | ZH    | Product Align                 | F1, Rouge |\n| OpenBG    | ZH    | Title Attribute Matching      | F1, Rouge |\n| OpenBG    | ZH    | Fine-grain Product Classify   | F1, Rouge |\n| OpenBG    | ZH    | Coarse-grain Product Classify | F1, Rouge |\n| OpenBG    | ZH    | Title Generate                | Rouge     |\n\nThe dataset files **follow this file hierarchy**:\n\n```\n.\n├── [Dataset Name]\n│   └── tasks\n│       └── [task name]\n│           ├── meta-info.json\n│           └── test.json\n...\n└── Reddit_QA\n    └── tasks\n        └── EN-Reddit_QA-Extract-Extract_QA\n            ├── meta-info.json\n            └── test.json\n```\n\n## 🔍 Evaluation\n\nEcomGPT can be evaluated with the following command:\n\n```bash\npython eval.py -tf ./test_tasks.txt -m [model name or path] -sn [result file name] -bdd [base dataset dir]\n```\n\n## 🔥 TODO\n\n- Open Source Weight of EcomGPT ✅\n\n## 📄 Citation\n\nIf you find this work useful, consider giving this repository a star and citing our paper as follows:\n\n```bibtex\n@article{li2023ecomgpt,\n  title={EcomGPT: Instruction-tuning Large Language Models with Chain-of-Task Tasks for E-commerce},\n  author={Li, Yangning and Ma, Shirong and Wang, Xiaobin and Huang, Shen and Jiang, Chengyue and Zheng, Hai-Tao and Xie, Pengjun and Huang, Fei and Jiang, Yong},\n  journal={arXiv preprint arXiv:2308.06966},\n  
year={2023}\n}\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAlibaba-NLP%2FEcomGPT","html_url":"https://awesome.ecosyste.ms/projects/github.com%2FAlibaba-NLP%2FEcomGPT","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2FAlibaba-NLP%2FEcomGPT/lists"}