{"id":15030485,"url":"https://github.com/baidu/senta","last_synced_at":"2025-05-14T19:06:27.907Z","repository":{"id":38345612,"uuid":"139385822","full_name":"baidu/Senta","owner":"baidu","description":"Baidu's open-source Sentiment Analysis System.","archived":false,"fork":false,"pushed_at":"2024-08-20T16:16:48.000Z","size":27259,"stargazers_count":1951,"open_issues_count":74,"forks_count":369,"subscribers_count":61,"default_branch":"master","last_synced_at":"2025-04-03T02:55:36.754Z","etag":null,"topics":["aspect-level-sentiment","natural-language-processing","opinion-target-extraction","paddlepaddle","sentiment-analysis","sentiment-classification"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/baidu.png","metadata":{"files":{"readme":"README.en.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":"AUTHORS","dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-07-02T03:22:35.000Z","updated_at":"2025-03-31T19:31:38.000Z","dependencies_parsed_at":"2024-11-06T17:39:12.725Z","dependency_job_id":"f2741216-61ce-4905-ae61-7c72dd3b66ce","html_url":"https://github.com/baidu/Senta","commit_stats":{"total_commits":29,"total_committers":5,"mean_commits":5.8,"dds":0.5172413793103448,"last_synced_commit":"e5294c00a6ffc4b1284f38000f0fbf24d6554c22"},"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baidu%2FSenta","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baidu%2FSenta/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/
baidu%2FSenta/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/baidu%2FSenta/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/baidu","download_url":"https://codeload.github.com/baidu/Senta/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161262,"owners_count":21057552,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aspect-level-sentiment","natural-language-processing","opinion-target-extraction","paddlepaddle","sentiment-analysis","sentiment-classification"],"created_at":"2024-09-24T20:13:28.790Z","updated_at":"2025-04-10T04:51:59.974Z","avatar_url":"https://github.com/baidu.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"English|[简体中文](https://github.com/baidu/Senta/blob/master/README.md)\n\n# \u003cp align=center\u003e`Senta`\u003c/p\u003e\n\n`Senta` is a python library for many sentiment analysis tasks. It contains support for running multiple tasks such as sentence-level sentiment classification, aspect-level sentiment classification and opinion role labeling. The bulk of the code in this repository is used to implement [SKEP: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis](https://www.aclweb.org/anthology/2020.acl-main.374.pdf). 
In the paper, we demonstrate how to integrate sentiment knowledge into pre-trained models to learn a unified sentiment representation for multiple sentiment analysis tasks.\n\n## How to use\n\n### Pip\n\nYou can use the Python package directly to run predictions for sentiment analysis tasks by loading a pre-trained `SKEP` model.\n\n#### Installation\n\n1. `Senta` supports Python 3.6 or later. This repository requires PaddlePaddle 1.6.3; please see [here](https://www.paddlepaddle.org.cn/documentation/docs/en/1.6/beginners_guide/install/index_en.html) for installation instructions.\n\n2. Install `Senta`\n\n    ```shell\n    python -m pip install Senta\n    ```\n   or\n\n    ```shell\n    git clone https://github.com/baidu/Senta.git\n    cd Senta\n    python -m pip install .\n    ```\n\n#### Quick Tour\n\n    ```python\n    from senta import Senta\n    my_senta = Senta()\n    \n    # list the available pre-trained models; three are provided, all based on SKEP\n    print(my_senta.get_support_model()) # [\"ernie_1.0_skep_large_ch\", \"ernie_2.0_skep_large_en\", \"roberta_skep_large_en\"]\n                                        # ernie_1.0_skep_large_ch, skep Chinese pre-trained model based on ERNIE 1.0 large.\n                                        # ernie_2.0_skep_large_en, skep English pre-trained model based on ERNIE 2.0 large.\n                                        # roberta_skep_large_en, skep English pre-trained model based on RoBERTa large, which is used in our paper.\n    \n    # list the supported tasks\n    print(my_senta.get_support_task()) # [\"sentiment_classify\", \"aspect_sentiment_classify\", \"extraction\"]\n    \n    use_cuda = True # set True or False\n    \n    # run prediction for each task\n    my_senta.init_model(model_class=\"roberta_skep_large_en\", task=\"sentiment_classify\", use_cuda=use_cuda)\n    texts = [\"a sometimes tedious film .\"]\n    result = my_senta.predict(texts)\n    print(result)\n    \n    
my_senta.init_model(model_class=\"roberta_skep_large_en\", task=\"aspect_sentiment_classify\", use_cuda=use_cuda)\n    texts = [\"I love the operating system and the preloaded software.\"]\n    aspects = [\"operating system\"]\n    result = my_senta.predict(texts, aspects)\n    print(result)\n    \n    my_senta.init_model(model_class=\"roberta_skep_large_en\", task=\"extraction\", use_cuda=use_cuda)\n    texts = [\"The JCC would be very pleased to welcome your organization as a corporate sponsor .\"]\n    result = my_senta.predict(texts)\n    print(result)\n    ```\n\n### From source\n\nYou can use the source code to run pre-training and fine-tuning tasks. The `config` folder contains configuration files to help you reproduce the results of our paper.\n\n#### Preparation\n\n    ```shell\n    # download code\n    git clone https://github.com/baidu/Senta.git\n    \n    # download a pre-trained skep model\n    cd ./Senta/model_files\n    sh download_roberta_skep_large_en.sh # download roberta_skep_large_en model. Download scripts for the other pre-trained skep models are in this dir.\n    cd -\n    \n    # download task dataset\n    cd ./Senta/data/\n    sh download_en_data.sh # download English dataset used in our paper. The download script for the Chinese dataset is in this dir.\n    cd - \n    ```\n\n#### Installation\n\n1. `Senta` supports Python 3.6 or later. This repository requires PaddlePaddle 1.6.3; please see [here](https://www.paddlepaddle.org.cn/documentation/docs/en/1.6/beginners_guide/install/index_en.html) for installation instructions.\n\n2. Install Python dependencies\n\n    ```shell\n    python -m pip install -r requirements.txt\n    ```\n\n3. Set up the environment variables for Python, CUDA, cuDNN, and PaddlePaddle in the `env.sh` file. Details about environment variables related to PaddlePaddle can be found in the [PaddlePaddle Documentation](https://www.paddlepaddle.org.cn/documentation/docs/en/1.6/flags_en.html).\n\n#### Quick Tour\n\n1. 
Training\n   \n    ```shell\n    sh ./script/run_pretrain_roberta_skep_large_en.sh # pre-trained model roberta_skep_large_en, which is used in our paper\n    ```\n\n2. Fine-tuning and prediction\n\n    ```shell \n    sh ./script/run_train.sh ./config/roberta_skep_large_en.SST-2.cls.json # fine-tuning on SST-2\n    sh ./script/run_infer.sh ./config/roberta_skep_large_en.SST-2.infer.json # predict\n    \n    sh ./script/run_train.sh ./config/roberta_skep_large_en.absa_laptops.cls.json # fine-tuning on ABSA(laptops)\n    sh ./script/run_infer.sh ./config/roberta_skep_large_en.absa_laptops.infer.json # predict\n    \n    sh ./script/run_train.sh ./config/roberta_skep_large_en.MPQA.orl.json # fine-tuning on MPQA 2.0\n    sh ./script/run_infer.sh ./config/roberta_skep_large_en.MPQA.infer.json # predict\n    ```\n    \n3. An older version of `Senta` can be found [here](https://github.com/baidu/Senta/tree/v1); it includes BoW, CNN and BiLSTM models for Chinese sentence-level sentiment classification.\n\n\n## Citation\n\nIf you extend or use this work, please cite the [paper](https://www.aclweb.org/anthology/2020.acl-main.374.pdf) where it was introduced:\n\n```text\n@inproceedings{tian-etal-2020-skep,\n    title = \"{SKEP}: Sentiment Knowledge Enhanced Pre-training for Sentiment Analysis\",\n    author = \"Tian, Hao  and\n      Gao, Can  and\n      Xiao, Xinyan  and\n      Liu, Hao  and\n      He, Bolei  and\n      Wu, Hua  and\n      Wang, Haifeng  and\n      wu, feng\",\n    booktitle = \"Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics\",\n    month = jul,\n    year = \"2020\",\n    address = \"Online\",\n    publisher = \"Association for Computational Linguistics\",\n    url = \"https://www.aclweb.org/anthology/2020.acl-main.374\",\n    pages = \"4067--4076\",\n    abstract = \"Recently, sentiment analysis has seen remarkable advance with the help of pre-training approaches. 
However, sentiment knowledge, such as sentiment words and aspect-sentiment pairs, is ignored in the process of pre-training, despite the fact that they are widely used in traditional sentiment analysis approaches. In this paper, we introduce Sentiment Knowledge Enhanced Pre-training (SKEP) in order to learn a unified sentiment representation for multiple sentiment analysis tasks. With the help of automatically-mined knowledge, SKEP conducts sentiment masking and constructs three sentiment knowledge prediction objectives, so as to embed sentiment information at the word, polarity and aspect level into pre-trained sentiment representation. In particular, the prediction of aspect-sentiment pairs is converted into multi-label classification, aiming to capture the dependency between words in a pair. Experiments on three kinds of sentiment tasks show that SKEP significantly outperforms strong pre-training baseline, and achieves new state-of-the-art results on most of the test datasets. We release our code at https://github.com/baidu/Senta.\",\n}\n```\n    ","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaidu%2Fsenta","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbaidu%2Fsenta","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbaidu%2Fsenta/lists"}