https://github.com/hyunwoongko/summarizers
Package for controllable summarization
- Host: GitHub
- URL: https://github.com/hyunwoongko/summarizers
- Owner: hyunwoongko
- License: apache-2.0
- Created: 2021-03-08T15:29:33.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2022-12-07T18:17:23.000Z (almost 2 years ago)
- Last Synced: 2024-10-13T23:41:12.245Z (about 1 month ago)
- Topics: nlp, summarization
- Language: Python
- Homepage:
- Size: 58.6 KB
- Stars: 78
- Watchers: 4
- Forks: 10
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
- Text-Summarization-Repo - summarizers library
README
# summarizers
[![PyPI version](https://badge.fury.io/py/summarizers.svg)](https://badge.fury.io/py/summarizers)
![GitHub](https://img.shields.io/github/license/summarizers/summarizers)
- `summarizers` is a package for controllable summarization based on [CTRLsum](https://github.com/salesforce/ctrl-sum).
- Currently, only English is supported; the package does not work with other languages.
## Installation
```console
pip install summarizers
```
## Usage
### 1. Create Summarizers
- First of all, create a `Summarizers` object to summarize your own article.
```python
>>> from summarizers import Summarizers
>>> summ = Summarizers()
```
- You can select the type of source article from [`normal`, `paper`, `patent`].
- If you don't pass any parameter, the default type is `normal`.
```python
>>> from summarizers import Summarizers
>>> summ = Summarizers('normal') # <-- default.
>>> summ = Summarizers('paper')
>>> summ = Summarizers('patent')
```
- If you want GPU acceleration, set the parameter `device='cuda'`.
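When the same script has to run on machines with and without a GPU, the device string can be chosen at runtime. The following is a minimal sketch, assuming PyTorch is the installed backend and that `device='cpu'` is an accepted value (neither is stated in this README). `pick_device` is a hypothetical helper, not part of `summarizers`.

```python
def pick_device() -> str:
    """Return 'cuda' when PyTorch reports a usable GPU, otherwise 'cpu'."""
    try:
        import torch  # assumption: PyTorch is the backend
    except ImportError:
        # Without PyTorch there is no way to query CUDA, so fall back to CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


device = pick_device()

# Assumption: Summarizers accepts device='cpu' as well as device='cuda'.
# from summarizers import Summarizers
# summ = Summarizers('normal', device=device)
```

To request the GPU unconditionally, pass the value directly: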
```python
>>> from summarizers import Summarizers
>>> summ = Summarizers('normal', device='cuda')
```
### 2. Basic Summarization
- If you pass only a source article, basic summarization is performed.
```python
>>> contents = """
Tunip is the Octonauts' head cook and gardener.
He is a Vegimal, a half-animal, half-vegetable creature capable of breathing on land as well as underwater.
Tunip is very childish and innocent, always wanting to help the Octonauts in any way he can.
He is the smallest main character in the Octonauts crew.
"""
```
```python
>>> summ(contents)
'Tunip is a Vegimal, a half-animal, half-vegetable creature'
```
### 3. Query focused Summarization
- If you pass a query together with the article, query-focused summarization is performed.
```python
>>> summ(contents, query="main character of Octonauts")
'Tunip is the smallest main character in the Octonauts crew.'
```
### 3. Abstractive QA (Auto Question Detection)
- If you pass a question as the query, abstractive QA is performed.
```python
>>> summ(contents, query="What is Vegimal?")
'Half-animal, half-vegetable'
```
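This README does not describe how a query is recognized as a question. Purely to illustrate the idea, here is a naive heuristic; `looks_like_question` and `QUESTION_WORDS` are hypothetical names, and the library's actual detection logic may be entirely different.

```python
# Hypothetical heuristic for illustration only; this is NOT the detection
# logic used by summarizers, which this README does not document.
QUESTION_WORDS = {
    "what", "who", "whom", "whose", "when", "where", "why", "which", "how",
}


def looks_like_question(query: str) -> bool:
    """Guess whether a query is a question from its punctuation or first word."""
    text = query.strip().lower()
    if not text:
        return False
    if text.endswith("?"):
        return True
    # Fall back to checking for a leading interrogative word.
    return text.split()[0] in QUESTION_WORDS


is_question = looks_like_question("What is Vegimal?")
is_plain_query = looks_like_question("main character of Octonauts")
```

A heuristic like this can misclassify queries, which is one reason a switch for disabling detection is useful.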
- You can turn off this feature by setting the parameter `question_detection=False`.
```python
>>> summ(contents, query="SOME_QUERY", question_detection=False)
```
### 4. Prompt based Summarization
- You can generate a summary that begins with a given sequence using the `prompt` parameter.
- It works like GPT-3's prompt-based generation (but it doesn't work very well).
```python
>>> summ(contents, prompt="Q:Who is Tunip? A:")
"Q:Who is Tunip? A: Tunip is the Octonauts' head"
```
### 5. Query focused Summarization with Prompt
- You can also pass both `query` and `prompt`.
- In this case, a query-focused summary is generated that starts with the prompt.
```python
>>> summ(contents, query="personality of Tunip", prompt="Tunip is very")
"Tunip is very childish and innocent, always wanting to help the Octonauts."
```
### 6. Options for Decoding Strategy
- For generative models, the decoding strategy is very important.
- `summarizers` supports a variety of options for the decoding strategy.
```python
>>> summ(
... contents=contents,
... num_beams=10,
... top_k=30,
... top_p=0.85,
... no_repeat_ngram_size=3,
... length_penalty=1.2,
... )```
## License
```
Copyright 2021 Hyunwoong Ko.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
```