Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/Mphasis-ML-Marketplace/Mphasis-DeepInsights-Text-Summarizer

Text Summarizer solution is an optimal way to tackle the problem of information overload by reducing the size of long documents into a few sentences . Neural-network-based models have the ability to automatically learn the distributed representation for sentences and documents. This summarizer is built using Transfer Learning and Transformer based models which use self attention. The input can have a maximum of 512 words and gives output of 3 sentences (approximately 30 words).
https://github.com/Mphasis-ML-Marketplace/Mphasis-DeepInsights-Text-Summarizer

Last synced: 3 months ago
JSON representation

Host: GitHub
URL: https://github.com/Mphasis-ML-Marketplace/Mphasis-DeepInsights-Text-Summarizer
Owner: Mphasis-ML-Marketplace
License: apache-2.0
Created: 2020-12-24T04:43:27.000Z (about 4 years ago)
Default Branch: main
Last Pushed: 2021-01-05T07:02:28.000Z (about 4 years ago)
Last Synced: 2024-08-02T13:15:10.868Z (7 months ago)
Language: Jupyter Notebook
Size: 75.2 KB
Stars: 2
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

# Mphasis-DeepInsights-Text-Summarizer

## Amazon SageMaker

### Input :

**Usage Methodology for the algorithm:**

The input has to be a '.txt' file with 'utf-8' encoding. PLEASE NOTE: If your input .txt file is not 'utf-8' encoded, model will not perform as expected
1. To make sure that your input file is 'UTF-8' encoded please 'Save As' using Encoding as 'UTF-8'
2. The input can have a maximum of 512 words (Sagemaker restriction)
3. Input should have atleast 3 sentences (Model limitation)
4. Supported content types: text/plain

### Output:

Content type: text/plain

### Invoking endpoint

AWS CLI Command
If you are using real time inferencing, please create the endpoint first and then use the following command to invoke it:

`aws sagemaker-runtime invoke-endpoint --endpoint-name "endpoint-name" --body fileb://input.txt --content-type text/plain --accept text/plain result.txt`

**Substitute the following parameters:**

* "endpoint-name" - name of the inference endpoint where the model is deployed
* input.txt - input file
* text/plain - MIME type of the given input file (above)
* result.txt - filename where the inference results are written to.

### Python

Real-time inference snippet (more detailed example can be found in sample notebook):

```sample_txt = 'location of input text file
transformer = model.transformer(1, 'ml.m5.xlarge')
transformer.transform(sample_txt, content_type="text/plain")
transformer.wait()
print("Batch Transform output saved to " + transformer.output_path)
```

## Resources

1. [Sample Notebook](text_summary_marketplace.ipynb)
2. [Sample Input](SampleInput)
3. [Sample Output](SampleOutput)