Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dmmiller612/lecture-summarizer
Lecture summarization with BERT
https://github.com/dmmiller612/lecture-summarizer
bert bert-model extractive-summarization flask pytorch summarization
Last synced: 3 months ago
JSON representation
Lecture summarization with BERT
- Host: GitHub
- URL: https://github.com/dmmiller612/lecture-summarizer
- Owner: dmmiller612
- Created: 2019-02-28T00:06:22.000Z (almost 6 years ago)
- Default Branch: master
- Last Pushed: 2022-10-01T07:52:54.000Z (over 2 years ago)
- Last Synced: 2024-10-13T00:11:10.377Z (3 months ago)
- Topics: bert, bert-model, extractive-summarization, flask, pytorch, summarization
- Language: Python
- Homepage: https://arxiv.org/abs/1906.04165
- Size: 271 KB
- Stars: 148
- Watchers: 8
- Forks: 38
- Open Issues: 9
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-bert - dmmiller612/lecture-summarizer
README
# lecture-summarizer
This project utilizes the BERT model to perform extractive text summarization on lecture transcripts. The contents of
this project include a RESTful API to serve these summaries, and a command line interface for easier interaction. You can
find more about the specs of this service and CLI in our `Documentation` directory.zPaper: https://arxiv.org/abs/1906.04165
## Running the service locally
First, docker is required to run the service locally. To start the service, run the command:
```bash
make docker-build-run
```
On the first run of a service, this may take quite some time to complete.## Installation of CLI
The CLI tool can be downloaded using pip with the following command:
```bash
pip install git+https://github.com/dmmiller612/lecture-summarizer.git
```To test the tool, try getting the current lectures in the service with the command:
```bash
lecture-summarizer get-lectures
```Note, that this tool automatically uses our cloud based service by default. You can use your local service by supplying
the `-base_path` option, such as `-base_path localhost:5000`. As an example, to get lectures locally, you could run:```bash
lecture-summarizer get-lectures -base-path localhost:5000
```## How to use CLI tool
After installing the CLI, the service should be ready to use. The lecture-summarizer uses the API service as
it's backend. This backend defaults to the currently hosted one on AWS. The user can supply a specific URL if the
service is hosted elsewhere. Below, briefly discusses how to use the CLI tool.### Creating a Lecture
Before one can do anything with summarizations, there needs to be at least one lecture in the system. Taking an Udacity
lecture, using the `raw_lecture.txt` file at the parent of the lecture-summarizer directory as an example, one can upload
the content issuing the following command:```bash
lecture-summarizer create-lecture -path ./raw_lecture.txt -name example_first_lecture -course IHI
```Currently, the lecture-summarizer can parse sdp file formats, which are common for Udacity-based lectures. Notice that one
needs to supply a `name` and a `course` as metadata.### Retrieving Lectures
One can retrieve lectures with a couple of options. Those options can be found in the Documentation/CLI_Documentation.md
file in the base of the repo. Some example commands are shown below:##### Get a Single Lecture
```bash
lecture-summarizer get-lectures -lecture-id 1
```##### Get All Lectures
```bash
lecture-summarizer get-lectures
```##### Get Lectures by Name
```bash
lecture-summarizer get-lectures -name example_first_lecture
```##### Get Lectures by Course
```bash
lecture-summarizer get-lectures -course ihi
```### Creating a Summary
Just like creating a lecture, creating a summary is a painless process. Below is an example of creating a summary from
a specified lecture.```bash
lecture-summarizer create-summary -lecture-id 1 -name 'my summary name' -ratio 0.2
```The `ratio` specifies approximately how much of the lecture that you want to summarize.
### Retrieving Summaries
Just like with retrieving lectures, one can also list summaries. Below are a couple of examples:
##### Get a Single Summary
```bash
lecture-summarizer get-summaries -lecture-id 1 -summary-id 1
```##### Get All Summaries
```bash
lecture-summarizer get-summaries -lecture-id 1
```##### Get All Summaries by Name
```bash
lecture-summarizer get-summaries -lecture-id 1 -name 'my summary name'
```##### Delete a Summary
```bash
lecture-summarizer delete-summary -lecture-id 1 -summary-id 1
```## RESTful API Docs
##### POST /lectures
This endpoint creates a lecture.
```json
{
"course": "course identifier",
"content": "Lecture String Content",
"name": "Lecture name"
}
```##### GET /lectures
This endpoint is used to retrieve lectures. The user can supply two query params shown below.
```
/lectures?course=unique_identifier
/lectures?name=course_name
```##### GET /lectures/{id}
This endpoint is used to retrieve a single lecture
```
/lectures/{id}
```##### POST /lectures/{id}/summaries
This endpoint is used to create a summarization from a lecture
```json
{
"name": "Summarization name",
"ratio": "Ratio of sentences to select"
}
```##### GET /lectures/{id}/summaries
```
/lectures/{id}/summaries?name=course_name
/lectures/{id}/summaries
```##### GET/DELETE /lectures/{id}/summaries/{summarization_id}
This endpoint allows you to get or delete a summarization.
```
/lectures/{id}/summaries/{summarization_id}
```