https://github.com/fahimf/summarizer
A simple web app (Python/Flask) to summarize new papers from arXiv.org
- Host: GitHub
- URL: https://github.com/fahimf/summarizer
- Owner: FahimF
- Created: 2022-11-13T03:03:49.000Z (about 2 years ago)
- Default Branch: main
- Last Pushed: 2022-11-17T08:12:38.000Z (almost 2 years ago)
- Last Synced: 2023-03-23T18:09:56.522Z (over 1 year ago)
- Language: Python
- Size: 868 KB
- Stars: 6
- Watchers: 2
- Forks: 2
- Open Issues: 0
Metadata Files:
- Readme: README.md
# arXiv Paper Summarizer
This is a simple app which summarizes all new papers on arXiv under the Computer Science — Computer Vision category.
The app stores all downloaded paper information and summaries in a local database and fetches new papers whenever you click the "Update" button in the UI. To save resources, it summarizes only papers that are not already in the database.
The list of papers can be searched and sorted and you can page through the whole list as you like.
Click any item you are interested in and the app asks whether you want to view the paper on arXiv. If you say "Yes", it opens the paper's page in a new browser tab.
The basic UI can be seen below:
![web-app](assets/web-app.jpg)
## Installation
Simply clone the repo to a local folder, install the required packages (if you don't have them already), and then run the following command from the terminal:
```bash
python web.py
```
This should start a local web server that you can access via the URL that you'll see in the terminal. Just navigate to the URL and you should be set!
## What are the required packages?
You should be able to install the required packages by running the following command in the terminal:
```bash
pip install feedparser torch transformers Flask
```
## How does it work?
The app fetches the RSS feed for the hardcoded category (cs.CV) and then compares each paper against those already in the database. Any paper which is not found in the database has its description passed through the summarizer which uses [this model](https://huggingface.co/pszemraj/long-t5-tglobal-base-16384-book-summary).
You can swap in a different model that supports text summarization, and it might yield better results. I have not done enough testing yet to see if there is a better model that can be used. If you do find one, do let me know 🙂
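The flow described in this section can be sketched roughly as follows. This is a minimal illustration, not the code in `web.py`: the table schema, the function names (`init_db`, `fetch_entries`, `load_summarizer`, `summarize_new`) and the use of `sqlite3` are all assumptions made for the example. The feed URL is left as a parameter because the exact URL the app uses is not stated here.

```python
import sqlite3


def init_db(conn):
    # Hypothetical schema: one row per paper, keyed by its feed link.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS papers "
        "(id TEXT PRIMARY KEY, title TEXT, summary TEXT)"
    )


def fetch_entries(feed_url):
    # Imported lazily so the rest of the sketch runs without feedparser.
    import feedparser

    feed = feedparser.parse(feed_url)
    return [
        {
            "id": e.get("link"),
            "title": e.get("title"),
            "description": e.get("summary"),
        }
        for e in feed.entries
    ]


def load_summarizer(model_name="pszemraj/long-t5-tglobal-base-16384-book-summary"):
    # Imported lazily: loading the model is slow and downloads its weights.
    from transformers import pipeline

    pipe = pipeline("summarization", model=model_name)
    return lambda text: pipe(text)[0]["summary_text"]


def summarize_new(conn, entries, summarize):
    """Summarize and store only entries missing from the database.

    Returns the number of papers added.
    """
    added = 0
    for entry in entries:
        seen = conn.execute(
            "SELECT 1 FROM papers WHERE id = ?", (entry["id"],)
        ).fetchone()
        if seen:
            continue  # already summarized, skip the expensive model call
        conn.execute(
            "INSERT INTO papers (id, title, summary) VALUES (?, ?, ?)",
            (entry["id"], entry["title"], summarize(entry["description"])),
        )
        added += 1
    conn.commit()
    return added
```

Because `summarize_new` takes the summarizer as a plain callable, trying a different model only means passing another name to `load_summarizer`, and the skip logic can be checked with a stand-in function instead of the real model.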
## Does this work only with arXiv?
As the code stands now, yes, it only works with arXiv. But if you have other RSS feeds (or even different sources of papers) that you want to use with this code, it should be fairly straightforward to modify the code to work with them, as long as they include some sort of description of the paper that can be summarized.
## Future Plans
I do want to add at least the following features to the app:
- [ ] The ability to delete/hide papers that I'm not interested in
- [ ] The ability to not have deleted/hidden papers be downloaded/summarized again
- [ ] The ability to add more than one category on arXiv to fetch papers from
- [ ] The ability to switch between different categories (once the above feature is in)