https://github.com/astrabert/biomedicalpapersbot
A Gradio bot to retrieve PubMed papers' title, doi, authors and publication date based on general search terms or on specific publication names
- Host: GitHub
- URL: https://github.com/astrabert/biomedicalpapersbot
- Owner: AstraBert
- License: mit
- Created: 2023-11-13T11:26:48.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-11-05T00:24:44.000Z (7 months ago)
- Last Synced: 2025-01-13T02:03:50.088Z (5 months ago)
- Topics: academic-research, academic-resources, automation, biomedical, gradio, papers, pubmed-parser, python, scraping
- Language: Python
- Homepage:
- Size: 321 KB
- Stars: 2
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# BioMedicalPapersBot
A Gradio bot to retrieve the title, DOI, authors and publication date of papers on PubMed, starting from general search terms or from specific publication names.

## How to activate it
You can pull the Docker image from the GitHub Container Registry and run it:

```bash
docker pull ghcr.io/astrabert/biomedicalpapersbot:main
docker run -p 7860:7860 ghcr.io/astrabert/biomedicalpapersbot:main
```

Or you can clone the repository:
```bash
git clone https://github.com/AstraBert/BioMedicalPapersBot
cd BioMedicalPapersBot
```

Create a virtual environment and activate it:
```bash
python3 -m venv virtualenv
source virtualenv/bin/activate
```

Install the required dependencies:
```bash
python3 -m pip install -r requirements.txt
```

Run the application:
```bash
python3 scripts/app.py
```

In both cases, you will find the application at http://localhost:7860.

Find a demo [here](https://huggingface.co/spaces/as-cle-bert/BioMedicalPapersBot).
## Description
It is a (bio)python-based Gradio bot that searches PubMed and returns the features of the papers that match the search. You can find the functions used to retrieve and parse data from PubMed in [pubmedScraper.py](./scripts/pubmedScraper.py). The workflow is pretty simple:

- `search_pubmed` does the actual search through the NCBI Entrez module, which connects remotely to the online servers and communicates with them: the function returns a list of PubMed IDs
- `fetch_pubmed_details` uses the IDs from the previous function to access paper data and metadata directly, retrieves the significant information about the papers and outputs it in standard XML format
- `fetch_xml` takes care of parsing the XML output and extracting titles, authors, dates of publication and DOIs.
- `respond_to_query` outputs the information of interest in a format that is human-readable and message-sendable

You can also find the basic architecture of the Python code that is used for the Gradio bot itself.
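
The search and fetch steps described above can be illustrated with a minimal sketch. This is not the code in `pubmedScraper.py`: instead of the Entrez module it calls the public NCBI E-utilities endpoints (ESearch and EFetch) directly with the Python standard library, so that it runs without extra dependencies. The function names `build_esearch_url`, `build_efetch_url`, `search_ids` and `fetch_records_xml` are hypothetical and chosen only for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Base URL of the public NCBI E-utilities service.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, max_results=5):
    """Build an ESearch URL asking PubMed for the IDs that match `term`."""
    params = {"db": "pubmed", "term": term, "retmax": max_results, "retmode": "json"}
    return f"{EUTILS}/esearch.fcgi?{urlencode(params)}"


def build_efetch_url(pubmed_ids):
    """Build an EFetch URL returning the records for `pubmed_ids` as XML."""
    params = {"db": "pubmed", "id": ",".join(pubmed_ids), "retmode": "xml"}
    return f"{EUTILS}/efetch.fcgi?{urlencode(params)}"


def search_ids(term, max_results=5):
    """Run the search over the network and return a list of PubMed IDs."""
    with urlopen(build_esearch_url(term, max_results)) as response:
        payload = json.load(response)
    return payload["esearchresult"]["idlist"]


def fetch_records_xml(pubmed_ids):
    """Download the XML records for the given PubMed IDs (network call)."""
    with urlopen(build_efetch_url(pubmed_ids)) as response:
        return response.read().decode("utf-8")
```

With network access, `fetch_records_xml(search_ids("crispr cas9"))` would return the XML that the parsing step then works on. Keeping the URL builders separate from the network calls makes the query parameters easy to inspect without contacting the server.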
Keep in mind that there are several ways to define a Python bot: if you find a faster or better implementation, feel free to suggest it in the `Issues` section.
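
As an illustration of the parsing and formatting steps in the workflow (the roles played by `fetch_xml` and `respond_to_query`), here is a minimal sketch that uses only the standard library. It is an assumption about how such a step could look, not the repository's implementation: `parse_papers` and `format_papers` are hypothetical names, and `SAMPLE_XML` is a hand-written placeholder record that only imitates the layout of PubMed XML.

```python
import xml.etree.ElementTree as ET

# Hand-written placeholder that imitates the layout of a PubMed XML record.
SAMPLE_XML = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <Article>
        <Journal><JournalIssue>
          <PubDate><Year>2023</Year><Month>11</Month></PubDate>
        </JournalIssue></Journal>
        <ArticleTitle>An example paper title</ArticleTitle>
        <AuthorList>
          <Author><LastName>Doe</LastName><ForeName>Jane</ForeName></Author>
          <Author><LastName>Roe</LastName><ForeName>Richard</ForeName></Author>
        </AuthorList>
      </Article>
    </MedlineCitation>
    <PubmedData>
      <ArticleIdList>
        <ArticleId IdType="doi">10.0000/example.doi</ArticleId>
      </ArticleIdList>
    </PubmedData>
  </PubmedArticle>
</PubmedArticleSet>"""


def parse_papers(xml_text):
    """Extract title, authors, publication date and DOI from each record."""
    papers = []
    for article in ET.fromstring(xml_text).iter("PubmedArticle"):
        title_el = article.find(".//ArticleTitle")
        # itertext() also collects text nested in inline markup inside the title.
        title = "".join(title_el.itertext()) if title_el is not None else "n/a"
        authors = [
            f"{a.findtext('ForeName', default='')} {a.findtext('LastName', default='')}".strip()
            for a in article.iter("Author")
        ]
        pub_date = article.find(".//PubDate")
        parts = [] if pub_date is None else [
            pub_date.findtext(tag) for tag in ("Year", "Month", "Day")
        ]
        papers.append({
            "title": title,
            "authors": authors,
            "date": "-".join(p for p in parts if p) or "n/a",
            "doi": article.findtext(".//ArticleId[@IdType='doi']", default="n/a"),
        })
    return papers


def format_papers(papers):
    """Render the parsed records as a plain-text, human-readable message."""
    blocks = []
    for p in papers:
        blocks.append(
            f"Title: {p['title']}\n"
            f"Authors: {', '.join(p['authors']) or 'n/a'}\n"
            f"Published: {p['date']}\n"
            f"DOI: {p['doi']}"
        )
    return "\n\n".join(blocks)


print(format_papers(parse_papers(SAMPLE_XML)))
```

Missing fields fall back to `n/a` rather than raising an error, since not every record can be expected to carry a DOI or a complete date.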
## Funding
If you found this project useful, please consider [funding it](https://github.com/sponsors/AstraBert) to help it grow: let's support open-source together!😊
## License and rights of usage
This project is provided under the [MIT license](./LICENSE): it will always be open-source and free to use.
If you use this project, please cite the author: [Astra Clelia Bertelli](https://astrabert.vercel.app)