Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/suhan-paradkar/libgenbot
A Python Scraper Designed to Scrape PDFs from Libgen and Scihub
https://github.com/suhan-paradkar/libgenbot
cli pip pypi pypi-package python python3 science scraper script terminal
Last synced: about 2 months ago
JSON representation
A Python Scraper Designed to Scrape PDFs from Libgen and Scihub
- Host: GitHub
- URL: https://github.com/suhan-paradkar/libgenbot
- Owner: suhan-paradkar
- License: gpl-3.0
- Created: 2021-05-16T14:32:54.000Z (over 3 years ago)
- Default Branch: main
- Last Pushed: 2024-05-13T16:19:17.000Z (8 months ago)
- Last Synced: 2024-11-01T02:21:56.854Z (about 2 months ago)
- Topics: cli, pip, pypi, pypi-package, python, python3, science, scraper, script, terminal
- Language: Python
- Homepage:
- Size: 111 KB
- Stars: 21
- Watchers: 2
- Forks: 2
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Funding: .github/FUNDING.yml
- License: LICENSE
Awesome Lists containing this project
README
# Libgenbot
[![Hits](https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fsuhan-paradkar%2FLibgenbot&count_bg=%2379C83D&title_bg=%23555555&icon=&icon_color=%23E7E7E7&title=hits&edge_flat=false)](https://hits.seeyoufarm.com)[![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)
[![PyPI version](https://badge.fury.io/py/Libgenbot.svg)](https://badge.fury.io/py/Libgenbot)
Libgenbot is a Bot written in Python to download PDFs from libgen.
It is a fork of PyPaperBot, and is inspired by it
Please leave feedback and report issues## Installation
Use pip to install LibgenBot
```
pip3 install Libgenbot
```For builds with latest changes
```
git clone https://github.com/suhan-paradkar/Libgenbot.git
pip3 install -r requirements.txt
python3 setup.py install
```## Installation in termux
First, you need to be subscribed into its-pointless repo
```
pkg up
pkg install wget git
wget https://its-pointless.github.io/setup-pointless-repo.sh
chmod +x setup-pointless-repo.sh
./setup-pointless-repo.sh
```Now, you need to install numpy
```
pkg install numpy
```Now, install pandas.... It takes a bit long time... so have a cup of tea
```
export CFLAGS="-Wno-deprecated-declarations -Wno-unreachable-code"
pip install pandas
```Now, install using pip
```
pip install Libgenbot
```For builds with latest changes
```
git clone https://github.com/suhan-paradkar/Libgenbot.git
pip install -r requirements.txt
python setup.py install
```## Usage
| Arguments | Description | Type |
| ------------------ | ---------------------------------------------------------------------------------------- | ------ |
| `--query` | Query to make on Libgen page | string |
| `--genre` | select genre: one of 'libgen(Sci-Tech)[1]''Scientific articles[2]' 'Fiction[3]' . Is a must when using libgen | Int |
| `--scholar-query` | Query to be made on the Google Scholar page | string |
| `--doi` | DOI of the paper to download (this option uses only SciHub to download) | string |
| `--doi-file` | File .txt containing the list of paper's DOIs to download | string |
| `--libgen-pages` | Number or range of Libgen pages to inspect. Contains variable no. of pages | string |
| `--scholar-pages` | Number or range of Google Scholar pages to inspect. Each page has a maximum of 10 papers | string |
| `--libgen-results` | Number of papers to download. Useful When \-\-libgen-pages=1 | int |
| `--scholar-results`| Number of papers to download. Useful When \-\-scholar-pages=1 | int |
| `--dwn-dir` | Directory path in which to save the result | string |
| `--min-year` | Minimal publication year of the paper to download | int |
| `--max-dwn-year` | Maximum number of papers to download sorted by year | int |
| `--max-dwn-cites` | Maximum number of papers to download sorted by number of citations | int |
| `--journal-filter` | CSV file path of the journal filter . Only works on Scholar | string |
| `--restrict` | 0:Download only Bibtex - 1:Down load only papers PDF | int |
| `--scihub-mirror` | Mirror for downloading papers from sci-hub. If not set, it is selected automatically | string |
| `--proxy` | Use Proxychains. Provide a seperated list of proxies (See below) | string |
| `-h` | Shows the help | -- |## Note
You can use only one of the arguments in the following groups
`--query`, `--scholar-query` `--doi-file`, and `--doi`
`--max-dwn-year` and `and max-dwn-cites`One of the arguments `--doi`, `--query`, `--scholar-query` , and `--file` is mandatory
The arguments `--scholar-pages` is mandatory when using `--scholar-query`
The argument `--dwn-dir` is mandatory.
The argument `--genre` is mandatory when using `--query`The argument `--journal-filter` require the path of a CSV containing a list of journal name paired with a boolean which indicates whether or not to consider that journal (0: don't consider /1: consider)
The argument `--doi-file` require the path of a txt file containing the list of paper's DOIs to download organized with one DOI per line.
The argument `--proxy` must be used at the end of the command. The protocol used and the port must be mentioned.
#### Usage of Proxy
```
Libgenbot --query=rheumatoid+arthritis --libgen-pages=1 --libgen-results=20 --genre=1 --dwn-dir=documents/ --proxy http://1.1.1.1:8080 http://8.0.8.0:8024
```