https://github.com/ct83/pyscrapperserver

PyScrapperServer is a Python scraper controlled through a web interface built with Bottle. It uses BeautifulSoup 4 to scrape e-books from websites that host them for free.

Host: GitHub
URL: https://github.com/ct83/pyscrapperserver
Owner: CT83
License: gpl-3.0
Created: 2017-09-08T02:27:57.000Z (almost 9 years ago)
Default Branch: master
Last Pushed: 2017-12-17T07:16:34.000Z (over 8 years ago)
Last Synced: 2025-12-01T03:19:30.567Z (7 months ago)
Language: Python
Homepage:
Size: 213 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# PyScrapperServer
This script is a Python scraper controlled via a web interface built with Bottle. BeautifulSoup 4 is used to scrape e-books from websites that host them for free.

## Preface
This project was created entirely on a Friday when I called in sick from college. It gave me an introduction to web servers, web frameworks, scraping in Python, and the Bottle framework.

## Introduction
The code runs on a Raspberry Pi connected to the local WiFi network, preferably with a static IP address. The user opens the web interface hosted on the Pi and pastes the link to the e-book they wish to download. The e-book is then scraped from that link and converted to PDF for easy reading.
tl;dr: A web server that scrapes the web.
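
The scraping step described above can be sketched roughly as follows. This is a minimal illustration, not the code in `BottleServer.py`: the helper name `extract_text` and the choice to collect `<p>` tags are assumptions, since the right selectors depend on the site being scraped.

```python
from bs4 import BeautifulSoup


def extract_text(html):
    """Return the paragraph text of an HTML page, one paragraph per line.

    Hypothetical helper: collecting <p> tags is an assumption; a real
    e-book site may need different selectors.
    """
    soup = BeautifulSoup(html, "html.parser")
    paragraphs = [p.get_text(" ", strip=True) for p in soup.find_all("p")]
    # Drop empty paragraphs so the output has no blank lines.
    return "\n".join(text for text in paragraphs if text)


if __name__ == "__main__":
    sample = (
        "<html><body><h1>Chapter 1</h1>"
        "<p>It was a dark night.</p><p> </p><p>The end.</p>"
        "</body></html>"
    )
    print(extract_text(sample))
```

In the real flow the HTML would come from the link the user pasted, for example via `requests.get(url).text`, and the resulting plain text would then be handed to the PDF conversion step.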

## How to Install?
1. Clone this repo.
2. Install the dependencies: `sudo apt-get install python-dev`, then `sudo pip install reportlab requests bs4 bottle`.
3. Open the boot script: `sudo nano /etc/rc.local`
4. Add `sudo python path_of_this_cloned_repo/BottleServer.py` to the end of the file, before `exit 0`, so the server starts at boot.
5. Reboot: `sudo reboot`

Done!
Now visit the server's IP address in a browser, for example `192.168.1.10:8080`.

## Dependencies
1. [txt2pdf](https://github.com/baruchel/txt2pdf "txt2pdf")
2. [bottle](https://github.com/bottlepy/bottle "bottle")
3. [reportlab](https://github.com/Distrotech/reportlab)
4. [Beautiful Soup](https://code.launchpad.net/beautifulsoup)
5. [requests](https://github.com/requests/requests)
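
Before starting the server, it can help to confirm that the listed packages are actually installed. The following is a small sketch using only the standard library; the helper name `missing_modules` is hypothetical. txt2pdf is left out of the check because it is distributed as a standalone script rather than an installable package.

```python
import importlib.util

# Import names for the packages listed above
# (Beautiful Soup 4 is imported as "bs4").
REQUIRED_MODULES = ["bottle", "reportlab", "bs4", "requests"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All listed modules are importable.")
```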

## Conclusion
The project successfully downloaded several e-books from websites, so it was a success.
