https://github.com/curiouslearner/geeksforgeeksscrapper
Scrapes g4g and creates PDF
geeksforgeeks hacktoberfest pdf scrapper webscraper webscraping
- Host: GitHub
- URL: https://github.com/curiouslearner/geeksforgeeksscrapper
- Owner: CuriousLearner
- License: mit
- Created: 2015-11-29T18:24:22.000Z (over 10 years ago)
- Default Branch: master
- Last Pushed: 2020-05-15T17:24:00.000Z (about 6 years ago)
- Last Synced: 2025-05-07T10:15:13.980Z (about 1 year ago)
- Topics: geeksforgeeks, hacktoberfest, pdf, scrapper, webscraper, webscraping
- Language: Python
- Homepage:
- Size: 147 KB
- Stars: 147
- Watchers: 9
- Forks: 65
- Open Issues: 12
Metadata Files:
- Readme: README.md
- License: LICENSE
# GeeksForGeeksScrapper
Scrapes [GeeksForGeeks](http://www.geeksforgeeks.org) and creates HTML and PDF files for a chosen category, with syntax highlighting for the code.
## Screenshots
Example of articles from the C category as HTML:

Example of articles from the C category as PDF:

## Installation
To use the scraper, first install `wkhtmltopdf`:
`$ sudo apt-get install wkhtmltopdf`
Then create a virtual environment:
`$ virtualenv /path/to/g4g-env`
Activate it:
`$ source /path/to/g4g-env/bin/activate`
Now install BeautifulSoup with pip:
`$ pip install beautifulsoup4`
or through the system package manager:
`$ sudo apt-get install python-bs4`
Alternatively, install all the Python dependencies from `requirements.txt` inside the virtual environment:
`$ pip install -r requirements.txt`
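Before running the scraper, it can help to confirm that the pieces above are in place. The following is a minimal sketch and not part of the repository: `check_prerequisites` is a hypothetical helper that looks for the `wkhtmltopdf` binary on `PATH` and checks whether the `bs4` module is importable in the active environment.

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report which prerequisites from the installation steps are available.

    Hypothetical helper for illustration; it is not part of g4g.py.
    """
    return {
        # wkhtmltopdf is a command-line tool, so look for it on PATH.
        "wkhtmltopdf": shutil.which("wkhtmltopdf") is not None,
        # The beautifulsoup4 package installs an importable module named "bs4".
        "bs4": importlib.util.find_spec("bs4") is not None,
    }


if __name__ == "__main__":
    for name, found in check_prerequisites().items():
        print("{}: {}".format(name, "found" if found else "missing"))
```

If either entry is reported as missing, repeat the matching installation step inside the activated virtual environment.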
## Run the G4G_Scrapper
`$ python g4g.py`
Choose the category you want to scrape from the menu and wait for the magic to happen :)
You can find the output as `G4G_.html` and `G4G_.pdf` in the same directory.
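For readers curious how the PDF step can work, here is a rough sketch of converting an HTML file with the `wkhtmltopdf` command-line tool from the installation steps. It is an assumption about the general approach, not the code in `g4g.py`; the function names and the file names in the usage comment are hypothetical.

```python
import subprocess


def build_pdf_command(html_path, pdf_path):
    """Return the wkhtmltopdf invocation as an argument list."""
    # Basic wkhtmltopdf usage is: wkhtmltopdf <input> <output>
    return ["wkhtmltopdf", html_path, pdf_path]


def html_to_pdf(html_path, pdf_path):
    """Convert an HTML file to PDF.

    Raises subprocess.CalledProcessError if wkhtmltopdf exits with an error.
    """
    subprocess.check_call(build_pdf_command(html_path, pdf_path))


# Example call with placeholder file names (illustration only):
# html_to_pdf("articles.html", "articles.pdf")
```

Passing the command as a list rather than a single shell string avoids quoting problems when a path contains spaces.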
### Disclaimer: This is strictly for educational purposes only. The author will not be liable for anything whatsoever that you do with this script.