https://github.com/kenmwaura1/simple-web-scraper
Simple web scraper to get player data using beautifulsoup4, with PostgreSQL as the database and SQLAlchemy as the ORM.
- Host: GitHub
- URL: https://github.com/kenmwaura1/simple-web-scraper
- Owner: KenMwaura1
- License: mit
- Created: 2020-02-05T18:27:17.000Z (over 5 years ago)
- Default Branch: master
- Last Pushed: 2025-08-04T02:42:00.000Z (2 months ago)
- Last Synced: 2025-08-07T21:07:44.562Z (2 months ago)
- Topics: beautifulsoup4, postgres, postgresql, python, requests, scraper, sqlalchemy
- Language: Python
- Homepage:
- Size: 653 KB
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 3
Metadata Files:
- Readme: README.md
- License: LICENSE
# simple-web-scraper

This repository contains code for a web scraper for [Lifetime Leaderboards \| UMG Gaming](https://www.umggaming.com/leaderboards).
It makes use of beautifulsoup4 and requests for scraping, PostgreSQL as the database, and SQLAlchemy as the ORM to insert data into the db.

## Accompanying Blog Post
[Get Started with a Web Scraping Project](https://dev.to/ken_mwaura1/getting-started-with-a-web-scraping-project-10ej)
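The scraping flow can be sketched roughly as follows. The HTML below is an illustrative stand-in, since the real leaderboard page's markup may differ, and `parse_players` is a hypothetical helper, not the repo's actual function:

```python
import requests
from bs4 import BeautifulSoup

# Illustrative HTML standing in for the leaderboard page; the real page
# at https://www.umggaming.com/leaderboards will have different markup.
SAMPLE_HTML = """
<table>
  <tbody>
    <tr><td>1</td><td>PlayerOne</td><td>12000</td></tr>
    <tr><td>2</td><td>PlayerTwo</td><td>9500</td></tr>
  </tbody>
</table>
"""

def parse_players(html):
    """Extract (rank, username, xp) rows from a leaderboard-style table."""
    soup = BeautifulSoup(html, "html.parser")
    players = []
    for row in soup.select("tbody tr"):
        rank, username, xp = (td.get_text(strip=True) for td in row.find_all("td"))
        players.append({"rank": int(rank), "username": username, "xp": int(xp)})
    return players

if __name__ == "__main__":
    # Against the live page you would fetch the HTML with requests first:
    # html = requests.get("https://www.umggaming.com/leaderboards", timeout=10).text
    print(parse_players(SAMPLE_HTML))
```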
## Installation
1. Clone this Repo
`git clone https://github.com/KenMwaura1/simple-web-scraper`
2. Change into the simple-web-scraper folder
`cd simple-web-scraper`
3. Create a virtualenv
`python3 -m virtualenv env`
4. Activate virtualenv
`source env/bin/activate` OR use pipenv `pipenv install`
5. Install the required dependencies
`python3 -m pip install -r requirements.txt`
## Database
Ensure Postgres is installed locally and running on port 5432
OR
run a Postgres container with Docker on port 5432 ([Easy PostgreSQL 10 and pgAdmin 4 Setup with Docker](https://info.crunchydata.com/blog/easy-postgresql-10-and-pgadmin-4-setup-with-docker))
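If you prefer Docker, a one-liner like the following starts a disposable Postgres container matching the default credentials below (the container name and image tag are arbitrary choices; `POSTGRES_USER`, `POSTGRES_PASSWORD`, and `POSTGRES_DB` are the official postgres image's environment variables):

```shell
docker run --name xp-db \
  -e POSTGRES_USER=test \
  -e POSTGRES_PASSWORD=testpassword \
  -e POSTGRES_DB=xp_db \
  -p 5432:5432 -d postgres
```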
The default db credentials are:
* host: localhost
* port: 5432
* user: test
* db: xp_db
* password: testpassword

Optionally, you can also set the db credentials as environment variables:
```shell
export DATABASE_URL='postgres://test:testpassword@localhost:5432/xp_db'
```

or copy the included .env example file:
```shell
cp .env.example .env
```

Add your credentials to the .env file.
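The scripts can then build their connection string from the environment. A minimal stdlib-only sketch, assuming the `DATABASE_URL` variable name from the export above, with a fallback to the documented default credentials:

```python
import os

# Default matches the credentials documented above; DATABASE_URL, if set
# (exported in the shell or loaded from .env), takes precedence.
DEFAULT_URL = "postgresql://test:testpassword@localhost:5432/xp_db"

def database_url():
    """Return the connection string, preferring the environment variable."""
    return os.environ.get("DATABASE_URL", DEFAULT_URL)

if __name__ == "__main__":
    print(database_url())
```

Note that SQLAlchemy 1.4+ only accepts the `postgresql://` scheme; a URL with the older `postgres://` prefix will be rejected by newer SQLAlchemy versions.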
## Usage
1. Run the scraper
1a. To run the scraper and insert xp data into the db
```shell
python xp_scrape.py
```

1b. To run the scraper and insert earnings data into the db
```shell
python earnings_scrape.py
```

## Tests
To run the tests, run the following command at the root of the project:
```shell
pytest .
```
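Because SQLAlchemy abstracts the database, the ORM layer can be exercised against an in-memory SQLite engine so the tests need no running Postgres server. A sketch of that pattern, where the `Player` model is hypothetical and stands in for whatever models the scraper scripts actually define:

```python
from sqlalchemy import Column, Integer, String, create_engine
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()

class Player(Base):
    """Hypothetical model standing in for the repo's actual table definition."""
    __tablename__ = "players"
    id = Column(Integer, primary_key=True)
    username = Column(String, nullable=False)
    xp = Column(Integer, nullable=False)

def test_insert_player():
    # Swap Postgres for in-memory SQLite so the test is self-contained.
    engine = create_engine("sqlite://")
    Base.metadata.create_all(engine)
    with Session(engine) as session:
        session.add(Player(username="PlayerOne", xp=12000))
        session.commit()
        assert session.query(Player).count() == 1
```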