https://github.com/layer-se7en/web-scraping-sandbox

Python scripts for scraping data from Scrapethissite.com

aiohttp beautifulsoup beautifulsoup4 python sandbox webscraping

## Scrapethissite.com - Web Scraping Exercises

This repository contains Python scripts from my attempts at the web scraping exercises on Scrapethissite.com.

### Exercises:

1. [Hockey Teams: Forms, Searching and Pagination](https://www.scrapethissite.com/pages/forms/): Scrape NHL team stats from a page with pagination and tables.

2. [Oscar Winning Films: AJAX and Javascript](https://www.scrapethissite.com/pages/ajax-javascript/): Scrape information about award-winning films from a page that loads its content asynchronously.

3. [Countries of the World: A Simple Example](https://www.scrapethissite.com/pages/simple/): Scrape country information from a single page.
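
The first exercise combines table parsing with pagination. The sketch below shows both steps, and it is not the code from this repository. It uses only the standard library so that it runs without installing anything, whereas the scripts here are tagged with `beautifulsoup4` and `aiohttp`. The sample markup, the team rows, the helper names `TableParser`, `parse_table` and `page_url`, and the `page_num` / `per_page` query parameters are all assumptions made for illustration. Check them against the live page before relying on them.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode

# Made-up sample markup and values; the real page's structure may differ.
SAMPLE_HTML = """
<table>
  <tr><th>Team Name</th><th>Year</th><th>Wins</th></tr>
  <tr><td>Team A</td><td>1990</td><td>44</td></tr>
  <tr><td>Team B</td><td>1990</td><td>31</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return one dict per body row, keyed by the header row's cell text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


def page_url(base, page, per_page=25):
    """Build the URL for one results page (parameter names are assumed)."""
    return f"{base}?{urlencode({'page_num': page, 'per_page': per_page})}"


if __name__ == "__main__":
    for record in parse_table(SAMPLE_HTML):
        print(record)
    print(page_url("https://www.scrapethissite.com/pages/forms/", 2))
```

A real scraper would download each URL from `page_url`, pass the response body to `parse_table`, and stop once a page returns no rows.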

### Installation:

1. Clone this repository:

```bash
git clone git@github.com:eliasbnk/web-scraping-sandbox.git
```

2. Navigate to the project directory:

```bash
cd web-scraping-sandbox
```

3. Create a virtual environment:

```bash
python -m venv venv
```

4. Activate the virtual environment:

- Windows:

```bash
venv\Scripts\activate
```

- Mac/Linux:

```bash
source venv/bin/activate
```

5. Install dependencies:

```bash
pip install -r requirements.txt
```

### Usage:

Run a script to complete the corresponding exercise. For example:

```bash
python teams.py
```

### License:

MIT License. See [LICENSE](LICENSE).