https://github.com/pb319/scrapify

The repository contains some beginner-friendly resources to help you start web-scraping using Beautiful Soup.

beautifulsoup python webscraping


### Table of Contents
- [Resource](https://github.com/pb319/Scrapify#resource)
- [Objective](https://github.com/pb319/Scrapify#objective)
- [Approach](https://github.com/pb319/Scrapify#approach)
- [Output Files](https://github.com/pb319/Scrapify#output-files)

#### Resource:
YouTube Video Link: [Click Here](https://www.youtube.com/watch?v=XVv6mJpFOb0&t=2242s)

#### Objective:
- Get first-hand experience parsing HTML (tags, classes) with `Beautiful Soup` to find single or multiple elements.
- Create a database of the job descriptions and specifications available on `www.timesjobs.com`.
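
As a rough illustration of the first objective, the sketch below parses a small inline HTML string and looks up one element and then several. The markup, tag names, and class names are invented for this example and are not taken from the repository's `home.html`.

```python
from bs4 import BeautifulSoup

# Hypothetical sample page; tag and class names are made up for illustration.
HTML = """
<html><body>
  <h1 class="title">Courses</h1>
  <div class="card"><h5>Python for Beginners</h5><a href="#">Start</a></div>
  <div class="card"><h5>Python Web Development</h5><a href="#">Start</a></div>
</body></html>
"""

# "html.parser" is the parser bundled with Python, so no extra install is needed.
soup = BeautifulSoup(HTML, "html.parser")

# find() returns the first matching element (or None if nothing matches).
title = soup.find("h1", class_="title").text

# find_all() returns a list with every matching element.
course_names = [card.h5.text for card in soup.find_all("div", class_="card")]

print(title)
print(course_names)
```

Note that `class_` carries a trailing underscore because `class` is a reserved word in Python.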

#### Approach:
- We used a simple synthetic HTML page to understand how `Beautiful Soup` works. [HTML File](https://github.com/pb319/Scrapify/blob/main/home.html)
- Fetch multiple elements (`Posted, Company Name, Skill Requirements, More Info`) through an HTTP request to the job listings page.
- Finally, export the results as a CSV file.
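
The extraction and export steps above could look roughly like the sketch below. It is not the repository's `main.py`: the sample markup, the class names (`job`, `company`, `skills`, `posted`, `more`), and the helper names `parse_jobs` and `to_csv` are all assumptions for illustration. The HTML is inlined so the example runs offline; in practice it would come from the HTTP response, and the real class names would have to be read from the live page.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup standing in for a fetched job listings page.
SAMPLE_HTML = """
<ul>
  <li class="job">
    <h3 class="company"> Acme Corp </h3>
    <span class="skills"> python,   sql </span>
    <span class="posted">Posted few days ago</span>
    <a class="more" href="https://example.com/jobs/1">More</a>
  </li>
  <li class="job">
    <h3 class="company"> Globex </h3>
    <span class="skills"> python,   django </span>
    <span class="posted">Posted 2 days ago</span>
    <a class="more" href="https://example.com/jobs/2">More</a>
  </li>
</ul>
"""

# Column names follow the fields listed in the approach.
FIELDS = ["Posted", "Company Name", "Skill Requirements", "More Info"]


def parse_jobs(html):
    """Return one dict per job card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for job in soup.find_all("li", class_="job"):
        skills = job.find("span", class_="skills").get_text()
        rows.append({
            "Posted": job.find("span", class_="posted").get_text(strip=True),
            "Company Name": job.find("h3", class_="company").get_text(strip=True),
            # Collapse runs of whitespace inside the skills text.
            "Skill Requirements": " ".join(skills.split()),
            "More Info": job.find("a", class_="more")["href"],
        })
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = parse_jobs(SAMPLE_HTML)
csv_text = to_csv(rows)
print(csv_text)
```

To write a file instead of a string, the same `csv.DictWriter` can be pointed at `open("output.csv", "w", newline="")`; the `newline=""` argument keeps the `csv` module from emitting blank lines between rows on Windows.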

#### Output Files:
- [Primary Script](https://github.com/pb319/Scrapify/blob/main/synthetic.ipynb)
- [Python Script](https://github.com/pb319/Scrapify/blob/main/main.py)
- [CSV File](https://github.com/pb319/Scrapify/blob/main/output.csv)