Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/invictusaman/glassdoor-webscraper
I designed a scraping 🕸️ tool to extract job posting data from Glassdoor. It returns job title, company name, job ID, location, salary, language, skills, and more.
data-analyst data-collection glassdoor glassdoor-scraper python3 webscraping
- Host: GitHub
- URL: https://github.com/invictusaman/glassdoor-webscraper
- Owner: invictusaman
- License: mit
- Created: 2024-08-25T03:54:57.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2024-08-25T03:59:30.000Z (3 months ago)
- Last Synced: 2024-08-25T04:42:41.026Z (3 months ago)
- Topics: data-analyst, data-collection, glassdoor, glassdoor-scraper, python3, webscraping
- Language: Python
- Homepage:
- Size: 0 Bytes
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Glassdoor Scraper
I designed a scraping 🕸️ tool to extract job posting data from Glassdoor. It returns job title, company name, job ID, location, salary, language, skills, and more.

*It was easier to extract data from Glassdoor than from Indeed because Glassdoor's job postings are organized and properly labelled, and Glassdoor also provides an estimated salary when one is not listed.*
**Thank you Glassdoor**
## Step 1: Install dependencies
Install required dependencies in your project folder.
```
pip install -r requirements.txt
```

## Step 2: Run Glassdoor_Scraper.py
Make sure you have the latest version of Chrome installed on your system. This step creates `scraped_glassdoor_job_file.csv` with all columns. *You can check the sample output in this repository itself; I extracted postings for the Data Analyst position in Canada.*
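Once the run finishes, the output CSV can be inspected with Python's standard `csv` module. A minimal sketch, using made-up sample rows and assumed column names (the real file's headers and columns may differ):

```python
import csv
import io

# Hypothetical sample mirroring a few of the columns the scraper reports
# (job title, company, job ID, location, salary); the real
# scraped_glassdoor_job_file.csv contains more columns.
sample = io.StringIO(
    "job_title,company_name,job_id,location,salary\n"
    "Data Analyst,Acme Corp,12345,Toronto ON,CA$70K (Glassdoor est.)\n"
    "Data Analyst,Beta Inc,67890,Vancouver BC,CA$65K (Glassdoor est.)\n"
)

# DictReader maps each row to a dict keyed by the header row.
reader = csv.DictReader(sample)
rows = list(reader)
print(len(rows))             # number of postings parsed -> 2
print(rows[0]["job_title"])  # -> Data Analyst
```

To read the actual file, swap the `StringIO` buffer for `open("scraped_glassdoor_job_file.csv", newline="")`.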
### Further Work:
Currently, only one web address can be processed per run. A planned improvement is to accept a list of addresses and iterate over them: fetch each URL in turn, scrape it, and produce either a single combined output or one output per URL.
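That multi-URL idea can be sketched as follows. Here `scrape_glassdoor` is a hypothetical stand-in for the existing single-URL logic in `Glassdoor_Scraper.py`, and the example URLs are purely illustrative:

```python
import csv

def scrape_glassdoor(url: str) -> list[dict]:
    # Placeholder for the real single-URL scraper: it would drive
    # Chrome via Selenium and return one dict per job posting.
    return [{"job_title": "Data Analyst", "source_url": url}]

# Illustrative search URLs; in practice these would be real
# Glassdoor search result pages.
urls = [
    "https://www.glassdoor.ca/Job/example-search-1",
    "https://www.glassdoor.ca/Job/example-search-2",
]

# Scrape every URL and flatten the results into one list.
all_jobs = [job for url in urls for job in scrape_glassdoor(url)]

# Write a single combined CSV instead of one file per run.
with open("scraped_glassdoor_jobs_combined.csv", "w", newline="") as f:
    writer = csv.DictWriter(f, fieldnames=["job_title", "source_url"])
    writer.writeheader()
    writer.writerows(all_jobs)
```

Iterating over the list directly avoids passing index values by hand, and tagging each row with its `source_url` keeps the combined output traceable back to the search it came from.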
#### Follow my data-analyst journey: [Portfolio_Link](https://www.amanbhattarai.com)