https://github.com/princeegy/linkedin-job-scraper

A tool for scraping publicly available jobs from LinkedIn

beautifulsoup4 python requests scraping

Last synced: 3 months ago

Host: GitHub
URL: https://github.com/princeegy/linkedin-job-scraper
Owner: PrinceEGY
License: mit
Created: 2022-09-27T01:59:59.000Z (over 2 years ago)
Default Branch: main
Last Pushed: 2022-10-02T17:36:58.000Z (over 2 years ago)
Last Synced: 2024-12-30T19:39:42.490Z (5 months ago)
Topics: beautifulsoup4, python, requests, scraping
Language: Python
Homepage: https://princeegy-linkedin-job-scraper-app-c48fzc.streamlitapp.com/
Size: 247 KB
Stars: 1
Watchers: 2
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

README

# LinkedIn-Job-Scraper

Scrape publicly available jobs on LinkedIn using simple technologies.

For each job, the following fields are extracted:

> `Job Title`,

> `Organization Name`,

> `Country`,

> `City/State`,

> `Job Description`,

> `Post Time`,

> `Company Logo`,

> `Seniority Level`,

> `Employment Type`,

> `Job Function`,

> `Industries`,

> `Job Link`.
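As a rough illustration, the fields above could be held in a single record type. This is only a sketch: the `JobRecord` class, its attribute names, and the sample values are assumptions made for this example and are not taken from the project's code.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class JobRecord:
    """Hypothetical container for one job; names mirror the field list above."""

    job_title: str
    organization_name: str
    country: str
    city_state: str
    post_time: str
    company_logo: str
    job_link: str
    # These five may be missing when a job's own page is not requested.
    job_description: Optional[str] = None
    seniority_level: Optional[str] = None
    employment_type: Optional[str] = None
    job_function: Optional[str] = None
    industries: Optional[str] = None


# Made-up sample values, used only to show the shape of a record.
record = JobRecord(
    job_title="Data Engineer",
    organization_name="Example Corp",
    country="Egypt",
    city_state="Cairo",
    post_time="2022-09-27",
    company_logo="https://example.com/logo.png",
    job_link="https://example.com/jobs/1",
)

# Convert to a plain dict, e.g. to build one table row per job.
row = asdict(record)
```

A flat record like this maps directly onto one row of a table, which is convenient when the results are displayed or exported.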

## General info

- This project was created as part of the Samsung Innovation Campus (SIC) training, for training purposes only. I am not responsible for any misuse.

- The application itself is defined in the `app.py` file, which uses the scraping methods from the `scraping_module` folder.

- The methods and technologies used for scraping are intentionally simple.

![App overview](https://github.com/PrinceEGY/LinkedIn-Job-Scraper/blob/main/images/app-img.png)

## Technologies

The app is written entirely in `Python 3.10.1`, and the user interface was created using `streamlit 1.13.0`.

All scraping is done with `selenium`, `requests`, and `BeautifulSoup`:

- `selenium` is used only to scroll the search page so that more jobs are covered

- `requests` is used to request the job URLs

- `BeautifulSoup` is used to extract information from the DOM returned by `selenium` and `requests`
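The extraction step can be sketched as follows. This is a minimal example under stated assumptions, not the project's actual code: the `parse_job_cards` function, the sample markup, and the class names `job-card`, `job-link`, `company`, and `location` are all invented for illustration, and LinkedIn's real markup differs.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a search results page.
SEARCH_HTML = """
<ul>
  <li class="job-card">
    <a class="job-link" href="https://example.com/jobs/1">Data Engineer</a>
    <span class="company">Example Corp</span>
    <span class="location">Cairo, Egypt</span>
  </li>
  <li class="job-card">
    <a class="job-link" href="https://example.com/jobs/2">ML Engineer</a>
    <span class="company">Sample Ltd</span>
    <span class="location">Giza, Egypt</span>
  </li>
</ul>
"""


def parse_job_cards(html: str) -> list[dict]:
    """Extract the summary fields from every job card in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.find_all("li", class_="job-card"):
        link = card.find("a", class_="job-link")
        location = card.find("span", class_="location").get_text(strip=True)
        # Split "City, Country" on the last comma.
        city, _, country = location.rpartition(", ")
        jobs.append(
            {
                "Job Title": link.get_text(strip=True),
                "Job Link": link["href"],
                "Organization Name": card.find(
                    "span", class_="company"
                ).get_text(strip=True),
                "City/State": city,
                "Country": country,
            }
        )
    return jobs


jobs = parse_job_cards(SEARCH_HTML)
```

In a real run, the HTML string would come from the browser after scrolling (selenium's `driver.page_source`) or from an HTTP response (`requests.get(url).text`) rather than from an inline constant.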

## Scraping methods

- Fast scraping: significantly increases scraping speed in exchange for less information. The fields that are not fetched are Job Description, Seniority Level, Employment Type, Job Function, and Industries.

- Slow scraping: scrapes all available information for each job, in exchange for a longer scraping time.

**Fast scraping is faster because it does not have to request each job link; it scrapes the information directly from the search link.**
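The difference between the two modes can be sketched like this. The example is an assumption-laden illustration rather than the project's implementation: `scrape_jobs`, `fetch_details`, and the stub `fake_fetch_details` are hypothetical names, and the stub replaces the real HTTP request so the example runs offline.

```python
from typing import Callable

# Fields that are only filled in by slow scraping.
DETAIL_FIELDS = (
    "Job Description",
    "Seniority Level",
    "Employment Type",
    "Job Function",
    "Industries",
)


def scrape_jobs(
    search_results: list[dict],
    fetch_details: Callable[[str], dict],
    fast: bool = True,
) -> list[dict]:
    """Return one dict per job; request each job link only in slow mode."""
    jobs = []
    for summary in search_results:
        job = dict(summary)  # copy the fields found on the search page
        if not fast:
            # Slow mode: one extra request per job link.
            details = fetch_details(job["Job Link"])
            for name in DETAIL_FIELDS:
                job[name] = details.get(name)
        jobs.append(job)
    return jobs


# Stub that records which URLs would have been requested.
calls: list[str] = []


def fake_fetch_details(url: str) -> dict:
    calls.append(url)
    return {
        "Job Description": "Build data pipelines.",
        "Seniority Level": "Entry level",
        "Employment Type": "Full-time",
        "Job Function": "Engineering",
        "Industries": "Software",
    }


search_results = [
    {"Job Title": "Data Engineer", "Job Link": "https://example.com/jobs/1"},
    {"Job Title": "ML Engineer", "Job Link": "https://example.com/jobs/2"},
]

fast_jobs = scrape_jobs(search_results, fake_fetch_details, fast=True)
requests_in_fast_mode = len(calls)

slow_jobs = scrape_jobs(search_results, fake_fetch_details, fast=False)
requests_in_slow_mode = len(calls) - requests_in_fast_mode
```

With N jobs on the search page, the slow path in this sketch makes N additional requests while the fast path makes none, which is where the speed difference comes from.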
