https://github.com/princeegy/linkedin-job-scraper
A tool for scraping publicly available jobs from LinkedIn
beautifulsoup4 python requests scraping
Last synced: 3 months ago
- Host: GitHub
- URL: https://github.com/princeegy/linkedin-job-scraper
- Owner: PrinceEGY
- License: mit
- Created: 2022-09-27T01:59:59.000Z (over 2 years ago)
- Default Branch: main
- Last Pushed: 2022-10-02T17:36:58.000Z (over 2 years ago)
- Last Synced: 2024-12-30T19:39:42.490Z (5 months ago)
- Topics: beautifulsoup4, python, requests, scraping
- Language: Python
- Homepage: https://princeegy-linkedin-job-scraper-app-c48fzc.streamlitapp.com/
- Size: 247 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# LinkedIn-Job-Scraper
Scrape publicly available jobs on LinkedIn using simple technologies.
For each job, the following fields are extracted:
> `Job Title`,
> `Organization Name`,
> `Country`,
> `City/State`,
> `Job Description`,
> `Post Time`,
> `Company Logo`,
> `Seniority Level`,
> `Employment Type`,
> `Job Function`,
> `Industries`,
> `Job Link`.
## General info
- This project was created as part of the Samsung Innovation Campus (SIC) training, for training purposes only, and I am not responsible for any misuse.
- The application itself is in the `app.py` file, which uses the scraping methods from the `scraping_module` folder.
- The methods and technologies used for scraping are intentionally simple.
## Technologies
The app is written entirely in `Python 3.10.1`, and the user interface was built with `streamlit 1.13.0`. Scraping relies on `BeautifulSoup`, `requests`, and `selenium`: `selenium` is used only to scroll the page so that more jobs are covered, `requests` is used to request the job URLs, and `BeautifulSoup` is used to extract information from the DOM returned by `selenium` and `requests`.
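The extraction step described above can be illustrated with a minimal sketch. It is not the code from `scraping_module`: the HTML snippet, the CSS class names, and the `parse_job_cards` helper are all hypothetical stand-ins, since this README does not show the real page markup or function names. It only demonstrates how `BeautifulSoup` turns returned HTML into job fields.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for a search results page.
# The real LinkedIn class names are NOT taken from this project.
SAMPLE_HTML = """
<ul>
  <li class="job-card">
    <a class="job-link" href="https://example.com/jobs/1">
      <h3 class="job-title">Data Engineer</h3>
    </a>
    <h4 class="org-name">Acme Corp</h4>
    <span class="job-location">Cairo, Egypt</span>
  </li>
  <li class="job-card">
    <a class="job-link" href="https://example.com/jobs/2">
      <h3 class="job-title">ML Engineer</h3>
    </a>
    <h4 class="org-name">Globex</h4>
    <span class="job-location">Giza, Egypt</span>
  </li>
</ul>
"""


def parse_job_cards(html: str) -> list[dict[str, str]]:
    """Extract a few fields from every job card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    jobs = []
    for card in soup.select("li.job-card"):
        jobs.append(
            {
                "Job Title": card.select_one(".job-title").get_text(strip=True),
                "Organization Name": card.select_one(".org-name").get_text(strip=True),
                "Location": card.select_one(".job-location").get_text(strip=True),
                "Job Link": card.select_one("a.job-link")["href"],
            }
        )
    return jobs


jobs = parse_job_cards(SAMPLE_HTML)
for job in jobs:
    print(job["Job Title"], "|", job["Organization Name"], "|", job["Job Link"])
```

In a real run, the HTML string would come from a `requests` response or from the page source that `selenium` exposes after scrolling, rather than from an inline sample.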
## Scraping methods
- Fast scraping: significantly increases scraping speed in exchange for collecting less information. The fields that are not fetched are Job Description, Seniority Level, Employment Type, Job Function, and Industries.
- Slow scraping: collects all available information for each job, in exchange for a longer scraping time.

**Fast scraping is quicker because it does not have to request each job link; it scrapes the information directly from the search link.**
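The trade-off between the two modes can be sketched as follows. This is an assumed structure, not the project's implementation: `scrape_jobs`, `fake_fetch_details`, and the sample cards are hypothetical names introduced for illustration. The only detail taken from the description above is which fields require a request per job link.

```python
from typing import Callable, Optional

# Fields that, per the description above, are only fetched in slow mode.
DETAIL_FIELDS = (
    "Job Description",
    "Seniority Level",
    "Employment Type",
    "Job Function",
    "Industries",
)


def scrape_jobs(
    search_cards: list[dict],
    fetch_details: Optional[Callable[[str], dict]] = None,
) -> list[dict]:
    """Fast mode when fetch_details is None, slow mode otherwise."""
    jobs = []
    for card in search_cards:
        job = dict(card)  # copy so the input cards are not mutated
        if fetch_details is not None:
            # Slow mode: one extra request per job link.
            details = fetch_details(job["Job Link"])
            for field in DETAIL_FIELDS:
                job[field] = details.get(field)
        jobs.append(job)
    return jobs


# Stand-in data and fetcher so the sketch runs without network access.
cards = [
    {"Job Title": "Data Engineer", "Job Link": "https://example.com/jobs/1"},
    {"Job Title": "ML Engineer", "Job Link": "https://example.com/jobs/2"},
]
requested_links = []


def fake_fetch_details(link: str) -> dict:
    requested_links.append(link)  # record each per-job request
    return {"Seniority Level": "Entry level", "Employment Type": "Full-time"}


fast_jobs = scrape_jobs(cards)                      # no per-job requests
slow_jobs = scrape_jobs(cards, fake_fetch_details)  # one request per job
print(len(requested_links), "detail requests made in slow mode")
```

Passing the detail fetcher as an argument keeps the search-page parsing shared between both modes, so the cost of slow scraping is isolated to the extra request made for each job.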