Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/gurbaaz27/linkedin-comments-scraper
Script to scrape comments (including name, profile link, pfp, designation, email(if present), and comment) from a LinkedIn post from the URL of the post.
https://github.com/gurbaaz27/linkedin-comments-scraper
linkedin linkedin-comments-scraper linkedin-post python python3 scraper selenium selenium-python selenium-webdriver webscraping
Last synced: 4 months ago
JSON representation
Script to scrape comments (including name, profile link, pfp, designation, email(if present), and comment) from a LinkedIn post from the URL of the post.
- Host: GitHub
- URL: https://github.com/gurbaaz27/linkedin-comments-scraper
- Owner: gurbaaz27
- Created: 2020-08-04T18:46:54.000Z (over 4 years ago)
- Default Branch: master
- Last Pushed: 2024-06-08T08:20:59.000Z (8 months ago)
- Last Synced: 2024-09-29T02:05:36.831Z (4 months ago)
- Topics: linkedin, linkedin-comments-scraper, linkedin-post, python, python3, scraper, selenium, selenium-python, selenium-webdriver, webscraping
- Language: Python
- Homepage:
- Size: 174 KB
- Stars: 77
- Watchers: 2
- Forks: 20
- Open Issues: 3
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# LinkedIn Comments Scraper
## Updates 🚀
- [x] Fix utf-8 encoding error
- [x] Shift configuration from Config.py to config.json
- [x] Add support for profile pictures of commentors
- [ ] Add support for pulling images in comment section
- [x] Add headless support
- [ ] Chrome extension
- [x] Show all replies support## Brief Overview
If you have used LinkedIn, you must have encountered many posts regarding helpful resources, and they would ask email address,to which replies in comments are usually like
```bash
Interested!```
I don't like this thing, like you can already share link in the post itself :angry:. **Nevertheless**, I thought to automate this work of collecting all emails.
All the comments, with columns of
- Name of the person commented
- Designation of the person
- Profile Picture URL
- Email address (if present in comment)
- Comment (UTF-8 encoded)are stored in a csv.
## Requirements
- [python](https://www.python.org/) (recommended : >= 3.7)
- [selenium](https://pypi.org/project/selenium/)- [web-driver](https://pypi.org/project/webdriver-manager/)
Install the dependencies:
```bash
pip install -r requirements.txt
```## Usage
- In `config.json`, enter the required url of LinkedIn Post in **post_url** variable:
```python
post_url = ""
```
> *__NOTE__*: If you forget to enter here, it will be asked during execution of script itself.- You can also change csv file name (in which scraped data will be stored) and dir name (in which profile pics will be downloaded) in `config.json` .
- Help:
```
usage: main.py [-h] [--headless] [--show-replies] [--download-pfp]Linkedin Scraping.
options:
-h, --help show this help message and exit
--headless Go headless browsing
--show-replies Load all replies to comments
--download-pfp Download profile pictures of commentors
```
> *__NOTE__*: Even if the flag `--download-pfp` isn't provided, URLs of image would get stored in the output csv.- Run the script:
```bash
python main.py
```Login email and password for your LinkedIn account will be asked and process would start.
### Configuration
`config.json` contains various fields, containing information about scraping the HTML elements by name or xpath, and other metadata
Suggestions and contributio ns are alwasy welcome!:smile:
## Star History
[![Star History Chart](https://api.star-history.com/svg?repos=gurbaaz27/linkedin-comments-scraper&type=Date)](https://star-history.com/#gurbaaz27/linkedin-comments-scraper&Date)