Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/brutalsavage/facebook-post-scraper
Facebook Post Scraper 🕵️🖱️
https://github.com/brutalsavage/facebook-post-scraper
beautifulsoup4 facebook-scraper facebook-scraper-software facebook-scraper-tool python selenium-webdriver
Last synced: about 2 months ago
JSON representation
Facebook Post Scraper 🕵️🖱️
- Host: GitHub
- URL: https://github.com/brutalsavage/facebook-post-scraper
- Owner: brutalsavage
- License: gpl-3.0
- Created: 2019-07-31T22:51:56.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2023-01-03T13:41:25.000Z (over 1 year ago)
- Last Synced: 2024-06-28T10:36:51.014Z (3 months ago)
- Topics: beautifulsoup4, facebook-scraper, facebook-scraper-software, facebook-scraper-tool, python, selenium-webdriver
- Language: Python
- Homepage:
- Size: 37.1 KB
- Stars: 311
- Watchers: 15
- Forks: 114
- Open Issues: 31
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Facebook Scraper Selenium
Scrape Facebook Public Posts without using Facebook API
## What It can Do
- Scrape Public Post Text
- Raw Text
- Picture
- Link
- Scrape Likes and Top 3 React Numbers
- Scrape Public Post Comments
- Links in Comments
- Pictures in Comments## Install Requirements
Please make sure chrome is installed and ```chromedriver``` is placed in the same directory as the file
Find out which version of ```chromedriver``` you need to download in this link [Chrome Web Driver](http://chromedriver.chromium.org/downloads).
Place your Facebook login in info into ```facebook_credentials.txt```
Optional:
To download all videos from a specific page, make sure that [youtube-dl](https://github.com/ytdl-org/youtube-dl/) binary is downloaded locally.```sh
pip install -r requirements.txt
```## Usage
#### 1. Use scraper.py to print to screen or to file
```
usage: scraper.py [-h] -page PAGE -len LEN [-infinite INFINITE] [-usage USAGE]
[-comments COMMENTS]Facebook Page Scraper
optional arguments:
-h, --help show this help message and exitrequired arguments:
-page PAGE, -p PAGE The Facebook Public Page you want to scrape
-len LEN, -l LEN Number of Posts you want to scrapeoptional arguments:
-infinite INFINITE, -i INFINITE
Scroll until the end of the page (1 = infinite)
(Default is 0)
-usage USAGE, -u USAGE
What to do with the data: Print on Screen (PS), Write
to Text File (WT) (Default is WT)
-comments COMMENTS, -c COMMENTS
Scrape ALL Comments of Posts (y/n) (Default is n).
When enabled for pages where there are a lot of
comments it can take a while```
#### 2. Use ```extract()``` to grab list of posts for additional parsing
```
from scraper import extractlist = extract(page, len, etc..)
# do what you want with the list
```Return value of ```extract()``` :
```python
[
{'Post': 'Text text text text text....',
'Link' : 'https://link.com',
'Image' : 'https://image.com',
'Comments': {
'name1' : {
'text' : 'Text text...',
'link' : 'https://link.com',
'image': 'https://image.com'
}
'name2' : {
...
}
...
},
'Reaction' : { # Reaction only contains the top3 reactions
'LIKE' : int(number_of_likes),
'HAHA' : int(number_of_haha),
'WOW' : int(number_of_wow)
}}
...
]
```#### 3. Use ```download_entire_page_videos``` to download all videos from a specific Facebook page
Example:
```
download_entire_page_videos.py --chromedriver chromedriver.exe --youtube_dl youtube-dl.exe --fbpage https://www.facebook.com/groups/[GROUP_ID]/ --numofposts 100
```### Note:
- Please use this code for Educational purposes only
- Will continue to add additional features and data
- ~comment chains scraping~
- comment reaction scraping
- different comment display (Most Relevant, New, etc)