https://github.com/refeed/scrapy_facebooker
Collection of scrapy spiders which can scrape posts, images, and so on from public Facebook Pages.
- Host: GitHub
- URL: https://github.com/refeed/scrapy_facebooker
- Owner: refeed
- License: mit
- Created: 2017-07-15T12:51:11.000Z (almost 9 years ago)
- Default Branch: master
- Last Pushed: 2018-12-20T20:27:24.000Z (over 7 years ago)
- Last Synced: 2025-06-29T16:14:40.821Z (12 months ago)
- Topics: facebook, scraper, scraping, scrapy, spider
- Language: HTML
- Homepage:
- Size: 56.6 KB
- Stars: 26
- Watchers: 3
- Forks: 6
- Open Issues: 4
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
# scrapy_facebooker
`scrapy_facebooker` is a collection of Scrapy spiders that scrape
posts, images, and so on from public Facebook Pages.
These spiders are intended for archiving public Facebook pages. Use them at
your own risk!
Some spiders work without a Facebook account, while others work only
with a Facebook Graph API access token.
## How to prepare
Before using these spiders, install their dependencies.
A single command installs all of them:
```
pip install -r requirements.txt
```
This project is intended to run on Python 3.
## How to run
To run a spider, first choose which one to use.
The spiders available in this project are in
`/scrapy_facebooker/spiders/`.
For example, to run the `facebook_post` spider against the public Facebook
page with username `RHWEBsites` and write the output to a file
named `output.json`:
```
$ scrapy crawl facebook_post -a target_username=RHWEBsites -o output.json
```
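The same invocation can be scripted when several pages need archiving. The following is a minimal sketch and not part of this repository: the helper names `build_crawl_command` and `run_spider` are hypothetical, and it assumes `scrapy` is on the `PATH` and that the script runs from the project root. It only wraps the command shown above.

```python
import subprocess


def build_crawl_command(spider, target_username, output_path):
    """Return the argument list for the `scrapy crawl` command shown above."""
    return [
        "scrapy", "crawl", spider,
        "-a", f"target_username={target_username}",  # spider argument
        "-o", output_path,                           # output file
    ]


def run_spider(spider, target_username, output_path):
    """Run one crawl; raises CalledProcessError if scrapy exits non-zero."""
    subprocess.run(
        build_crawl_command(spider, target_username, output_path),
        check=True,
    )
```

With these helpers, `run_spider("facebook_post", "RHWEBsites", "output.json")` issues the same command as the example above.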
The spiders available in this repository are:
- `facebook_event_graph`
- `facebook_post_graph`
- `facebook_photo_graph`
- `facebook_video_graph`
- `facebook_event`
- `facebook_post`
- `facebook_photo`
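The list mixes two kinds of spider names. As a rough sketch, they can be split on the `_graph` suffix. That the `_graph` spiders are the ones requiring a Graph API access token is an assumption drawn only from the naming, so check each spider's source before relying on it. The helper name `split_by_graph_suffix` is hypothetical.

```python
# Spider names copied from the list above.
SPIDERS = [
    "facebook_event_graph",
    "facebook_post_graph",
    "facebook_photo_graph",
    "facebook_video_graph",
    "facebook_event",
    "facebook_post",
    "facebook_photo",
]


def split_by_graph_suffix(names):
    """Split names into (ending in '_graph', everything else)."""
    graph = [n for n in names if n.endswith("_graph")]
    plain = [n for n in names if not n.endswith("_graph")]
    return graph, plain


graph_spiders, plain_spiders = split_by_graph_suffix(SPIDERS)
# Assumption: the suffix signals Graph API usage; the README does not say so.
print("Named *_graph:", ", ".join(graph_spiders))
print("Other spiders:", ", ".join(plain_spiders))
```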
## License
The license is available in `LICENSE.txt` at the root of this project.