https://github.com/vncsmyrnk/moviescraper
Web crawler that collects movie critic review data
Last synced: 11 days ago
- Host: GitHub
- URL: https://github.com/vncsmyrnk/moviescraper
- Owner: vncsmyrnk
- License: gpl-3.0
- Created: 2024-04-21T17:53:39.000Z (7 months ago)
- Default Branch: main
- Last Pushed: 2024-04-23T11:32:20.000Z (7 months ago)
- Last Synced: 2024-04-23T23:50:04.519Z (7 months ago)
- Language: Python
- Homepage:
- Size: 3.01 MB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge&logo=python&logoColor=ffdd54)
# Movie Scraper
This project uses Python and Scrapy to scrape pages of the [Metacritic](https://www.metacritic.com/) website and collect data about movies and their critic reviews.
All scraped data belongs to the respective platforms.
## Example output
```json
[
{...},
{
"title": "Fanny and Alexander (re-release)",
"avg_score": "100",
"year": "2004",
"description": "Set in Sweden in the early 20th century, this film focuses on the young children of a wealthy, theatrical family.",
"movie_uri": "/movie/fanny-and-alexander-re-release/",
"scores": [
{
"reviewer_name": "Chicago Tribune",
"score": "100"
},
{
"reviewer_name": "Chicago Sun-Times",
"score": "100"
},
{
"reviewer_name": "Boston Globe",
"score": "100"
},
{
"reviewer_name": "Variety",
"score": "100"
},
{
"reviewer_name": "Chicago Reader",
"score": "100"
},
{
"reviewer_name": "Village Voice",
"score": "90"
},
{
"reviewer_name": "TV Guide Magazine",
"score": "90"
}
],
"platform": "metacritic"
},
{...},
]
```
The complete scraped data is at [reviews_formatted.json](https://raw.githubusercontent.com/vncsmyrnk/moviescraper/main/reviews_formatted.json).
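Every score in the output is a JSON string, so it has to be converted before doing any arithmetic. The snippet below is a minimal sketch of reading records in this shape and computing a plain mean of the individual critic scores. The `critic_mean` helper and the shortened `SAMPLE` record are illustrative assumptions and are not part of the repository. The plain mean can differ from the `avg_score` field reported by the platform.

```python
import json
from statistics import mean

# Shortened copy of one record from the example output (illustrative only).
SAMPLE = """
[
  {
    "title": "Fanny and Alexander (re-release)",
    "avg_score": "100",
    "year": "2004",
    "scores": [
      {"reviewer_name": "Chicago Tribune", "score": "100"},
      {"reviewer_name": "Village Voice", "score": "90"},
      {"reviewer_name": "TV Guide Magazine", "score": "90"}
    ],
    "platform": "metacritic"
  }
]
"""


def critic_mean(movie):
    """Hypothetical helper: mean of the numeric critic scores, or None if there are none."""
    values = [
        int(entry["score"])
        for entry in movie.get("scores", [])
        if str(entry.get("score", "")).isdigit()  # skip empty or non-numeric scores
    ]
    return mean(values) if values else None


movies = json.loads(SAMPLE)
for movie in movies:
    print(movie["title"], movie["avg_score"], critic_mean(movie))
```

To run this on the full dataset, replace `SAMPLE` with the contents of `reviews_formatted.json` or of a file produced by the crawler.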
## Run with docker
```bash
git clone https://github.com/vncsmyrnk/moviescraper && cd moviescraper
docker run -it --rm -v "$(pwd)":/var/app -w /var/app python:3.9 bash
pip install -r requirements.txt # inside the container
scrapy crawl metacritic -O reviews.json # inside the container
```
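After the crawl finishes, it can be worth checking the generated `reviews.json` before using it. The following is a rough sketch that assumes the output is a JSON array of objects with the fields shown in the example output. The `find_incomplete` function and the `EXPECTED_KEYS` set are hypothetical names introduced here, not something the project provides.

```python
import json
from pathlib import Path

# Field names taken from the example output; treated here as an assumption
# about what every record should contain.
EXPECTED_KEYS = {
    "title", "avg_score", "year", "description",
    "movie_uri", "scores", "platform",
}


def find_incomplete(path):
    """Return (index, missing_keys) pairs for records that lack expected fields."""
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    problems = []
    for index, record in enumerate(records):
        missing = EXPECTED_KEYS - record.keys()
        if missing:
            problems.append((index, sorted(missing)))
    return problems


if __name__ == "__main__":
    output = Path("reviews.json")  # file written by `scrapy crawl ... -O reviews.json`
    if output.exists():
        for index, missing in find_incomplete(output):
            print(f"record {index} is missing: {', '.join(missing)}")
```

An empty result means every record carries all of the expected fields. It says nothing about whether the values themselves are correct.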