https://github.com/alsongard/movie_series_scrapper
Scrape data from the yts.mx website and analyze it using Streamlit
bash-script matplot pandas python3 scrapy streamlit
Last synced: 4 months ago
- Host: GitHub
- URL: https://github.com/alsongard/movie_series_scrapper
- Owner: alsongard
- Created: 2024-12-31T23:39:14.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2025-02-19T23:41:33.000Z (4 months ago)
- Last Synced: 2025-02-20T00:32:08.124Z (4 months ago)
- Topics: bash-script, matplot, pandas, python3, scrapy, streamlit
- Language: Python
- Homepage:
- Size: 61.5 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# YTS MOVIE SCRAPPER AND ANALYZER USING STREAMLIT
## Installation
1. Clone the repository:
```sh
git clone https://github.com/alsongard/movie_series_scrapper.git
cd movie_series_scrapper
```
2. Create a virtual environment and activate it:
```sh
python -m venv venv
source venv/bin/activate # On Windows use `venv\Scripts\activate`
```
3. Install the required packages:
```sh
pip install -r requirements.txt
```
4. Set up environment variables:
Create a `.env` file in the root directory and add your `SCRAPEOPS_API_KEY`:
```env
SCRAPEOPS_API_KEY=your_api_key_here
```
## Usage
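Before running the spider, it can be worth checking that the `.env` file from the installation step parses as expected. The sketch below is only an illustration: `read_env_file` is a hypothetical helper written with the standard library, and the project itself may load `SCRAPEOPS_API_KEY` differently (for example through a dotenv package). It parses a temporary file rather than the real `.env`.

```python
import os
import tempfile


def read_env_file(path):
    """Parse simple KEY=value lines, skipping blank lines and # comments."""
    values = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            # Drop optional surrounding quotes around the value.
            values[key.strip()] = value.strip().strip('"').strip("'")
    return values


# Demonstration against a temporary file instead of the real .env.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, ".env")
    with open(demo_path, "w", encoding="utf-8") as handle:
        handle.write("# ScrapeOps credentials\nSCRAPEOPS_API_KEY=your_api_key_here\n")
    env = read_env_file(demo_path)

api_key = env.get("SCRAPEOPS_API_KEY")
print("key found" if api_key else "SCRAPEOPS_API_KEY is missing")
```

To check the real file, call `read_env_file(".env")` from the project root.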
1. Run the Scrapy spider to scrape movie data:
```sh
scrapy crawl moviespider
```
2. After scraping, the data will be saved in `movieData.csv`. You can analyze this data using Python.
3. To visualize the data using Streamlit, run:
```sh
streamlit run app.py
```
## Files
- `scrapy.cfg`: Scrapy configuration file.
- `items.py`: Defines the models for the scraped items.
- `middlewares.py`: Contains custom middleware for the project.
- `pipelines.py`: Processes the scraped items.
- `settings.py`: Scrapy settings for the project.
- `moviespider.py`: The spider that scrapes movie data from YTS.
- `movieData.csv`: The CSV file where the scraped movie data is stored.
- `new_movies.csv`: Additional movie data for analysis.
## Streamlit App
The Streamlit app provides a user-friendly interface to visualize and analyze the scraped movie data. It includes features such as:
- Displaying movie details
- Filtering movies by category, rating, and release year
- Visualizing data trends and distributions

[streamlit-movie_streamlit-2025-02-13-03-02-43.webm](https://github.com/user-attachments/assets/0e6be360-a233-4c94-9120-306c3dd3378e)
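The filtering by category, rating, and release year described above could look roughly like the following. This is a minimal sketch, not the app's actual code: the column names (`title`, `genre`, `rating`, `year`) and the sample rows are assumptions, and the real `movieData.csv` may be laid out differently. The repository's topics list pandas, which is probably what the app uses; this version sticks to the standard library so it runs without extra packages.

```python
import csv
import io

# Hypothetical sample rows; the real movieData.csv columns may differ.
SAMPLE_CSV = """title,genre,rating,year
Example A,Action,7.8,2019
Example B,Drama,6.1,2021
Example C,Action,8.4,2023
"""


def load_movies(handle):
    """Read CSV rows and convert rating and year to numbers."""
    movies = []
    for row in csv.DictReader(handle):
        row["rating"] = float(row["rating"])
        row["year"] = int(row["year"])
        movies.append(row)
    return movies


def filter_movies(movies, genre=None, min_rating=0.0, year_from=None):
    """Keep movies matching the optional genre, rating, and year bounds."""
    result = []
    for movie in movies:
        if genre is not None and movie["genre"] != genre:
            continue
        if movie["rating"] < min_rating:
            continue
        if year_from is not None and movie["year"] < year_from:
            continue
        result.append(movie)
    return result


movies = load_movies(io.StringIO(SAMPLE_CSV))
selected = filter_movies(movies, genre="Action", min_rating=8.0, year_from=2020)
print([m["title"] for m in selected])  # prints ['Example C']
```

To try it on scraped data, pass `open("movieData.csv", newline="", encoding="utf-8")` to `load_movies` and adjust the column names to match the file.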
## Contributing
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
## License
This project is licensed under the MIT License. See the LICENSE file for details.
## Acknowledgements
- [Scrapy](https://scrapy.org/)
- [Streamlit](https://streamlit.io/)
- [YTS](https://yts.mx/)