https://github.com/arv-anshul/yt-watch-history
Analyse your YouTube watch history using Data Science, ML and NLP.
data-science docker docker-compose fastapi ml mlflow mlops mongodb nlp pydantic python3 streamlit youtube-api
- Host: GitHub
- URL: https://github.com/arv-anshul/yt-watch-history
- Owner: arv-anshul
- License: mit
- Created: 2023-08-24T17:31:38.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-05-11T09:58:52.000Z (10 months ago)
- Last Synced: 2024-05-11T10:49:57.224Z (10 months ago)
- Topics: data-science, docker, docker-compose, fastapi, ml, mlflow, mlops, mongodb, nlp, pydantic, python3, streamlit, youtube-api
- Language: Python
- Homepage: https://arv-anshul.github.io/projects/yt-watch-history
- Size: 188 KB
- Stars: 9
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
README
# YouTube Watch History Analysis
This project analyzes a user's YouTube watch history data downloaded from Google Takeout. It provides insights into watch patterns, content preferences, and overall YouTube consumption.
> [!IMPORTANT]
>
> This was my first project where I explored MLOps tools such as FastAPI, Docker, and MLflow.
> As the project grew in complexity, I found it challenging to maintain a clear development track.
>
> Therefore, I've decided to archive this version and rebuild it from scratch with a renewed focus on organization and maintainability.
>
> New project repo: [**@arv-anshul/yt-watch-history-v2**](https://github.com/arv-anshul/yt-wach-history-v2)

### Getting Your YouTube History Data
1. Go to the Google Takeout website: [Google Takeout](https://takeout.google.com/)
2. Sign in with your Google account.
3. Select "YouTube History" under "Choose data to export".
4. Choose **JSON** as the file type, then pick your delivery options.
5. Click "Create export".
6. Wait for the export process to complete and download the file.

##### Or refer to this blog at [dev.to](https://dev.to/ubershmekel/what-did-i-watch-most-on-youtube-1ol2).
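
Once the export is downloaded, a quick sanity check can confirm that it is readable. The sketch below is not taken from this project's code. It assumes the export is a JSON array in which each record may carry a `subtitles` list whose first entry holds the channel `name`; that layout, the helper names `load_history` and `top_channels`, and the file name in the usage note are all assumptions to check against your own file.

```python
import json
from collections import Counter
from pathlib import Path


def load_history(path):
    """Load the export, assumed to be a JSON array of watch records."""
    with Path(path).open(encoding="utf-8") as fh:
        return json.load(fh)


def top_channels(records, n=5):
    """Count watched videos per channel.

    Records without channel info (for example, removed videos) are skipped.
    """
    counts = Counter()
    for record in records:
        subtitles = record.get("subtitles") or []
        if subtitles and subtitles[0].get("name"):
            counts[subtitles[0]["name"]] += 1
    return counts.most_common(n)
```

For example, `top_channels(load_history("watch-history.json"))` would return the five most-watched channels as `(name, count)` pairs, assuming the file sits in the current directory under that name.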
### Benefits
- Gain valuable insights into your YouTube viewing habits.
- Discover your content preferences and identify areas of interest.
- Track your progress towards achieving your YouTube goals.
- Make informed decisions about your YouTube consumption.

### Project's Notebooks
The 📓 notebooks containing my analysis of the datasets used in this project are available in my [**@arv-anshul/notebooks**](https://github.com/arv-anshul/notebooks/tree/main/yt-watch-history) GitHub repository.
### Tech Stack
- Docker
- FastAPI
- MLflow
- MongoDB
- NLTK
- Plotly
- Polars
- Pydantic
- Ruff
- scikit-learn
- Streamlit
- YouTube

## Project Setup Guide
This guide helps you set up and run this project using Docker Compose. The project consists of a frontend service and a backend service.
### Prerequisites
- [🍀 MongoDB Database URL](https://mongodb.com)
- [💥 YouTube Data API v3 Key](https://developers.google.com/youtube/v3/docs/)
- [🐳 Docker](https://www.docker.com/get-started)
- [🐳 Docker Compose](https://docs.docker.com/compose/install/)

### Steps to Set Up
1. Clone the Repository:
```bash
git clone https://github.com/arv-anshul/yt-watch-history
```

2. Configuration:
   - Open the `docker-compose.yml` file in the project root.
   - Set the following environment variables in the `frontend` service:
     - `YT_API_KEY`: Replace `null` with your YouTube API key.
     - `API_HOST`: Should match the name of the backend service **(`backend` in this case)**.
     - `API_PORT`: Port number for the backend service **(default is `8001`)**.
     - `LOG_LEVEL`: Logging level **(default is `INFO`)**.
   - Set the following environment variables in the `backend` service:
     - `MONGODB_URL`: Replace `null` with your MongoDB URL.
     - `API_PORT`: Port number for the backend service **(default is `8001`)**.
     - `API_HOST`: Set to `"0.0.0.0"`.
     - `LOG_LEVEL`: Logging level **(default is `INFO`)**.

3. Build and Run:
```bash
docker-compose up --build
```

4. Access the application:
   - **Frontend:** Open a browser and go to `http://localhost:8501`.
   - **Backend:** Accessed internally via the configured API endpoints, or locally at `http://0.0.0.0:8001`.

> [!NOTE]
>
> - Frontend service runs on port `8501` locally.
> - Backend service runs on port `8001` locally.
> - Make sure no other services are running on these ports.
> - `/frontend` and `/backend` directories are mounted as volumes for the respective services.
> - `/frontend/data` and `/backend/ml_models` directories are mounted for persistent data storage.
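
Before running `docker-compose up`, it may help to confirm that the two ports mentioned in the note are free. The following is a minimal sketch using only the Python standard library and is not part of this project. The helper names `port_is_free` and `busy_ports` are hypothetical, and the check assumes that a failed bind on `127.0.0.1` means the port is taken.

```python
import socket

# Ports named in the note above; adjust if you changed docker-compose.yml.
DEFAULT_PORTS = {"frontend": 8501, "backend": 8001}


def port_is_free(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP socket can currently be bound to (host, port)."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            return False
    return True


def busy_ports(ports: dict) -> dict:
    """Return the subset of {name: port} entries whose port is already in use."""
    return {name: port for name, port in ports.items() if not port_is_free(port)}


if __name__ == "__main__":
    for name, port in busy_ports(DEFAULT_PORTS).items():
        print(f"Port {port} ({name}) is already in use")
```

If the script prints nothing, neither port appeared to be occupied at the moment of the check.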