{"id":20058699,"url":"https://github.com/roboflow/clip_video_app","last_synced_at":"2025-07-06T17:07:51.367Z","repository":{"id":202664109,"uuid":"707871059","full_name":"roboflow/clip_video_app","owner":"roboflow","description":"Flask-based web application designed to compare text and image embeddings using the CLIP model.","archived":false,"fork":false,"pushed_at":"2024-01-22T22:13:25.000Z","size":12045,"stargazers_count":22,"open_issues_count":4,"forks_count":4,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-06-30T00:14:55.869Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/roboflow.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-20T21:16:39.000Z","updated_at":"2025-01-24T11:25:21.000Z","dependencies_parsed_at":null,"dependency_job_id":"c4423456-a006-43a9-bf5d-ebbd62c63090","html_url":"https://github.com/roboflow/clip_video_app","commit_stats":null,"previous_names":["roboflow/clip_video_app"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/roboflow/clip_video_app","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roboflow%2Fclip_video_app","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roboflow%2Fclip_video_app/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roboflow%2Fclip_video_app/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roboflow%2Fclip_video_app/manifests","owne
r_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/roboflow","download_url":"https://codeload.github.com/roboflow/clip_video_app/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/roboflow%2Fclip_video_app/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":263940313,"owners_count":23533012,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-11-13T13:03:14.870Z","updated_at":"2025-07-06T17:07:51.350Z","avatar_url":"https://github.com/roboflow.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CLIP Video Investigator\n\n![clip_video_investigator](media/overfiewGif.gif)\n\n## Overview\n\nCLIP Video Investigator is a Flask-based web application designed to compare text and image embeddings using the CLIP model. The application integrates OpenCV for video processing and Plotly for data visualization to accomplish the following:\n\n1. Play a video in a web browser.\n2. Pause and resume video playback.\n3. Compare CLIP embeddings of video frames with text embeddings.\n4. Visualize the similarity between text and image embeddings in real time using a Plotly plot.\n5. Jump to specific frames by clicking on the Plotly plot.\n\nYouTube video: https://www.youtube.com/watch?v=XllrtZnPL6M\n\n### Why This Is Useful\n\nUnderstanding the relationship between text and image embeddings can provide insights into how well a model generalizes across modalities. 
By plotting these values in real time, researchers and engineers can:\n\n- Identify key frames where text and image embeddings are highly aligned or misaligned.\n- Debug and fine-tune the performance of multimodal models.\n- Gain insights into the temporal evolution of embeddings in video data.\n- Enable more effective search and retrieval tasks for video content.\n\n## Features\n\n- **Video Playback**: Uses OpenCV to read video frames and displays them in the web browser.\n\n- **Play/Pause**: Allows the user to start and stop video playback.\n\n- **Data Visualization**: Uses Plotly to plot the similarity between the CLIP embedding of each video frame and the text embeddings.\n\n- **Interactive Plot**: Allows the user to click on the plot to jump to specific frames in the video.\n\n- **Reset Functionality**: Resets the application to its initial state.\n\n- **Embedding Caching**: Pickle files of the text and image frame embeddings are saved for each video in the `/embeddings` folder. Later runs on the same video can reuse these files instead of regenerating the embeddings.\n\n## Configuration\n\nA `config.yaml` file is used to specify various settings for the application:\n\n```yaml\nroboflow_api_key: \"\"  # Roboflow API key\nvideo_path: \"\"  # Path to video file\nCLIP:\n  - wall\n  - tile wall\n  - large tile wall\n```\n\n- `roboflow_api_key`: Your API key for Roboflow.\n- `video_path`: The path to the video file you want to analyze.\n- `CLIP`: A list of text inputs for which you want to generate CLIP embeddings.\n\n## Folder Layout\n\n```\nclip_investigator/\n├── config.yaml\n├── scripts/\n│   ├── clip_app.py\n│   └── clip_functions.py\n├── embeddings/\n│   └── example.pkl\n├── static/\n│   ├── css/\n│   │   └── style.css\n│   └── js/\n│       └── main.js\n└── templates/\n    └── index.html\n```\n\n- `clip_app.py`: The main Flask application file.\n- `config.yaml`: Configuration file for specifying settings.\n- `embeddings/`: Folder where pickle files of text and image embeddings are stored.\n- `static/`: 
Contains static files like CSS and JavaScript.\n- `templates/`: Contains HTML templates.\n\n## Installation\n\n### Prerequisites\n\n- Python 3.x\n- Virtualenv (optional but recommended)\n\n### Steps\n\n1. Clone the repository.\n    ```bash\n    git clone https://github.com/roboflow/clip_video_app.git\n    ```\n\n2. Navigate to the project directory.\n    ```bash\n    cd clip_video_app\n    ```\n\n3. (Optional) Create a virtual environment.\n\n4. Install the dependencies.\n    ```bash\n    pip install -r requirements.txt\n    ```\n\n## Usage\n**You must also be running the Roboflow Inference server locally!**\n\n0. Update the `config.yaml` file with your Roboflow API key and the path to your video file (or use the sample file in the `/data` folder).\n\n1. Start the Flask application.\n    ```bash\n    python scripts/clip_app.py\n    ```\n\n2. Open a web browser and navigate to `http://localhost:5000`.\n\n3. Use the \"Start\" and \"Stop\" buttons to control video playback.\n\n4. View real-time data related to the video in the Plotly plot below the video.\n\n5. Click on the Plotly plot to jump to specific video frames.\n\n## Troubleshooting\n\n- **WebSocket Errors**: If you encounter WebSocket errors, check the browser console for specific error messages. The application has built-in error handling that attempts to reconnect.\n\n- **Plotly Click Events**: If click events are not detected on the Plotly plot after a reset, reload the page.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froboflow%2Fclip_video_app","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Froboflow%2Fclip_video_app","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Froboflow%2Fclip_video_app/lists"}