{"id":28511607,"url":"https://github.com/armaggheddon/gstgeminivision","last_synced_at":"2025-10-09T12:16:38.208Z","repository":{"id":292413853,"uuid":"980836658","full_name":"Armaggheddon/GstGeminiVision","owner":"Armaggheddon","description":"Let your GStreamer pipelines describe what they see! 👁️‍🗨️ GstGeminiVision brings Google's Gemini Vision AI to your media streams for some serious (and fun!) video analysis. 🎥🤖✨","archived":false,"fork":false,"pushed_at":"2025-05-10T18:59:54.000Z","size":4720,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-08-06T10:44:23.133Z","etag":null,"topics":["c-programming","computer-vision","docker","gemini-api","google-gemini","gstreamer","gstreamer-plugins","python","video-analysis","vision-api"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Armaggheddon.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-05-09T19:56:28.000Z","updated_at":"2025-06-04T04:12:47.000Z","dependencies_parsed_at":null,"dependency_job_id":"baeb3df6-2f34-459e-ac55-0e576d463b3f","html_url":"https://github.com/Armaggheddon/GstGeminiVision","commit_stats":null,"previous_names":["armaggheddon/gstgeminivision"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Armaggheddon/GstGeminiVision","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Armaggheddon%2FGstGeminiVision","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/r
epositories/Armaggheddon%2FGstGeminiVision/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Armaggheddon%2FGstGeminiVision/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Armaggheddon%2FGstGeminiVision/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Armaggheddon","download_url":"https://codeload.github.com/Armaggheddon/GstGeminiVision/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Armaggheddon%2FGstGeminiVision/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279001448,"owners_count":26083078,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-09T02:00:07.460Z","response_time":59,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["c-programming","computer-vision","docker","gemini-api","google-gemini","gstreamer","gstreamer-plugins","python","video-analysis","vision-api"],"created_at":"2025-06-09T00:07:29.423Z","updated_at":"2025-10-09T12:16:38.202Z","avatar_url":"https://github.com/Armaggheddon.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003cdiv id=\"top\"\u003e\u003c/div\u003e\n\u003cbr/\u003e\n\u003cbr/\u003e\n\u003cbr/\u003e\n\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/images/gstgeminivision_logo.png\"\u003e\n\u003c/p\u003e\n\u003ch1 
align=\"center\"\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision\"\u003eGstGeminiVision\u003c/a\u003e\n\u003c/h1\u003e\n\u003cp align=\"center\"\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision/commits/master\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/last-commit/Armaggheddon/GstGeminiVision\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision\"\u003e\n    \u003cimg src=\"https://img.shields.io/badge/Maintained-yes-green.svg\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision/issues\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/issues/Armaggheddon/GstGeminiVision\"\u003e\n    \u003c/a\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision/blob/master/LICENSE\"\u003e\n    \u003cimg src=\"https://img.shields.io/github/license/Armaggheddon/GstGeminiVision\"\u003e\n    \u003c/a\u003e\n\u003c/p\u003e\n\u003cp align=\"center\"\u003e\n    Transform your media pipelines with AI-powered insights! 🎥🤖\n    \u003cbr/\u003e\n    \u003cbr/\u003e\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision/issues\"\u003eReport Bug\u003c/a\u003e\n    •\n    \u003ca href=\"https://github.com/Armaggheddon/GstGeminiVision/issues\"\u003eRequest Feature\u003c/a\u003e\n\u003c/p\u003e\n\n---\n\nEver wondered what a GStreamer pipeline would say if it could talk? Now it can (almost)! With **GstGeminiVision**, you can inject the power of Google's Gemini Vision API directly into your GStreamer media pipelines. Turn your video streams into insightful descriptions, automate content analysis, or just have some fun making your videos self-aware! 🤖🎬\n\nhttps://github.com/user-attachments/assets/91699974-dd6d-4957-9aa8-6e85610f3c85\n\n---\n\n\n## 🤨 What is GstGeminiVision?\n\nGstGeminiVision is a GStreamer plugin that acts as a bridge between your live video or image streams and the Google Gemini Vision API. 
It periodically captures frames, sends them to Gemini for analysis based on your prompt, and then makes the generated description available either as GStreamer metadata or through a GObject signal.\n\nImagine:\n*   Generating visual descriptions for accessibility purposes.\n*   Creating a security camera that describes what it sees.\n*   Building interactive art installations that react to visual input.\n*   ...the possibilities are as vast as your imagination (and Gemini's capabilities)!\n\n\n---\n\n## 🌟 Features\n\n- **Seamless GStreamer Integration:** Works like any other GStreamer element.\n- **Google Gemini Power:** Leverages the advanced multimodal capabilities of the Gemini API.\n- **Configurable Analysis:**\n    - Set your own **API Key**.\n    - Craft custom **prompts** to guide Gemini's analysis.\n    - Choose your preferred Gemini **model** (e.g., `gemini-2.0-flash`).\n    - Control the **analysis interval**.\n    - Fine-tune **generation parameters** (temperature, max tokens, top-P, top-K, stop sequences).\n- **Flexible Output:**\n    - Receive descriptions via a GObject **signal** (`description-received`).\n    - Embed descriptions directly into the GStreamer buffer as **metadata** (GstGeminiDescriptionMeta).\n- **Asynchronous Processing:** API calls are handled in a separate thread to keep your pipeline flowing smoothly.\n- **Example Applications:** Comes with C and Python examples to get you started quickly.\n- **Dockerized Environment:** Includes a Dockerfile for easy building and testing.\n\n---\n\n## 🎬 See it in Action!\n\n*   Show a `videotestsrc` pipeline running with the plugin.\n*   Display the console output from `example.py` or `gemini_vision_example.c` showing the descriptions.\n*   *Bonus:* If you have a more complex demo (e.g., overlaying text on video), showcase that!\n\n![GstGeminiVision in Action](docs/images/examplec_screen.png)\n![GstGeminiVision in Action](docs/images/examplepy_screen.png)\n\n```bash\ngst-launch-1.0 videotestsrc ! 
videoconvert ! geminivision api-key=\"YOUR_KEY\" prompt=\"What is this?\" ! fakesink\n```\n```console\nSetting pipeline to PLAYING state...\nPipeline running...\nPress Ctrl+C to quit\nPipeline state changed from NULL to READY\nPipeline state changed from READY to PAUSED\nPipeline state changed from PAUSED to PLAYING\n\n=================================\nFrame time: 0.000000000 (PTS: 0)\nDescription: That's a color bars test pattern, used to adjust color settings on television screens.\n\n=================================\n\n^CInterrupt received, stopping...\nCleaning up...\nPipeline stopped.\nCleaning up...\nPipeline stopped.\n```\n\n---\n\n## 🛠️ Getting Started\n\nReady to give your GStreamer pipelines a voice? Let's go!\n\n### Prerequisites\n\n- **GStreamer:** Core GStreamer libraries and development files (version 1.16+ recommended).\n- **Build Tools:** `meson`, `ninja`, `gcc` (or your C compiler), `pkg-config`.\n- **Dependencies for the Plugin:**\n    - `libglib2.0-dev`\n    - `libgstreamer-plugins-base1.0-dev`\n    - `libcurl4-openssl-dev` (or your system's cURL dev package)\n    - `libjson-c-dev`\n    - `libjpeg-dev`\n    - `libgirepository1.0-dev` \u0026 `gobject-introspection` (for GObject Introspection, used by Python example)\n- **Python 3** (for the Python example):\n    - `python3-gi`\n    - `python3-gst-1.0`\n- **A Google Gemini API Key:** Get yours from the [Google AI Studio](https://aistudio.google.com/apikey).\n\n### Building the Plugin\n\n1.  **Clone the repository (if you haven't already):**\n    ```bash\n    git clone https://github.com/Armaggheddon/GstGeminiVision.git\n    cd GstGeminiVision\n    ```\n\n2.  **Navigate to the plugin directory:**\n    ```bash\n    cd gst-gemini-plugin\n    ```\n\n3.  **Configure and build with Meson \u0026 Ninja:**\n    First, set up the build directory using Meson. 
The `--prefix=/usr` ensures that a subsequent install places files in standard system locations.\n    ```bash\n    meson setup build --prefix=/usr --buildtype=release --wipe\n    ```\n    Then, compile the plugin:\n    ```bash\n    ninja -C build\n    ```\n    Your compiled plugin shared object (e.g., `libgstgeminivision.so`) will be located in the `gst-gemini-plugin/build/src/` directory (or similar, depending on your Meson structure).\n\n4.  **Install the Plugin (Optional, but Recommended for System-Wide Access):**\n    To make the plugin and its development files available system-wide, run the install command (this usually requires root privileges):\n    ```bash\n    sudo ninja -C build install\n    ```\n    This command will copy the necessary files to standard system locations. Based on a typical installation with `--prefix=/usr`, the files will be placed as follows:\n    *   The plugin library: `libgstgeminivision.so` to `/usr/lib/x86_64-linux-gnu/gstreamer-1.0/`\n    *   GObject Introspection data:\n        *   `GstGeminiVision-1.0.gir` to `/usr/share/gir-1.0/`\n        *   `GstGeminiVision-1.0.typelib` to `/usr/lib/x86_64-linux-gnu/girepository-1.0/`\n    *   Pkg-config file: `gstgeminivision.pc` (or similar) to `/usr/lib/x86_64-linux-gnu/pkgconfig/`\n\n    *(Note: The exact paths like `x86_64-linux-gnu` might vary slightly based on your Linux distribution's multiarch setup. The `--prefix` you used with `meson setup` determines the base for these paths.)*\n\n    After installation, GStreamer should be able to automatically discover the plugin. You might need to clear GStreamer's cache if it doesn't pick it up immediately (though `ninja install` often triggers this).\n\n### Running the Examples\n\nMake sure GStreamer can find your newly built plugin. 
You can either install it system-wide (`sudo ninja -C build install` - requires Meson install target to be configured) or, more easily for development, set the `GST_PLUGIN_PATH`:\n\n```bash\nexport GST_PLUGIN_PATH=$(pwd)/gst-gemini-plugin/build:$GST_PLUGIN_PATH\n# For Python introspection (if not installed system-wide):\nexport GI_TYPELIB_PATH=$(pwd)/gst-gemini-plugin/build:$GI_TYPELIB_PATH\n```\n\nAnd don't forget your API key!\n```bash\nexport GST_GEMINI_API_KEY=\"YOUR_API_KEY\"\n```\nOr by prepending it when running the examples as `GST_GEMINI_API_KEY=\"YOUR_API_KEY\" ./example.py`.\n\n### C Example\nNavigate to the examples directory and compile/run:\n```bash\ncd examples\ngcc gemini_vision_example.c -o gemini_vision_example $(pkg-config --cflags --libs gstreamer-1.0 glib-2.0 gobject-2.0)\n./gemini_vision_example\n```\n\n### Python Example\nNavigate to the examples directory and run:\n```bash\ncd examples\npython3 example.py\n```\n\nYou should see descriptions from Gemini printed to the console! 🚀\n\n---\n\n## 🐳 Docker: Your AI-Powered Media Lab in a Box!\n\nWant to dive straight into the action without wrestling with dependencies? Our Docker setup is your golden ticket! 🎟️ It's like having a pre-configured media lab, ready to build, test, and run GstGeminiVision with just a few commands. No more \"it works on my machine\" – it'll work in *this* machine!\n\n**Step 1: Build the All-Powerful Docker Image**\n\nFirst, conjure up your Docker image. This image contains all the tools and magic needed. From your `GstGeminiVision` project root:\n```bash\ndocker build -t gst-gemini-vision .\n```\n*(Psst! 
If you've already built it, you can skip this step unless you've made changes to the Dockerfile or the plugin build process itself.)*\n\n**Step 2: Unleash the Entrypoint Script!**\n\nThe Docker image comes with a super-handy `entrypoint.sh` script that acts as your mission control. You tell it what to do, and it handles the nitty-gritty. Here are your commands, Captain:\n\n- **`build` (Default Action): Compile the Mighty Plugin!**\n    Just want to build the main `geminivision` plugin? This is your command. It compiles the plugin but doesn't install it system-wide in the container. Perfect for a quick compilation check.\n    ```bash\n    # Run from your GstGeminiVision project root\n    docker run --rm \\\n        --volume $(pwd)/gst-gemini-plugin:/builder \\\n        gst-gemini-vision build\n    ```\n\n- **`build-examples`: Build the Plugin \u0026 The Examples!**\n    This action first ensures the main `geminivision` plugin is built and installed *inside the container*. Then, it gallops over to your examples directory (`/examples` in the container) and builds them (either using the Makefile or compiling C files directly).\n    ```bash\n    # Run from your GstGeminiVision project root\n    docker run --rm \\\n        --volume $(pwd)/gst-gemini-plugin:/builder \\\n        --volume $(pwd)/examples:/examples \\\n        gst-gemini-vision build-examples\n    ```\n\n- **`test-examples \u003cexample_script_name\u003e`: The Grand Showcase!**\n    This is where the real fun begins! This action:\n    1.  If you specify a C example (e.g., `gemini_vision_example.c`), it compiles it on the fly if not already built.\n    1.  
Runs your chosen example script (C or Python)!\n    ✨ **Requires `GST_GEMINI_API_KEY`!** ✨\n    ```bash\n    # Example for the C script (gemini_vision_example.c):\n    # Run from your GstGeminiVision project root\n    docker run --rm \\\n        -e GST_GEMINI_API_KEY=\"YOUR_ACTUAL_API_KEY\" \\\n        -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \\\n        --volume $(pwd)/gst-gemini-plugin:/builder \\\n        --volume $(pwd)/examples:/examples \\\n        gst-gemini-vision test-examples gemini_vision_example.c\n\n    # Example for the Python script (gemini_vision_example.py):\n    # Run from your GstGeminiVision project root\n    docker run --rm \\\n        -e GST_GEMINI_API_KEY=\"YOUR_ACTUAL_API_KEY\" \\\n        -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \\\n        --volume $(pwd)/gst-gemini-plugin:/builder \\\n        --volume $(pwd)/examples:/examples \\\n        gst-gemini-vision test-examples gemini_vision_example.py\n    ```\n    Don't forget to replace `\"YOUR_ACTUAL_API_KEY\"`! The X11 forwarding lines are for examples that pop up a video window.\n\n- **`shell`: Your Personal Command Deck!**\n    Want to poke around inside the container? Need to run some custom commands or debug something? The `shell` action drops you right into an interactive command line.\n    ```bash\n    # Run from your GstGeminiVision project root\n    docker run -it --rm \\\n        -e GST_GEMINI_API_KEY=\"YOUR_ACTUAL_API_KEY\" `# Optional, but good to have if you plan to test` \\\n        -e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix \\\n        --volume $(pwd)/gst-gemini-plugin:/builder \\\n        --volume $(pwd)/examples:/examples \\\n        gst-gemini-vision shell\n    ```\n    Inside the shell, your plugin source will be at `/builder` and examples at `/examples`. 
The main plugin won't be installed by default with this action alone, but the `entrypoint.sh` script itself is available at `/entrypoint.sh` if you want to manually trigger parts of its logic, or use this shell after running `test-examples` to inspect a fully set-up environment.\n\n\u003e [!NOTE]\n\u003e Using `-e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix` allows GUI elements to display on your host machine. This requires running the command `xhost +` on your host Linux machine to allow the Docker container to access your display. If you're using a different display server or setup, you might need to adjust these flags accordingly.\n\n**Important Notes for Docker Adventures:**\n- **Volume Mounts are Key:** The `--volume $(pwd)/...:/...` flags map directories from your computer into the Docker container.\n    - `/builder`: Points to your `gst-gemini-plugin` directory. This is where the main plugin source code lives.\n    - `/examples`: Points to your `examples` directory.\n- **API Key:** For `test-examples`, the `GST_GEMINI_API_KEY` environment variable (`-e`) is crucial. The plugin won't talk to Google without it!\n- **GUI Display:** If your examples use `autovideosink` or any other element that creates a window, you'll need the `-e DISPLAY=$DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix` lines (and sometimes `xhost +local:docker` on your host Linux machine) to see the output.\n\nWith these commands, you're all set to explore the wonders of GstGeminiVision without breaking a sweat over setup!\n\n---\n\n## ⚙️ Configuration (Plugin Properties)\n\nThe `geminivision` element has several properties you can configure. 
For a **full, detailed list** of all properties, their types, default values, ranges, and descriptions, please refer to the output of `gst-inspect-1.0`:\n\n➡️ **[View Full Plugin Details (gst-inspect-1.0 output)](docs/gst_inspect_geminivision.txt)** ⬅️\n\nYou can also generate this information yourself by running:\n```bash\ngst-inspect-1.0 geminivision\n```\n\nHere's a summary of the key properties:\n\n- `api-key` (string): Your Google Gemini API Key (Mandatory!).\n- `prompt` (string): The text prompt to guide Gemini's analysis. Default: \"Describe what you see in this image\".\n- `model-name` (string): The Gemini model to use. Default: \"gemini-2.0-flash-latest\".\n- `analysis-interval` (double): Time in seconds between analyses. Default: 5.0.\n- `output-metadata` (boolean): If TRUE, output description as GstMeta. If FALSE (default), emit a signal.\n- **Generation Config**:\n    - `stop-sequences` (GStrv/list of strings): Sequences where the API will stop generating.\n    - `temperature` (double): Controls randomness (0.0-2.0). Default: 1.0.\n    - `max-output-tokens` (int): Max tokens to generate. Default: 800.\n    - `top-p` (double): Nucleus sampling probability. Default: 0.8.\n    - `top-k` (int): Sample from the K most likely tokens. Default: 10.\n\nYou can set these using `gst-launch-1.0` or programmatically in your C/Python applications.\n\nExample with `gst-launch-1.0`:\n```bash\ngst-launch-1.0 videotestsrc pattern=ball ! videoconvert ! \\\n    geminivision api-key=\"$GST_GEMINI_API_KEY\" \\\n                   prompt=\"Is there a ball in this image? Answer yes or no.\" \\\n                   output-metadata=false \\\n                   temperature=0.2 \\\n! videoconvert ! autovideosink\n```\n\n## 🙌 Contributing\nContributions are welcome! 
Whether it's bug fixes, new features, or documentation improvements, feel free to open an issue or submit a pull request.\n\n\n## 📜 License\nThis project is licensed under the MIT License - see the LICENSE file for details.\n\n--- \n\nHappy Hacking and may your pipelines be ever insightful! 💡\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farmaggheddon%2Fgstgeminivision","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Farmaggheddon%2Fgstgeminivision","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Farmaggheddon%2Fgstgeminivision/lists"}