https://github.com/theshubhamp/sample-florence2-object-detection
Sample: Object Detection over a Video Stream using Microsoft's Florence-2 Model
florence-2 florence2 object-detection opencv
- Host: GitHub
- URL: https://github.com/theshubhamp/sample-florence2-object-detection
- Owner: theshubhamp
- License: mit
- Created: 2025-08-15T18:19:02.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-08-15T19:08:00.000Z (11 months ago)
- Last Synced: 2025-08-15T20:39:54.687Z (11 months ago)
- Topics: florence-2, florence2, object-detection, opencv
- Language: Python
- Homepage:
- Size: 41 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Sample: Object Detection over a Video Stream using Microsoft's Florence-2 Model
Florence-2 is Microsoft's vision foundation model that uses a prompt-based approach to handle a wide range of vision and vision-language tasks. It interprets simple text prompts to perform tasks such as captioning, object detection, and segmentation. It was trained on the FLD-5B dataset, which contains 5.4 billion annotations across 126 million images, to learn these tasks jointly. The model's sequence-to-sequence architecture lets it work in both zero-shot and fine-tuned settings.
This repository hosts a sample for Florence-2's Object Detection capabilities that:
- Uses OpenCV to read video frames
- Runs each frame through Florence-2 to get bounding boxes
- Overlays the bounding boxes on top of the original frame
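The three steps above can be sketched roughly as follows. This is a minimal illustration and not the code in this repository's `main.py`: the function names (`to_pixel_boxes`, `detect`, `run`) are hypothetical, the `microsoft/Florence-2-base` checkpoint is an assumed choice, and the model calls follow the usage published on the Florence-2 model card on Hugging Face. Heavy dependencies are imported inside the functions so that the box-conversion helper can be used without them.

```python
OD_TASK = "<OD>"  # Florence-2 task prompt for object detection


def to_pixel_boxes(result: dict, width: int, height: int) -> list:
    """Convert a parsed <OD> result into (label, (x1, y1, x2, y2)) tuples.

    Coordinates are rounded to integers and clipped to the frame so they
    can be passed straight to OpenCV drawing functions.
    """
    def clamp(value: float, upper: int) -> int:
        return max(0, min(int(round(value)), upper))

    od = result.get(OD_TASK, {})
    boxes = []
    for label, (x1, y1, x2, y2) in zip(od.get("labels", []), od.get("bboxes", [])):
        box = (clamp(x1, width - 1), clamp(y1, height - 1),
               clamp(x2, width - 1), clamp(y2, height - 1))
        boxes.append((label, box))
    return boxes


def detect(frame_bgr, model, processor) -> dict:
    """Run one BGR frame through Florence-2 (assumed model-card usage)."""
    import cv2
    from PIL import Image

    # OpenCV yields BGR arrays; the processor expects an RGB image.
    image = Image.fromarray(cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB))
    inputs = processor(text=OD_TASK, images=image, return_tensors="pt")
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,
        num_beams=3,
    )
    text = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    return processor.post_process_generation(
        text, task=OD_TASK, image_size=(image.width, image.height)
    )


def run(video_path: str, model_id: str = "microsoft/Florence-2-base") -> None:
    """Read frames, detect objects, and show the annotated frames."""
    import cv2
    from transformers import AutoModelForCausalLM, AutoProcessor

    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)

    capture = cv2.VideoCapture(video_path)
    try:
        while True:
            ok, frame = capture.read()
            if not ok:  # end of stream or read error
                break
            height, width = frame.shape[:2]
            result = detect(frame, model, processor)
            for label, (x1, y1, x2, y2) in to_pixel_boxes(result, width, height):
                cv2.rectangle(frame, (x1, y1), (x2, y2), (0, 255, 0), 2)
                cv2.putText(frame, label, (x1, max(y1 - 5, 12)),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)
            cv2.imshow("Florence-2 detections", frame)
            if cv2.waitKey(1) & 0xFF == ord("q"):  # press q to quit
                break
    finally:
        capture.release()
        cv2.destroyAllWindows()
```

Running the model on every frame is simple but slow. A real-time variant would likely skip frames or run on a GPU, neither of which is shown here.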
Here's a sample of how it looks live:

# Run on a Video File
```shell
uv run main.py ~/path/to/file.TS
```