https://github.com/shaku-med/real-time-object-detect
Computer vision pipeline
- Host: GitHub
- URL: https://github.com/shaku-med/real-time-object-detect
- Owner: Shaku-Med
- Created: 2025-06-19T00:57:09.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-06-19T02:11:24.000Z (about 1 year ago)
- Last Synced: 2025-06-19T03:25:45.366Z (about 1 year ago)
- Topics: license, mit-license, opencv, production-ready, python3, status, yolov4, yolov8
- Language: Python
- Homepage:
- Size: 14.6 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# Advanced YOLO Object Detection System
> **Because life's too short for false positives – a YOLO detector that actually knows what it's looking at**
## What's This All About?
Ever get tired of object detection systems that think your coffee mug is a person? Or that insist your houseplant is definitely a car? Yeah, me too. This system is my attempt to build something that's not just fast, but actually *smart* about what it detects.
I've thrown in some pretty cool filtering techniques that help eliminate those "wait, that's obviously not a dog" moments. Think of it as YOLO with some common sense built in.
## The Three Flavors
I've got three different modes because, let's be honest, sometimes you just need something quick and dirty, and other times you want the full bells-and-whistles experience:
| Mode | What It Does | Speed | How Good It Is |
|------|-------------|-------|----------------|
| **Basic** | Your standard YOLO v4 – gets the job done | Lightning fast ⚡ | Pretty decent |
| **Advanced** | Adds some smart filtering magic | Still pretty quick | Much better |
| **Enhanced** | The whole shebang – all the fancy stuff | Takes its time | Chef's kiss 👌 |
## The Cool Stuff Under the Hood
Here's what makes this thing tick (warning: some of this might sound like I'm showing off, but I promise it's all useful):
- **Temporal Filtering**: Keeps track of what it's seen before – no more flickering detections
- **Motion Detection**: Only bothers looking at stuff that's actually moving (revolutionary, I know)
- **Ensemble Detection**: Runs multiple confidence checks because two heads are better than one
- **Multi-Scale Detection**: Looks at things at different scales – like putting on your reading glasses
- **Adaptive Thresholds**: Gets smarter over time (unlike me with my morning coffee)
- **Frame Quality Analysis**: Won't waste time on blurry garbage frames
- **Advanced NMS**: Fancy way of saying "don't detect the same thing twice"
- **Confidence Calibration**: Statistical mumbo-jumbo that actually works
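To make the temporal filtering idea a bit more concrete, here's a rough sketch of one way it could work: only report a label if it showed up in a big enough share of recent frames. This is not the repo's actual `TemporalFilter`; the name `TemporalFilterSketch`, its `update` method, and the "fraction of recent frames" rule are all assumptions made for illustration. The defaults just borrow the `TEMPORAL_FILTER_HISTORY` and `TEMPORAL_STABILITY_THRESHOLD` values from `config.py`.

```python
from collections import deque
from typing import Deque, Set


class TemporalFilterSketch:
    """Hypothetical filter: keep a label only if it appears in enough recent frames."""

    def __init__(self, history: int = 15, stability: float = 0.4) -> None:
        # Each entry holds the set of class names seen in one frame.
        self.frames: Deque[Set[str]] = deque(maxlen=history)
        self.stability = stability

    def update(self, labels: Set[str]) -> Set[str]:
        """Record this frame's labels and return the ones considered stable."""
        self.frames.append(set(labels))
        stable = set()
        for label in labels:
            hits = sum(1 for seen in self.frames if label in seen)
            if hits / len(self.frames) >= self.stability:
                stable.add(label)
        return stable


# A "car" that only flickers in now and then never reaches the 60% bar here.
tf = TemporalFilterSketch(history=5, stability=0.6)
stream = [{"person"}, {"person", "car"}, {"person"}, {"person"}, {"person", "car"}]
results = [sorted(tf.update(labels)) for labels in stream]
print(results)
```

One quirk of this simple version: during warm-up the history is short, so anything in the very first frame counts as stable right away. A stricter variant could wait until the history is full before reporting anything.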
## Getting Started (The Easy Way)
### What You'll Need
- Python 3.8+ (if you're still on Python 2, we need to talk)
- A webcam (or any video source that doesn't hate you)
- At least 4GB of RAM (8GB if you want the fancy enhanced mode)
- A computer that was made after 2010
### Installation
**Step 1: Grab the code**
```bash
git clone https://github.com/Shaku-Med/Real-Time-Object-Detect.git
cd Real-Time-Object-Detect
```
**Step 2: Install the dependencies**
```bash
pip install -r requirements.txt
```
*Grab a coffee while this runs. Trust me.*
**Step 3: Just run it**
```bash
python main.py
```
*It'll download what it needs automatically. I'm not a monster.*
## How to Use This Thing
### The "I Just Want It to Work" Approach
```bash
python main.py
```
This runs the enhanced mode with all the good stuff turned on. It's like the "I'm feeling lucky" button but for object detection.
### The "Let Me Choose My Own Adventure" Approach
```bash
python launcher.py
```
This gives you a nice menu where you can pick your poison.
### The "I Want to Code It Myself" Approach
**Basic Mode (for when you're in a hurry):**
```python
from detection_pipeline import LiveDetectionPipeline

pipeline = LiveDetectionPipeline(
    weights_path="yolov4.weights",
    config_path="yolov4.cfg",
    classes_path="coco.names",
)
pipeline.run(camera_index=0)
```
**Advanced Mode (when you want something better):**
```python
from advanced_detection import AdvancedDetectionPipeline

pipeline = AdvancedDetectionPipeline(
    weights_path="yolov4.weights",
    config_path="yolov4.cfg",
    classes_path="coco.names",
)
pipeline.run_advanced(camera_index=0)
```
**Enhanced Mode (when you want the full experience):**
```python
from enhanced_detection import EnhancedDetectionPipeline

pipeline = EnhancedDetectionPipeline(
    weights_path="yolov4.weights",
    config_path="yolov4.cfg",
    classes_path="coco.names",
)
pipeline.run_enhanced(camera_index=0)
```
### Keyboard Shortcuts (Because We're Not Animals)
- **ESC** or **Q**: Peace out
- **R**: Reset and start over (a clean slate for your detection pipeline)
- **S**: Save the current frame (for posterity)
- **Space**: Pause/Resume (for dramatic effect)
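If you're wondering how shortcuts like these usually get wired up in an OpenCV loop, here's a minimal sketch. The action names and the `action_for_key` helper are hypothetical, not taken from the repo. In a real loop you'd feed it the value returned by `cv2.waitKey(1)`, which is `-1` when no key was pressed.

```python
from typing import Optional

ESC_KEY = 27  # ASCII code for Escape

# Hypothetical action names; the repository may call these something else.
KEY_ACTIONS = {
    ESC_KEY: "quit",
    ord("q"): "quit",
    ord("r"): "reset",
    ord("s"): "save_frame",
    ord(" "): "toggle_pause",
}


def action_for_key(key_code: int) -> Optional[str]:
    """Translate a raw key code into an action name, or None if unbound."""
    if key_code < 0:           # -1 means "no key pressed"
        return None
    key = key_code & 0xFF      # keep only the low byte of the key code
    if ord("A") <= key <= ord("Z"):
        key += 32              # treat upper-case letters like lower-case
    return KEY_ACTIONS.get(key)


print(action_for_key(ord("Q")), action_for_key(ord(" ")), action_for_key(-1))
```

Keeping the mapping in a dictionary means adding a new shortcut is a one-line change, and the lookup itself stays testable without a camera or a window.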
## Tweaking the Settings
Want to mess with the settings? Check out `config.py` – it's where all the magic numbers live:
```python
class Config:
    # How confident should we be before yelling "I found something!"
    CONFIDENCE_THRESHOLD = 0.6
    NMS_THRESHOLD = 0.3

    # Camera stuff
    CAMERA_INDEX = 0
    FRAME_WIDTH = 640
    FRAME_HEIGHT = 480
    FRAME_FPS = 30

    # The fancy filtering stuff
    TEMPORAL_FILTER_HISTORY = 15
    TEMPORAL_STABILITY_THRESHOLD = 0.4
    MOTION_THRESHOLD = 0.005
    ENSEMBLE_THRESHOLDS = [0.3, 0.5, 0.7]
```
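To see what `CONFIDENCE_THRESHOLD` and `NMS_THRESHOLD` actually control, here's a small pure-Python sketch of score filtering followed by greedy non-maximum suppression. It's the textbook version, not necessarily what the repo does internally (OpenCV ships `cv2.dnn.NMSBoxes` for this job). The helper names `iou` and `filter_boxes` are made up for this example.

```python
from typing import List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two (x, y, w, h) boxes."""
    inter_w = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def filter_boxes(boxes: Sequence[Box], scores: Sequence[float],
                 conf_threshold: float = 0.6,
                 nms_threshold: float = 0.3) -> List[int]:
    """Return indices of boxes that survive score filtering and greedy NMS."""
    # Drop weak detections, then visit the rest from strongest to weakest.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= conf_threshold),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a stronger one.
        if all(iou(boxes[i], boxes[k]) <= nms_threshold for k in kept):
            kept.append(i)
    return kept


boxes = [(10, 10, 100, 100), (20, 20, 100, 100), (300, 300, 50, 50), (0, 0, 40, 40)]
scores = [0.9, 0.8, 0.7, 0.4]
print(filter_boxes(boxes, scores))  # → [0, 2]
```

Box 1 overlaps box 0 heavily (IoU of about 0.68), so it gets suppressed, and box 3 falls below the confidence cutoff. A lower `NMS_THRESHOLD` is stricter about overlap; a higher `CONFIDENCE_THRESHOLD` is stricter about weak detections.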
## How Everything Fits Together
The basic structure is pretty straightforward:
```
Advanced YOLO Detection System
├── main.py # The main event
├── launcher.py # For when you want options
├── config.py # All the knobs and dials
├── utils.py # Random useful stuff
├── detection_pipeline.py # Basic YOLO magic
├── advanced_detection.py # Smarter YOLO magic
├── enhanced_detection.py # The full monty
├── requirements.txt # What you need to install
└── README.md # You are here
```
### The Detection Pipeline (Or: How the Sausage Gets Made)
```
Your Camera Feed
↓
Multi-Scale Detection (looking at things at different scales)
↓
Temporal Filtering (remembering what we saw before)
↓
Motion Analysis (ignoring boring static stuff)
↓
Ensemble Detection (getting multiple opinions)
↓
Adaptive Thresholding (getting smarter over time)
↓
Quality Analysis (not wasting time on garbage frames)
↓
Advanced NMS (avoiding double-counting)
↓
Confidence Calibration (final sanity check)
↓
Your Beautifully Detected Objects
```
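One simple way to think about a staged flow like the one above is as a list of functions, each taking candidate detections and handing a filtered list to the next. The sketch below is only that idea in miniature: the two stage functions, the dictionary shape, and `run_stages` are hypothetical stand-ins, not the repo's real stages.

```python
from typing import Any, Callable, Dict, List

Candidate = Dict[str, Any]  # e.g. {"label": "dog", "confidence": 0.9}
Stage = Callable[[List[Candidate]], List[Candidate]]


def drop_low_confidence(cands: List[Candidate]) -> List[Candidate]:
    """Stand-in stage: discard candidates below a fixed confidence."""
    return [c for c in cands if c["confidence"] >= 0.6]


def keep_best_per_label(cands: List[Candidate]) -> List[Candidate]:
    """Stand-in stage: keep only the strongest candidate for each label."""
    best: Dict[str, Candidate] = {}
    for c in cands:
        current = best.get(c["label"])
        if current is None or c["confidence"] > current["confidence"]:
            best[c["label"]] = c
    return list(best.values())


def run_stages(cands: List[Candidate], stages: List[Stage]) -> List[Candidate]:
    """Feed the candidates through each stage in order."""
    for stage in stages:
        cands = stage(cands)
    return cands


raw = [
    {"label": "dog", "confidence": 0.9},
    {"label": "dog", "confidence": 0.7},
    {"label": "cat", "confidence": 0.3},
]
print(run_stages(raw, [drop_low_confidence, keep_best_per_label]))
```

The nice thing about this shape is that reordering, skipping, or timing individual stages only means editing the list passed to `run_stages`.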
## Performance (The Numbers Game)
### How Much Better Is It?
| What We're Measuring | Basic | Advanced | Enhanced |
|---------------------|-------|----------|----------|
| **False Positives** | ~15% (meh) | ~8% (better) | ~3% (chef's kiss) |
| **How Stable** | Okay | Pretty good | Rock solid |
| **Speed** | 30 FPS | 25 FPS | 20 FPS |
| **Memory Usage** | Minimal | Reasonable | Hungry |
### What Your Computer Needs
| Component | Bare Minimum | What I'd Recommend |
|-----------|--------------|-------------------|
| **CPU** | Intel i5 / AMD Ryzen 5 | Intel i7 / AMD Ryzen 7 |
| **RAM** | 4GB (if you like living dangerously) | 8GB+ (for a good time) |
| **GPU** | Whatever you've got | NVIDIA GTX 1060+ |
| **Storage** | 2GB | 5GB+ (for all the models) |
## The Technical Stuff (For the Curious)
### Main Classes
**EnhancedDetectionPipeline** - The star of the show:
```python
from typing import List
import numpy as np

# Interface summary only; method bodies are omitted.
class EnhancedDetectionPipeline:
    def __init__(self, weights_path: str, config_path: str,
                 classes_path: str, confidence_threshold: float = 0.6,
                 nms_threshold: float = 0.3): ...

    def run_enhanced(self, camera_index: int = 0,
                     window_name: str = "Enhanced Detection"): ...

    def detect_objects_enhanced(self, frame: np.ndarray) -> List["Detection"]: ...
```
**Detection** - What you get back:
```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    bbox: List[int]     # Where the thing is [x, y, width, height]
    confidence: float   # How sure we are (0-1)
    class_id: int       # What number the thing is
    class_name: str     # What we call the thing
    timestamp: float    # When we found it
```
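Since `bbox` is stored as `[x, y, width, height]`, you'll usually need corner coordinates before drawing a rectangle. Here's a small sketch of that conversion plus a label string. The `Detection` class is repeated so the snippet runs on its own; the `corners` and `label_text` helpers are hypothetical additions, not part of the repo.

```python
import time
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Detection:
    bbox: List[int]     # [x, y, width, height]
    confidence: float   # 0-1
    class_id: int
    class_name: str
    timestamp: float


def corners(det: Detection) -> Tuple[int, int, int, int]:
    """Convert [x, y, w, h] into (left, top, right, bottom)."""
    x, y, w, h = det.bbox
    return x, y, x + w, y + h


def label_text(det: Detection) -> str:
    """Build a short caption such as 'person 87%'."""
    return f"{det.class_name} {det.confidence:.0%}"


det = Detection(bbox=[50, 40, 120, 80], confidence=0.87,
                class_id=0, class_name="person", timestamp=time.time())
print(corners(det))     # → (50, 40, 170, 120)
print(label_text(det))  # → person 87%
```

With OpenCV, the first two and last two values of `corners(det)` are the two points you'd pass to `cv2.rectangle`.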
### The Supporting Cast
The system has a bunch of helper classes that do the heavy lifting:
- **MultiScaleDetector**: Looks at things from different zoom levels
- **AdaptiveThresholdManager**: Adjusts standards based on what it's seeing
- **FrameQualityAnalyzer**: Decides if a frame is worth processing
- **TemporalFilter**: Remembers what happened before
- **MotionDetector**: Spots the moving stuff
- **EnsembleDetector**: Gets multiple opinions before deciding
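As an example of what one of these helpers might do, here's a bare-bones take on motion detection using frame differencing. It's a sketch under assumptions: it treats `MOTION_THRESHOLD` as the fraction of pixels that must change between frames, which the README doesn't actually spell out, and it uses plain nested lists so it runs without NumPy or OpenCV. The function names and the `pixel_delta` parameter are made up for illustration.

```python
from typing import List

Frame = List[List[int]]  # grayscale image as rows of 0-255 intensities


def motion_ratio(prev: Frame, curr: Frame, pixel_delta: int = 25) -> float:
    """Fraction of pixels whose intensity changed by more than pixel_delta."""
    changed = 0
    total = 0
    for prev_row, curr_row in zip(prev, curr):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_delta:
                changed += 1
    return changed / total if total else 0.0


def has_motion(prev: Frame, curr: Frame, motion_threshold: float = 0.005) -> bool:
    """True when enough pixels changed to call it motion."""
    return motion_ratio(prev, curr) > motion_threshold


still = [[0] * 10 for _ in range(10)]
moved = [row[:] for row in still]
moved[4][4] = 200  # one pixel out of 100 changes a lot

print(has_motion(still, still), has_motion(still, moved))  # → False True
```

On real frames the same idea is normally done on arrays (for instance with `cv2.absdiff` on grayscale images), which is far faster than Python loops.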
## Want to Contribute?
I'd love some help making this even better! Here's how to get started:
```bash
git clone https://github.com/Shaku-Med/Real-Time-Object-Detect.git
cd Real-Time-Object-Detect
pip install -r requirements.txt
python -m pytest tests/
```
Just keep it clean, use type hints (your future self will thank you), and write some tests. I'm not picky about much else.
## License
MIT License – do whatever you want with it, just don't blame me if your robot uprising uses this code.
## Shoutouts
- The YOLO v4 folks for making object detection not terrible
- The OpenCV team for doing the heavy lifting on computer vision
- The COCO dataset people for giving us something to detect
- My coffee maker for keeping me functional during development
---
*P.S. - If you find any bugs, please let me know. I promise I'll fix them eventually (probably after I finish my current Netflix series).*