https://github.com/javierkaiser9/rgb-d_dual_input_machine_learning_model
This demo presents a machine-learning-based steering module for sidewalk navigation. Using a dual-input EfficientNetV2 model, it processes RGB-D data from an Intel RealSense D415 to classify sidewalk scenarios and generate real-time steering commands. The model is optimized with OpenVINO.
efficientnet efficientnetv2 machine-learning numpy opencv openvino python3 realsense-camera tensorflow
- Host: GitHub
- URL: https://github.com/javierkaiser9/rgb-d_dual_input_machine_learning_model
- Owner: JavierKaiser9
- Created: 2025-03-03T19:10:58.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-03-04T09:00:41.000Z (11 months ago)
- Last Synced: 2025-03-04T10:19:45.742Z (11 months ago)
- Topics: efficientnet, efficientnetv2, machine-learning, numpy, opencv, openvino, python3, realsense-camera, tensorflow
- Language: Python
- Homepage:
- Size: 22.1 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# 🚀 Sidewalk Navigation Demo
**RGB-D Fusion with EfficientNetV2**
## 🌟 Overview
This repository demonstrates the **architecture and implementation of a dual-input, single-output deep learning model** for real-time **sidewalk navigation**. Its two main goals are to:

- 💡 Show how a **2-input, 1-output EfficientNetV2 model** performs RGB-D fusion.
- 💡 Present the **performance of OpenVINO-optimized models**.

---
## 🛠️ Key Technologies
✅ **Deep Learning Framework**: TensorFlow for model development and training
✅ **Model Architecture**: **Dual-Input, Single-Output EfficientNetV2**
- Input: **RGB + Depth** (from Intel RealSense D415)
- Output: **Steering Command** (turn left, right, or go straight)

✅ **Hardware Acceleration**: **Intel OpenVINO 2023.2** for real-time inference
✅ **Depth Sensing**: **Intel RealSense D415** for **RGB-D fusion**
✅ **Performance Optimization**: Model converted to **OpenVINO IR format** for embedded deployment
✅ **Real-Time Execution**: Achieves a mean of **50 FPS** on an embedded system without a GPU
✅ **Development Environment**:
- **TensorFlow**: 2.10
- **OpenCV (CV2)**: 4.8.0
- **Python**: 3.10
✅ **Training Hardware for Demo Model**: **NVIDIA GeForce RTX 3050**

---
## 💻 Requirements
- **Intel RealSense D415** camera connected to your computer
- Python environment with **OpenVINO** installed
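
To check that the camera is working before running the demo, one aligned RGB-D frame pair can be grabbed with `pyrealsense2`. This is a minimal sketch under assumptions: the 640x480 at 30 FPS stream settings and both function names are illustrative choices, not values taken from the repository.

```python
import numpy as np


def depth_to_meters(depth_raw, depth_scale):
    """Convert raw 16-bit depth units to meters using the sensor's depth scale."""
    return depth_raw.astype(np.float32) * depth_scale


def grab_aligned_rgbd(width=640, height=480, fps=30):
    """Capture one color frame and one depth frame aligned to it."""
    # Imported lazily: pyrealsense2 is only needed when a camera is attached.
    import pyrealsense2 as rs

    pipeline = rs.pipeline()
    config = rs.config()
    config.enable_stream(rs.stream.depth, width, height, rs.format.z16, fps)
    config.enable_stream(rs.stream.color, width, height, rs.format.bgr8, fps)
    profile = pipeline.start(config)
    try:
        depth_scale = profile.get_device().first_depth_sensor().get_depth_scale()
        align = rs.align(rs.stream.color)  # map depth pixels onto the color image
        frames = align.process(pipeline.wait_for_frames())
        depth_frame = frames.get_depth_frame()
        color_frame = frames.get_color_frame()
        if not depth_frame or not color_frame:
            raise RuntimeError("camera returned an incomplete frame set")
        # Copy the color data because the frame buffer is reused by the SDK.
        color = np.asanyarray(color_frame.get_data()).copy()  # (H, W, 3) uint8, BGR
        depth = np.asanyarray(depth_frame.get_data())          # (H, W) uint16
        return color, depth_to_meters(depth, depth_scale)
    finally:
        pipeline.stop()
```

Aligning depth to the color stream keeps the two inputs pixel-registered, which matters when both are fed to the same model.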
## 🚀 How to Use
This repository provides two main functionalities:
1️⃣ **Directly use the pre-trained OpenVINO model** for real-time inference.
2️⃣ **Train your own model** using the provided architecture.
### 🔹 1. Running the Pre-Trained OpenVINO Model
To use the **pre-trained OpenVINO model**, clone the repository and run `test_openvino_models.py`.
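
For orientation, inference with a two-input OpenVINO IR model might look like the sketch below. It does not reproduce `test_openvino_models.py`: the file name `model.xml`, the class order, the 224x224 input size, the input order (RGB first, depth second), and all function names are assumptions.

```python
import numpy as np

# Hypothetical values: the class order, input size, and file name are
# placeholders, not taken from the repository.
STEERING_COMMANDS = ("left", "straight", "right")
MODEL_XML = "model.xml"


def decode_steering(scores):
    """Return (command, confidence) for one vector of class scores."""
    scores = np.asarray(scores, dtype=np.float32).reshape(-1)
    idx = int(np.argmax(scores))
    return STEERING_COMMANDS[idx], float(scores[idx])


def to_batch(image, size=(224, 224)):
    """Resize an (H, W) or (H, W, C) image and add batch (and channel) axes."""
    import cv2  # imported lazily; only needed for real frames

    resized = cv2.resize(image, size, interpolation=cv2.INTER_AREA)
    if resized.ndim == 2:  # depth map: add the missing channel axis
        resized = resized[..., np.newaxis]
    return resized[np.newaxis].astype(np.float32)


def load_model(model_xml=MODEL_XML, device="CPU"):
    """Read an IR model and compile it once for the target device."""
    from openvino.runtime import Core

    core = Core()
    return core.compile_model(core.read_model(model_xml), device)


def predict_steering(compiled, color, depth):
    """Run one inference; assumes the model's inputs are ordered (RGB, depth)."""
    result = compiled([to_batch(color), to_batch(depth)])
    return decode_steering(result[compiled.output(0)])
```

Compiling the model once and reusing it for every frame avoids paying the load cost inside the real-time loop.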
### 🔹 2. Training Your Own OpenVINO Model
To train your own model, edit the paths in `train_two_input_one_output_model.py` so that they point to the directories where your training and test data are stored. Then convert the trained TensorFlow model to an OpenVINO model with `create_openvino_model.py`.
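
The conversion step can be sketched with OpenVINO's Python conversion API (`openvino.convert_model` and `openvino.save_model`, available from OpenVINO 2023.1). This is an illustration only and may differ from what `create_openvino_model.py` does. The helper names, the SavedModel input, and the FP16 compression default are assumptions.

```python
from pathlib import Path


def ir_file_pair(xml_path):
    """Return the (.xml, .bin) paths that together make up one OpenVINO IR model."""
    xml_path = Path(xml_path)
    return xml_path, xml_path.with_suffix(".bin")


def export_to_openvino(saved_model_dir, xml_path, compress_to_fp16=True):
    """Convert a TensorFlow SavedModel directory to OpenVINO IR files."""
    # Imported lazily so the path helper works without OpenVINO installed.
    import openvino as ov

    ov_model = ov.convert_model(str(saved_model_dir))
    # Writes the topology to .xml and the weights to a .bin file next to it.
    ov.save_model(ov_model, str(xml_path), compress_to_fp16=compress_to_fp16)
    return ir_file_pair(xml_path)
```

A trained Keras model would first be exported as a SavedModel directory, for example with `model.save(saved_model_dir)`, before being passed to `export_to_openvino`.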
## 🎯 Performance Highlights
🔥 **High accuracy** in sidewalk scenario classification
⚡ Optimized for **low-latency execution** on edge devices, with a mean of **50 FPS** on an embedded system without a GPU
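
A mean-FPS figure for a given device can be measured with a small timing loop such as the one below. This is a sketch of one possible measurement, not necessarily how the repository's numbers were obtained. The function name, frame count, and warm-up count are assumptions.

```python
import time


def mean_fps(infer, n_frames=200, warmup=20):
    """Call `infer()` repeatedly and return the mean frames per second."""
    if n_frames <= 0:
        raise ValueError("n_frames must be positive")
    for _ in range(warmup):  # let caches and lazy initialization settle
        infer()
    start = time.perf_counter()
    for _ in range(n_frames):
        infer()
    # Guard against a zero reading on very coarse timers.
    elapsed = max(time.perf_counter() - start, 1e-9)
    return n_frames / elapsed
```

Here `infer` is any zero-argument callable that processes one frame, for instance a lambda wrapping a single model inference on a fixed RGB-D pair.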