https://github.com/javierkaiser9/rgb-d_dual_input_machine_learning_model

This demo presents a machine-learning-based steering module for sidewalk navigation. Using a dual-input EfficientNetV2 model, it processes RGB-D data from an Intel RealSense D415 to classify sidewalk scenarios and generate real-time steering commands. Optimized with OpenVINO.
efficientnet efficientnetv2 machine-learning numpy opencv openvino python3 realsense-camera tensorflow

Last synced: 4 months ago
Host: GitHub
URL: https://github.com/javierkaiser9/rgb-d_dual_input_machine_learning_model
Owner: JavierKaiser9
Created: 2025-03-03T19:10:58.000Z (11 months ago)
Default Branch: master
Last Pushed: 2025-03-04T09:00:41.000Z (11 months ago)
Last Synced: 2025-03-04T10:19:45.742Z (11 months ago)
Topics: efficientnet, efficientnetv2, machine-learning, numpy, opencv, openvino, python3, realsense-camera, tensorflow
Language: Python
Homepage:
Size: 22.1 MB
Stars: 1
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# 🚀 Sidewalk Navigation Demo
**RGB-D Fusion with EfficientNetV2**

## 🌟 Overview

This repository demonstrates the **architecture and implementation of a dual-input, single-output deep learning model** for real-time **sidewalk navigation**. Its goals are to:
- 💡 Showcase the working principles of a **2-input, 1-output EfficientNetV2 model** for RGB-D fusion.
- 💡 Present the **performance of OpenVINO-optimized models**.

Intel RealSense D415

---

## 🛠️ Key Technologies
✅ **Deep Learning Framework**: TensorFlow for model development and training
✅ **Model Architecture**: **Dual-Input, Single-Output EfficientNetV2**
- Input: **RGB + Depth** (from Intel RealSense D415)
- Output: **Steering Command** (turn left, right, or go straight)
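
The input/output contract above can be illustrated with a small NumPy sketch. It is not taken from the repository: the function names, the 224×224 resolution, the 4 m depth clipping range, the millimetre depth units, and the class order `("left", "straight", "right")` are all assumptions made for illustration.

```python
import numpy as np

# Assumed class order; the repository may index its classes differently.
STEERING_COMMANDS = ("left", "straight", "right")


def prepare_inputs(rgb, depth_mm, max_depth_mm=4000.0):
    """Turn one RGB frame and one depth frame into two batched float32 arrays."""
    # Scale 8-bit colour values to [0, 1] and add a batch axis: (1, H, W, 3).
    rgb_in = (rgb.astype(np.float32) / 255.0)[np.newaxis, ...]
    # Clip depth to an assumed working range and scale it to [0, 1],
    # then add batch and channel axes: (1, H, W, 1).
    depth = np.clip(depth_mm.astype(np.float32), 0.0, max_depth_mm) / max_depth_mm
    depth_in = depth[np.newaxis, ..., np.newaxis]
    return rgb_in, depth_in


def decode_steering(scores):
    """Map a (1, 3) score array to the command with the highest score."""
    return STEERING_COMMANDS[int(np.argmax(scores, axis=-1)[0])]


# Synthetic frames stand in for camera data (hypothetical 224x224 resolution).
rgb = np.zeros((224, 224, 3), dtype=np.uint8)
depth = np.full((224, 224), 1500, dtype=np.uint16)
rgb_in, depth_in = prepare_inputs(rgb, depth)
print(rgb_in.shape, depth_in.shape)  # (1, 224, 224, 3) (1, 224, 224, 1)
print(decode_steering(np.array([[0.1, 0.2, 0.7]])))  # right
```

Keeping the two inputs as separate arrays mirrors the dual-input design: each modality can be normalized on its own scale before the model fuses them.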

Architecture

✅ **Hardware Acceleration**: **Intel OpenVINO 2023.2** for real-time inference
✅ **Depth Sensing**: **Intel RealSense D415** for **RGB-D fusion**
✅ **Performance Optimization**: Model converted to **OpenVINO IR format** for embedded deployment
✅ **Real-Time Execution**: Averages **50 FPS** on an embedded system without a GPU
✅ **Development Environment**:
- **TensorFlow**: 2.10
- **OpenCV (`cv2`)**: 4.8.0
- **Python**: 3.10
✅ **Training Hardware for Demo Model**: **NVIDIA GeForce RTX 3050**
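
A mean frame rate like the one quoted above can be measured with a simple timing loop. The sketch below is one possible way to do it, not the repository's benchmark: `mean_fps`, its parameters, and the stand-in workload are hypothetical.

```python
import time


def mean_fps(infer, n_frames=100, warmup=10):
    """Return frames per second averaged over n_frames calls to infer()."""
    # Warm-up calls are excluded so one-time setup costs do not skew the mean.
    for _ in range(warmup):
        infer()
    start = time.perf_counter()
    for _ in range(n_frames):
        infer()
    elapsed = time.perf_counter() - start
    return n_frames / elapsed


# Stand-in workload; replace it with a call that runs one real inference.
fps = mean_fps(lambda: sum(range(1000)), n_frames=50, warmup=5)
print(f"{fps:.1f} FPS")
```

Dividing the frame count by the total elapsed time gives the mean rate directly, which is less sensitive to single slow frames than averaging per-frame rates.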

---

## 💻 Requirements
- **Intel RealSense D415** camera connected to your computer
- Python environment with **OpenVINO** installed

## 🚀 How to Use

This repository provides two main functionalities:
1️⃣ **Directly use the pre-trained OpenVINO model** for real-time inference.
2️⃣ **Train your own model** using the provided architecture.

### 🔹 1. Running the Pre-Trained OpenVINO Model
To use the **pre-trained OpenVINO model**, clone the repository and run `test_openvino_models.py`.
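
For orientation, loading an OpenVINO IR model and running a single inference might look roughly like the sketch below. It is not the content of the repository's test script: the helper names, the `"model.xml"` path, and the assumption that the model takes `[rgb, depth]` in that order are illustrative only.

```python
import numpy as np


def load_compiled_model(xml_path, device="CPU"):
    """Read an OpenVINO IR file and compile it for the given device."""
    import openvino as ov  # imported lazily so infer_scores works without it

    core = ov.Core()
    return core.compile_model(xml_path, device)


def infer_scores(compiled_model, rgb_in, depth_in):
    """Run one inference; assumes the model takes [rgb, depth] in that order."""
    results = compiled_model([rgb_in, depth_in])
    # Index 0 selects the first output, assumed here to be the only one.
    return np.asarray(results[0])


# Example usage (requires OpenVINO and a model file; the path is hypothetical):
# compiled = load_compiled_model("model.xml")
# scores = infer_scores(compiled, rgb_in, depth_in)
```

If the input order in the actual model differs, the inputs can instead be passed as a dictionary keyed by input name or index.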

### 🔹 2. Training Your Own OpenVINO Model
To train your own model, edit the paths in `train_two_input_one_output_model.py` so that they point to the directories for your training and test data. Then convert the trained TensorFlow model to an OpenVINO model with `create_openvino_model.py`.
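
As a rough illustration of the conversion step, the sketch below uses the `openvino.convert_model` and `openvino.save_model` functions from the OpenVINO Python API. It is an assumption about how such a conversion can be done, not a copy of the repository's conversion script; the helper names and paths are hypothetical.

```python
from pathlib import Path


def ir_files(xml_path):
    """Return the .xml/.bin pair that together make up one OpenVINO IR model."""
    xml = Path(xml_path)
    return xml, xml.with_suffix(".bin")


def convert_saved_model(saved_model_dir, xml_path):
    """Convert a TensorFlow SavedModel directory to OpenVINO IR on disk."""
    import openvino as ov  # imported lazily so ir_files works without it

    ov_model = ov.convert_model(saved_model_dir)
    ov.save_model(ov_model, str(xml_path))  # also writes the matching .bin file
    return ir_files(xml_path)


# Example usage (requires OpenVINO and TensorFlow; both paths are hypothetical):
# xml_file, bin_file = convert_saved_model("saved_model_dir", "openvino/model.xml")
```

The `.xml` file holds the network topology and the `.bin` file holds the weights, so both need to be deployed together.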

## 🎯 Performance Highlights
🔥 **High accuracy** in sidewalk scenario classification
⚡ Optimized for **low-latency execution** on edge devices
