https://github.com/dirvine/saorsa-robotics

Scaffold for training Hugging Face SO-101 robotic arms using Vision-Language-Action policies without demonstrations

# 🤖 Saorsa Robotics

> **Production-ready Rust framework for autonomous robotic control with local AI models**

[![Rust](https://img.shields.io/badge/rust-1.75%2B-orange.svg)](https://www.rust-lang.org)
[![Safety](https://img.shields.io/badge/safety-critical-red.svg)](./crates/safety-guard)
[![Tests](https://img.shields.io/badge/tests-100%25-brightgreen.svg)](./crates)
[![License](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE)

Saorsa Robotics provides a safety-first framework for robotic control systems, with Vision-Language-Action (VLA) models running entirely on local hardware. It is built in Rust for memory safety, performance, and reliability.

## 🌟 Key Features

### 🧠 Local AI Models
- **MolmoAct Integration**: Action Reasoning Model with 3D spatial understanding and Chain-of-Thought planning
- **Candle ML Framework**: Lightweight Rust-native inference without Python dependencies
- **OpenVLA Support**: Compatible with cloud and local VLA models
- **On-Device Learning**: Continual improvement through OFT adapters and intervention learning

### 🛡️ Safety-Critical Design
- **Formal Constraint System**: Expression-based DSL for defining safety boundaries
- **Real-Time Monitoring**: Watchdog systems with automatic intervention
- **Zero Panic Guarantee**: No `unwrap()`, `expect()`, or `panic!()` in production code
- **Comprehensive Testing**: 100% test coverage on safety-critical paths

### 🎯 Advanced Capabilities
- **Multi-Modal Control**: Voice commands, vision processing, and haptic feedback
- **Stereo Vision**: Depth perception with dual camera calibration
- **CAN Bus Integration**: Direct hardware control for motors and actuators
- **Intent Parsing**: Natural language to robot action conversion

### ⚡ Performance
- **Zero-Copy Data Paths**: Efficient memory usage
- **Async/Await**: Non-blocking I/O throughout
- **Action Chunking**: Smooth control despite network latency
- **Real-Time Capable**: Deterministic timing for critical operations
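
Action chunking from the list above can be sketched as follows. This is a minimal, std-only illustration and not the crate's implementation: `Action`, `ChunkBuffer`, and the refill threshold are hypothetical names. The idea is that the policy returns several future actions per inference call, and the control loop drains them at a fixed rate while requesting the next chunk before the buffer runs dry.

```rust
use std::collections::VecDeque;

/// Hypothetical action type: one position target per joint.
#[derive(Debug, Clone, PartialEq)]
pub struct Action {
    pub joints: Vec<f32>,
}

/// Buffers a chunk of predicted actions so the control loop can keep
/// stepping while the next inference call is still in flight.
pub struct ChunkBuffer {
    queue: VecDeque<Action>,
    refill_below: usize,
}

impl ChunkBuffer {
    pub fn new(refill_below: usize) -> Self {
        Self { queue: VecDeque::new(), refill_below }
    }

    /// Replace any stale, unexecuted actions with a freshly predicted chunk.
    pub fn load_chunk(&mut self, chunk: Vec<Action>) {
        self.queue.clear();
        self.queue.extend(chunk);
    }

    /// True when the caller should start requesting the next chunk.
    pub fn needs_refill(&self) -> bool {
        self.queue.len() < self.refill_below
    }

    /// Next action for this control tick, or `None` if the buffer ran dry
    /// (the caller should then hold position or stop).
    pub fn next_action(&mut self) -> Option<Action> {
        self.queue.pop_front()
    }
}
```

A control loop would call `next_action()` once per tick and start a new prediction whenever `needs_refill()` returns true, so a slow inference call does not stall motion.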

## 🚀 Quick Start

### Prerequisites

```bash
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

# Clone repository
git clone https://github.com/dirvine/saorsa-robotics
cd saorsa-robotics
```

### Build and Test

```bash
# Build all crates
cargo build --release

# Run all tests
cargo test --workspace

# Run with safety checks
cargo run --bin sr-cli -- --safety-enabled
```

### Run Examples

```bash
# VLA Policy Demo
cargo run --example vla_policy_demo

# Wake Word Detection
cargo run --example wake_word_demo

# Safety Constraints Demo
cargo run --bin safety-demo
```

## 📦 Architecture

```
saorsa-robotics/
├── apps/                    # Application binaries
│   ├── sr-cli/              # Main CLI interface
│   ├── brain-daemon/        # Central coordination daemon
│   ├── safety-demo/         # Safety system demonstration
│   └── kyutai-stt-app/      # Speech-to-text application
├── crates/                  # Core library crates
│   ├── vla-policy/          # Vision-Language-Action models
│   ├── safety-guard/        # Safety constraint engine
│   ├── voice-local/         # On-device voice processing
│   ├── vision-stereo/       # Stereo vision and depth
│   ├── intent-parser/       # NLU and command parsing
│   ├── can-transport/       # CAN bus communication
│   ├── device-registry/     # Hardware device management
│   └── continual-learning/  # Online learning framework
├── examples/                # Example applications
├── configs/                 # Device and system configs
└── docs/                    # Technical documentation
```

## 🔧 Core Components

### VLA Policy System (`vla-policy`)

Implements multiple Vision-Language-Action models for robot control:

```rust
use vla_policy::{create_policy, PolicyConfig, Observation};

// Create MolmoAct policy with 3D reasoning
let config = PolicyConfig {
    model_type: "molmoact".to_string(),
    model_path: "models/molmoact-7b".to_string(),
    // ... configuration
};

let policy = create_policy(config)?;
let action = policy.predict(&observation).await?;
```

**Features:**
- MolmoAct with Chain-of-Thought reasoning
- Waypoint generation for complex tasks
- Skills framework (Pick, Place, Reach)
- Mock policy for testing

### Safety Guard (`safety-guard`)

Expression-based constraint system ensuring safe operation:

```rust
use safety_guard::{SafetyGuard, Constraint};

let mut guard = SafetyGuard::new();

// Define workspace boundaries
guard.add_constraint(Constraint::expression(
    "workspace_x",
    "x >= -0.5 && x <= 0.5"
)?);

// Check if action is safe
if guard.check_action(&action)? {
    robot.execute(action)?;
}
```

**Features:**
- Mathematical expression constraints
- Real-time evaluation with evalexpr
- Watchdog monitoring
- Automatic intervention on violations
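
The workspace-boundary check above can be approximated without an expression engine. The sketch below hand-rolls a range constraint using only the standard library; `Axis`, `RangeConstraint`, `Violation`, and `check_all` are hypothetical names for illustration and are not the `safety-guard` API, which evaluates string expressions instead.

```rust
/// Hypothetical axis selector for a Cartesian target position.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Axis {
    X,
    Y,
    Z,
}

/// Hypothetical constraint: one axis must stay within [min, max] metres.
#[derive(Debug, Clone, PartialEq)]
pub struct RangeConstraint {
    pub name: &'static str,
    pub axis: Axis,
    pub min: f64,
    pub max: f64,
}

/// Which constraint failed, and the offending value.
#[derive(Debug, Clone, PartialEq)]
pub struct Violation {
    pub constraint: &'static str,
    pub value: f64,
}

impl RangeConstraint {
    pub fn check(&self, position: [f64; 3]) -> Result<(), Violation> {
        let value = match self.axis {
            Axis::X => position[0],
            Axis::Y => position[1],
            Axis::Z => position[2],
        };
        // `contains` is false for NaN, so a NaN target is rejected as well.
        if (self.min..=self.max).contains(&value) {
            Ok(())
        } else {
            Err(Violation { constraint: self.name, value })
        }
    }
}

/// Returns the first violated constraint, if any.
pub fn check_all(constraints: &[RangeConstraint], position: [f64; 3]) -> Result<(), Violation> {
    constraints.iter().try_for_each(|c| c.check(position))
}
```

Returning a `Violation` rather than a bare `bool` lets the caller log which boundary was hit before intervening.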

### Voice Control (`voice-local`)

On-device speech recognition and wake word detection:

```rust
use voice_local::{KyutaiProvider, WakeWordDetector};

let provider = KyutaiProvider::new(config)?;
let detector = WakeWordDetector::new("hey robot")?;

// Process audio stream
if detector.detect(&audio_frame)? {
    let command = provider.transcribe(&audio_buffer)?;
    execute_command(command)?;
}
```

**Features:**
- Kyutai/Mimi model integration
- Real-time transcription
- Wake word detection
- Plugin architecture for custom models

### Stereo Vision (`vision-stereo`)

Depth perception and 3D scene understanding:

```rust
use vision_stereo::{StereoCamera, DepthEstimator};

let camera = StereoCamera::new(config)?;
camera.calibrate()?;

let (left, right) = camera.capture()?;
let depth_map = DepthEstimator::compute(&left, &right)?;
let tags = detect_april_tags(&left)?;
```

**Features:**
- Dual camera calibration
- Real-time depth estimation
- AprilTag detection
- Point cloud generation
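
For readers new to stereo depth, the relation usually behind depth estimation from a calibrated, rectified camera pair is Z = f * B / d, where f is the focal length in pixels, B the baseline in metres, and d the disparity in pixels. The sketch below applies it per pixel. Function and parameter names are illustrative, and whether `DepthEstimator` uses exactly this model is an assumption.

```rust
/// Depth in metres from the rectified pinhole stereo relation Z = f * B / d.
/// Returns `None` for zero, negative, or non-finite disparity.
pub fn depth_from_disparity(focal_px: f64, baseline_m: f64, disparity_px: f64) -> Option<f64> {
    if disparity_px.is_finite() && disparity_px > 0.0 {
        Some(focal_px * baseline_m / disparity_px)
    } else {
        None
    }
}

/// Convert a disparity map (any layout, flattened) into a depth map.
/// Pixels without a valid disparity become `None`.
pub fn depth_map(focal_px: f64, baseline_m: f64, disparities: &[f64]) -> Vec<Option<f64>> {
    disparities
        .iter()
        .map(|&d| depth_from_disparity(focal_px, baseline_m, d))
        .collect()
}
```

Note that depth resolution degrades with distance: halving the disparity doubles the depth, so small disparity errors matter most for far objects.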

### CAN Transport (`can-transport`)

Hardware control via CAN bus:

```rust
use can_transport::{SlcanTransport, Message};

let transport = SlcanTransport::new("/dev/ttyUSB0")?;

// Send motor command
let msg = Message::new(0x123, &[0x01, 0x02, 0x03])?;
transport.send(&msg)?;
```

**Features:**
- SLCAN protocol support
- ODrive motor control
- T-Motor actuator support
- Mock transport for testing
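
As background on SLCAN, the widely used Lawicel-style ASCII convention encodes a standard (11-bit) data frame as `t`, three hex digits of identifier, one digit of data length, the data bytes as hex pairs, and a carriage return. The sketch below shows that encoding with the standard library only. `encode_slcan_standard` and `EncodeError` are hypothetical names, and it is an assumption that `SlcanTransport` produces frames in exactly this form.

```rust
#[derive(Debug, PartialEq)]
pub enum EncodeError {
    /// Standard CAN identifiers are 11 bits (0x000..=0x7FF).
    IdOutOfRange(u16),
    /// Classic CAN frames carry at most 8 data bytes.
    TooMuchData(usize),
}

/// Encode a standard-ID data frame as an SLCAN ASCII command.
pub fn encode_slcan_standard(id: u16, data: &[u8]) -> Result<String, EncodeError> {
    if id > 0x7FF {
        return Err(EncodeError::IdOutOfRange(id));
    }
    if data.len() > 8 {
        return Err(EncodeError::TooMuchData(data.len()));
    }
    // 't' + 3 hex digits of ID + 1 digit of length
    let mut frame = format!("t{:03X}{}", id, data.len());
    for byte in data {
        frame.push_str(&format!("{:02X}", byte));
    }
    // Commands are terminated with a carriage return.
    frame.push('\r');
    Ok(frame)
}
```

Under this convention, the message from the example above (ID `0x123`, data `01 02 03`) would go over the serial link as `t1233010203` followed by a carriage return.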

## 🧪 Testing

All crates maintain 100% test coverage on safety-critical paths. The commands below run the test suites, coverage, and benchmarks:

```bash
# Run all tests
cargo test --workspace

# Run with coverage (requires cargo-tarpaulin: cargo install cargo-tarpaulin)
cargo tarpaulin --out Html

# Run safety-critical tests
cargo test -p safety-guard

# Run benchmarks
cargo bench
```

Current test status:
- ✅ `safety-guard`: 13/13 passing
- ✅ `vla-policy`: 21/21 passing
- ✅ `voice-local`: All doctests passing
- ✅ `intent-parser`: 1/1 passing
- ✅ Zero compilation warnings

## 🔐 Safety & Security

### Production Standards
- **No Panics**: Zero `unwrap()`, `expect()`, or `panic!()` in production
- **Error Handling**: All errors properly propagated with `Result`
- **Memory Safety**: Guaranteed by Rust's ownership system
- **Concurrency Safety**: Safe parallelism with Send/Sync traits
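
One way to enforce the no-panic and error-propagation standards above is to deny Clippy's `unwrap_used`, `expect_used`, and `panic` lints and return `Result` everywhere. The sketch below shows the pattern on a small input-validation function. `CommandError` and `parse_joint_angle` are hypothetical and are not taken from the repository.

```rust
use std::fmt;
use std::num::ParseFloatError;

#[derive(Debug, PartialEq)]
pub enum CommandError {
    Parse(ParseFloatError),
    OutOfRange { value: f64, min: f64, max: f64 },
}

impl fmt::Display for CommandError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            CommandError::Parse(e) => write!(f, "invalid number: {e}"),
            CommandError::OutOfRange { value, min, max } => {
                write!(f, "{value} is outside [{min}, {max}]")
            }
        }
    }
}

impl std::error::Error for CommandError {}

impl From<ParseFloatError> for CommandError {
    fn from(e: ParseFloatError) -> Self {
        CommandError::Parse(e)
    }
}

/// Parse a joint angle and validate it against limits without panicking.
/// The lint attribute makes Clippy reject any `unwrap`/`expect`/`panic!` here.
#[deny(clippy::unwrap_used, clippy::expect_used, clippy::panic)]
pub fn parse_joint_angle(input: &str, min: f64, max: f64) -> Result<f64, CommandError> {
    // `?` converts the parse error into `CommandError` via `From`.
    let value: f64 = input.trim().parse()?;
    // "NaN" and "inf" parse successfully as f64, so reject them explicitly.
    if value.is_finite() && value >= min && value <= max {
        Ok(value)
    } else {
        Err(CommandError::OutOfRange { value, min, max })
    }
}
```

In a real crate the same lints would more likely be set once at the crate root or in the workspace lint configuration rather than per function.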

### Safety Features
- Formal constraint verification
- Watchdog timers on all operations
- Automatic failsafe modes
- Comprehensive audit logging
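
A watchdog timer of the kind listed above can be sketched as a heartbeat deadline: the monitored task "feeds" the watchdog periodically, and a supervisor triggers a failsafe if the heartbeat goes stale. The `Watchdog` type below is hypothetical and takes the current time as a parameter so the logic can be tested deterministically; it is not the repository's implementation.

```rust
use std::time::{Duration, Instant};

/// Hypothetical heartbeat watchdog.
pub struct Watchdog {
    timeout: Duration,
    last_heartbeat: Instant,
}

impl Watchdog {
    pub fn new(timeout: Duration, now: Instant) -> Self {
        Self { timeout, last_heartbeat: now }
    }

    /// Called by the monitored task each time it completes a cycle.
    pub fn feed(&mut self, now: Instant) {
        self.last_heartbeat = now;
    }

    /// True when no heartbeat arrived within `timeout`; the supervisor
    /// should then switch to a failsafe mode (for example, stop the motors).
    pub fn is_expired(&self, now: Instant) -> bool {
        // Saturating: an earlier `now` yields zero elapsed time, not a panic.
        now.saturating_duration_since(self.last_heartbeat) > self.timeout
    }
}
```

In production code `now` would come from `Instant::now()`, which is monotonic and therefore unaffected by wall-clock adjustments.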

## 📚 Documentation

- [Architecture Overview](./docs/README.md)
- [VLA Policy Design](./docs/SPEC.md)
- [Safety System](./crates/safety-guard/README.md)
- [CAN Protocol](./docs/CAN.md)
- [Vision System](./docs/VISION.md)
- [Voice Control](./docs/VOICE.md)
- [Research Notes](./docs/RESEARCH.md)

## 🗺️ Roadmap

### Near Term
- [ ] ONNX runtime integration for broader model support
- [ ] ROS2 bridge for ecosystem compatibility
- [ ] Web dashboard for monitoring and control
- [ ] Simulation environment with Bevy

### Long Term
- [ ] Distributed multi-robot coordination
- [ ] Federated learning across robot fleets
- [ ] Custom silicon accelerator support
- [ ] Formal verification of safety properties

## 🀝 Contributing

We welcome contributions! Please see [CONTRIBUTING.md](./CONTRIBUTING.md) for guidelines.

Key areas for contribution:
- Additional VLA model implementations
- Hardware device drivers
- Safety constraint patterns
- Documentation and examples

## 📄 License

MIT License - see [LICENSE](./LICENSE) for details.

## 🙏 Acknowledgments

- Built with [Candle](https://github.com/huggingface/candle) for ML inference
- Inspired by [LeRobot](https://github.com/huggingface/lerobot) for robot learning
- Safety patterns from aerospace and automotive industries

## 📬 Contact

- GitHub: [@dirvine](https://github.com/dirvine)
- Project: [Saorsa Labs](https://saorsa.org)

---

*For the original Python implementation for SO-101 arms, see [archive/python-so101](./archive/python-so101/README.md)*