https://github.com/znitche/piva

Raspberry Pi powered Vision Assistant
https://github.com/znitche/piva

objects-detection ocr python3 raspberry-pi text-to-speech vision-assistance

Last synced: 7 months ago
JSON representation

Raspberry Pi powered Vision Assistant

Host: GitHub
URL: https://github.com/znitche/piva
Owner: zNitche
License: mit
Created: 2023-12-06T15:02:02.000Z (over 1 year ago)
Default Branch: main
Last Pushed: 2024-04-30T17:51:22.000Z (about 1 year ago)
Last Synced: 2024-04-30T18:59:57.970Z (about 1 year ago)
Topics: objects-detection, ocr, python3, raspberry-pi, text-to-speech, vision-assistance
Language: Python
Homepage:
Size: 2.19 MB
Stars: 0
Watchers: 1
Forks: 0
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE

Awesome Lists containing this project

README

        # piVA

Raspberry Pi powered Vision Assistant

---

This project was aimed to be a little bit different, my intention was to create 

minimum viable product (`MVP`) but considering lack of time and errors at hardware

planning phase it turned to be more like proof of concept (`POC`).

Despite this, the goal of building this device with the parts I had on hand has been achieved.

Anyway, here is the device that can help blind people perform everyday activities.

### Features

 

- Integration with speech synthesizer (in this case espeak).

- Audio jack for headphones connection.

- Objects detection.

- OCR.

- Standalone power source + usb-c charging.

- Buttons for easier controls.

- Turn on/off switch.

### POC Missing Features

- Battery level monitoring.

- 3D printed case, that makes device mobile.

### Usage

2 bottom buttons are for navigation between available modules.

Top one is for executing currently selected module.

Every step (current module change, module executing) is described by speech synthesizer.

### Parts

![parts](.media/parts.jpg)

| Part Name                   | Quantity |

|-----------------------------|:--------:|

| RaspberryPi Zero 2          |    1     |

| prototype circuit board     |    2     |

| sd card                     |    1     |

| RaspberryPi Camera v2       |    1     |

| 5mm ARK connector           |    7     |

| TP4056 LiPol charger        |    1     |

| step up converter (2A)      |    1     |

| on/off switch               |    2     |

| LiPol bracket               |    1     |

| RPi GPIO Header             |    1     |

| 20 pin female connector     |    2     |

| button                      |    3     |

| jack connector              |    1     |

| 270 Ω resistor              |    2     |

| 150 Ω resistor              |    2     |

| 10uF electrolytic capacitor |    2     |

| 10nF polyester capacitor    |    2     |

### Assemble

![assembled](.media/assembled.jpg)

Audio output circuit (I skipped protection diodes)

![audio_output](.media/audio_circuit.png)

(source: https://learn.adafruit.com/adding-basic-audio-ouput-to-raspberry-pi-zero)

### Software

1. flash RPi with raspbian, setup `ssh` and `connectivity`

2. add/modify following lines to `/boot/config.txt`:

For audio output

```

dtoverlay=audremap,pins_12_13

disable_audio_dither=1

```

For camera

```

# Automatically load overlays for detected cameras

#start_x=1

dtoverlay=imx219

```

3. increase swap

```

sudo dphys-swapfile swapoff

sudo nano /etc/dphys-swapfile

CONF_SWAPSIZE=1024

sudo dphys-swapfile setup

sudo dphys-swapfile swapon

```

4. install required system packages

```

sudo apt install python3-pip espeak ffmpeg tesseract-ocr

```

5. install required python packages

```

pip3 install torch==2.1.1 torchvision==0.16.1 --index-url https://download.pytorch.org/whl/cpu --no-cache-dir

```

```

pip3 install -r requirements.txt

```

6. following command should be executed at device startup

```

python3 main.py

```

### Limitations

- objects detection is slow and lacks precision.

- OCR works good only when text is directly in front of camera.

- there is no case to fit device in and therefore it is not mobile.

### Possible Improvements

- replacing built in objects detector with faster and more reliable solution like `YOLOv5`.

- integrating more advanced ocr solution (filtering, cropping).

- adding battery monitoring system (I was going to use `DFRobot I2C Digital Wattmeter`).

- creating 3d printed case for device mobility.

- using newer RaspberryPi camera (v3 series).

ecosyste.ms

Data

Tools

Indexes

Applications

Experiments

Awesome

https://github.com/znitche/piva

Awesome Lists containing this project

README