https://github.com/themightyoarfish/deepvo

:video_camera: TensorFlow implementation of RCNN visual odometry by Wang et al.

Host: GitHub
URL: https://github.com/themightyoarfish/deepvo
Owner: themightyoarfish
Created: 2018-03-07T10:44:04.000Z (over 8 years ago)
Default Branch: master
Last Pushed: 2019-03-22T09:32:46.000Z (over 7 years ago)
Last Synced: 2025-05-30T06:26:02.200Z (about 1 year ago)
Topics: deep-learning, python, robotics, tensorflow
Language: Python
Homepage:
Size: 5.02 MB
Stars: 52
Watchers: 5
Forks: 17
Open Issues: 0
Metadata Files:
- Readme: Readme.md


# Status
Not functional (i.e. it does not converge on our data), but it can be a useful starting point since the paper authors' code is not public.

---

# A TensorFlow implementation of _DeepVO: Towards End-to-End Visual Odometry with Deep Recurrent Convolutional Neural Networks_

This is our submission for the ANN with TensorFlow course, winter 2017. **Please note that this implementation does not seem entirely correct. Convergence was observed only on a dataset with random moves forwards and backwards, without rotation.**

## Data Acquisition
To make use of the full 720p resolution of the LifeCam 3000, you must do two things:
- Tell the device driver to use this resolution via `v4l2-ctl --set-fmt-video=width=1280,height=720,pixelformat=1` (the pixel format is probably not important, but you may need to adjust the ROS node accordingly).
- In the `usb_cam_node`, set the height and width parameters to match.

## Data Preprocessing
### Bagfile conversion
First, convert the rosbag sensor recordings with the conversion tool
(available [here](https://github.com/themightyoarfish/bag_to_pose_cam_data)):
```
bag_to_cam_pose_data -b .bag -d -x -P
```
The `-P` flag dumps one `.npy` file for each image and each pose. The `-x`
flag writes float image arrays instead of uint8.
This creates `images` and `poses` folders inside the chosen directory.
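
To sanity-check the converted data, the per-file dumps can be read back with NumPy. The following is only a minimal sketch: the helper names `load_npy_dir` and `load_pairs` are hypothetical, and it assumes that file names sort in recording order so that the n-th image belongs to the n-th pose. Neither assumption is confirmed by the conversion tool.

```python
import os

import numpy as np


def load_npy_dir(directory):
    """Load every .npy file in `directory`, sorted by file name."""
    names = sorted(f for f in os.listdir(directory) if f.endswith(".npy"))
    return [np.load(os.path.join(directory, name)) for name in names]


def load_pairs(root):
    """Pair images and poses by sorted file order (an assumption, see above)."""
    images = load_npy_dir(os.path.join(root, "images"))
    poses = load_npy_dir(os.path.join(root, "poses"))
    if len(images) != len(poses):
        raise ValueError(
            "found %d images but %d poses" % (len(images), len(poses))
        )
    return list(zip(images, poses))
```

A mismatch between the number of images and poses raises an error instead of silently truncating, since misaligned pairs would corrupt the training data.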
### Further preprocessing
Use the `preprocess_data.py` script to prepare the data for our network:
* `-d ` gives the path where the `images/` and `poses/` folders are
located. *All modifications are done in-place.*
* `-f` maps the images to (0, 1)
* `-m` subtracts the mean (over the entire set) from each image
* `-p` adds pi to all pose angles. The robot's EKF output is in the
range (-pi, pi), but we want (0, 2pi)
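
The three transformations above can be sketched in NumPy as follows. This is an illustration, not the code from `preprocess_data.py`: the function names are hypothetical, the scaling assumes pixel values in the 0..255 range, and the mean is assumed to be a per-pixel mean image rather than a single scalar.

```python
import numpy as np


def scale_to_unit(image):
    """Map pixel values to the unit range, assuming inputs span 0..255."""
    return np.asarray(image, dtype=np.float32) / 255.0


def subtract_dataset_mean(images):
    """Subtract the per-pixel mean image, computed over the whole set."""
    stack = np.stack(images).astype(np.float32)
    mean_image = stack.mean(axis=0)  # same shape as a single image
    return stack - mean_image


def shift_angles(angles):
    """Shift angles from the (-pi, pi) range to the (0, 2*pi) range."""
    return np.asarray(angles, dtype=np.float64) + np.pi
```

Note that adding pi is a plain offset, not a wrap-around: an EKF angle of 0 becomes pi, so any code that interprets the network output has to subtract pi again to recover the original heading.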

## Potential Problems
- We are not sure if the timestamps of pose and camera messages are correct and thus whether the training data is good enough
- We have no control over the exposure time of the camera. Auto-exposure differences while driving around might make the problem more difficult
