# Awesome-Autonomous-Driving

Source: https://github.com/autonomousdrivingkr/Awesome-Autonomous-Driving
## Papers

### Localization and Mapping

### Lane Detection

### Planning

### Overall

### Classification

### 2D Object Detection
- [Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks (NIPS15)](https://arxiv.org/abs/1506.01497)
- [R-CNN: Rich feature hierarchies for accurate object detection and semantic segmentation (CVPR14)](http://arxiv.org/pdf/1311.2524)
### 3D Object Detection

### Object Tracking

### Semantic Segmentation
- Leaderboards
- [Fully Convolutional Networks for Semantic Segmentation (CVPR15)](http://arxiv.org/pdf/1411.4038)
- [Learning Hierarchical Features for Scene Labeling (ICML12, PAMI13)](http://yann.lecun.com/exdb/publis/pdf/farabet-pami-13.pdf)
### Depth Estimation

### Visual Odometry

### Decision Making

### RL in Autonomous Driving

## Dataset

- Daimler Urban Segmentation Dataset - Video sequences recorded in urban traffic: 5000 rectified stereo image pairs at a resolution of 1024x440. 500 frames (every 10th frame of the sequence) come with pixel-level semantic class annotations for 5 classes: ground, building, vehicle, pedestrian, and sky (a small label-handling sketch follows this list). Dense disparity maps are provided as a reference; however, these are not manually annotated but computed using semi-global matching (SGM).
- LaRA - Traffic Lights Recognition (TLR) public benchmarks
- CALTECH Pedestrian Detection Benchmark - The Caltech Pedestrian Dataset consists of approximately 10 hours of 640x480, 30 Hz video taken from a vehicle driving through regular traffic in an urban environment. About 250,000 frames (in 137 segments of roughly one minute each) with a total of 350,000 bounding boxes and 2,300 unique pedestrians were annotated. The annotations include temporal correspondence between bounding boxes and detailed occlusion labels.
- Caltech Lanes Dataset - Caltech Lanes dataset includes four clips taken around streets in Pasadena, CA at different times of day.
- Udacity - ROSBAG training data (~80 GB).
- KAIST Complex Urban Dataset
- University of Michigan Ford Campus Vision and Lidar Data Set - Dataset collected by an autonomous ground vehicle testbed based on a modified Ford F-250 pickup truck. The vehicle is outfitted with a professional (Applanix POS LV) and a consumer (Xsens MTI-G) Inertial Measurement Unit (IMU), a Velodyne 3D lidar scanner, two push-broom forward-looking Riegl lidars, and a Point Grey Ladybug3 omnidirectional camera system. The dataset provides time-registered data from these sensors, collected while driving the vehicle around the Ford Research campus and downtown Dearborn, Michigan during November-December 2009. The vehicle path trajectory contains several large- and small-scale loop closures, which should be useful for testing state-of-the-art computer vision and SLAM (Simultaneous Localization and Mapping) algorithms. The dataset is large (~100 GB), so make sure you have sufficient bandwidth before downloading it.
- Comma.ai - 7.25 hours of largely highway driving: 10 video clips of variable length recorded at 20 Hz with a camera mounted on the windshield of an Acura ILX 2016. In parallel to the videos, measurements such as the car's speed, acceleration, steering angle, GPS coordinates, and gyroscope angles were also recorded. These measurements are transformed onto a uniform 100 Hz time base (see the resampling sketch after this list).
- Automated Synchronization of Driving Data: Video, Audio, Telemetry, and Accelerometer - 1,000+ hours of multi-sensor driving datasets collected at MIT AgeLab (Lex Fridman).
- LISA: Laboratory for Intelligent & Safe Automobiles, UC San Diego Datasets - Traffic sign and vehicle detection, traffic lights, trajectory patterns.
- BDD100K: A Diverse Driving Video Database with Scalable Annotation Tooling - Datasets drive vision progress, and autonomous driving is a critical vision application, yet existing driving datasets are impoverished in terms of visual content. Driving imagery is becoming plentiful, but annotation is slow and expensive, as annotation tools have not kept pace with the flood of data. Our first contribution is the design and implementation of a scalable annotation system that can provide a comprehensive set of image labels for large-scale driving datasets. Our second contribution is a new driving dataset, facilitated by our tooling, which is an order of magnitude larger than previous efforts and comprises over 100K videos with diverse kinds of annotations, including image-level tagging, object bounding boxes, drivable areas, lane markings, and full-frame instance segmentation. The dataset possesses geographic, environmental, and weather diversity, which is useful for training models so that they are less likely to be surprised by new conditions. The dataset can be requested at this http URL.
- Belgium Traffic Sign Dataset - Dataset for Belgium Traffic Sign Classification, Detection.
- Traffic Light in South Korea - In contrast to Europe and the USA, most TLs for vehicles in South Korea at intersections have a horizontal layout and are installed as side-pillar horizontal types. A TL can have three or four signals, and one signal consists of a 355 mm x 355 mm black box with colored bulbs. The diameter of a bulb is 300 mm. There are two types of bulbs: a circle and an arrow. The circle bulb indicates green, red, or yellow, whereas the arrow bulb represents a left turn. There are two combinations for the three-bulb TL, and there is one type for the four-bulb TL. The TL status can be green, yellow, red, green + left turn, or red + left turn (see the state-enumeration sketch after this list).
- Oxford Radar RobotCar Dataset - Provides Millimetre-Wave radar data, dual velodyne lidars, and optimised ground truth odometry for 280 km of driving around Oxford, UK (in addition to all sensors in the original [Oxford RobotCar Dataset](http://robotcar-dataset.robots.ox.ac.uk/))
- Virtual KITTI - Virtual KITTI is a photo-realistic synthetic video dataset designed to learn and evaluate computer vision models for several video understanding tasks: object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation. Virtual KITTI contains 50 high-resolution monocular videos (21,260 frames) generated from five different virtual worlds in urban settings under different imaging and weather conditions. These worlds were created using the Unity game engine and a novel real-to-virtual cloning method. These photo-realistic synthetic videos are automatically, exactly, and fully annotated for 2D and 3D multi-object tracking and at the pixel level with category, instance, flow, and depth labels.
- Mapillary Vistas Dataset - A diverse street-level imagery dataset with pixel-accurate and instance-specific human annotations for understanding street scenes around the world: 25,000 high-resolution images, 152 object categories, 100 instance-specifically annotated categories, global coverage across 6 continents, and a variety of weather, season, time-of-day, camera, and viewpoint conditions.
- University of Michigan North Campus Long-Term Vision and LIDAR Dataset - Long-term autonomy dataset for robotics research collected on the University of Michigan's North Campus. The dataset consists of omnidirectional imagery, 3D lidar, planar lidar, GPS, and proprioceptive sensor data.
- Cityscapes Dataset - Large-scale dataset that contains a diverse set of stereo video sequences recorded in street scenes from 50 different cities, with high-quality pixel-level annotations of 5,000 frames in addition to a larger set of 20,000 weakly annotated frames. Focused on pixel-level classification and instance-wise segmentation.
- GTSRB, GTSDB - German Traffic Sign Benchmarks: GTSRB for traffic sign classification and GTSDB for traffic sign detection.
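
As a concrete illustration of how annotations like the Daimler Urban Segmentation labels are typically consumed, the sketch below maps the five semantic classes listed above to integer IDs and picks out the annotated frames (every 10th frame). The integer IDs, directory layout, and file naming are assumptions for illustration, not the dataset's documented structure.

```python
# Minimal sketch for working with a Daimler-style segmentation split.
# Class names come from the dataset description above; the IDs, directory
# layout, and file naming below are illustrative assumptions only.
from pathlib import Path

CLASS_IDS = {
    "ground": 0,
    "building": 1,
    "vehicle": 2,
    "pedestrian": 3,
    "sky": 4,
}

def annotated_frame_indices(num_frames: int, step: int = 10):
    """Every `step`-th frame carries pixel-level labels (500 of 5000 frames)."""
    return list(range(0, num_frames, step))

def label_path_for_frame(root: Path, index: int) -> Path:
    # Hypothetical naming scheme: adjust to the actual download layout.
    return root / "labels" / f"frame_{index:06d}.png"

if __name__ == "__main__":
    frames = annotated_frame_indices(5000)
    print(len(frames), "annotated frames, e.g.", label_path_for_frame(Path("daimler"), frames[0]))
```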
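
The Comma.ai entry notes that the raw measurements are transformed onto a uniform 100 Hz time base. A minimal way to do this for an irregularly timestamped signal is linear interpolation onto a fixed-rate grid, sketched below with NumPy; the 100 Hz target rate follows the description above, while the function name and the example data are assumptions.

```python
# Sketch: resample an irregularly timestamped telemetry signal (e.g. speed)
# onto a uniform 100 Hz time base via linear interpolation.
import numpy as np

def resample_uniform(timestamps_s, values, rate_hz=100.0):
    """Return (uniform_timestamps, interpolated_values) on a fixed-rate grid."""
    timestamps_s = np.asarray(timestamps_s, dtype=float)
    values = np.asarray(values, dtype=float)
    t_uniform = np.arange(timestamps_s[0], timestamps_s[-1], 1.0 / rate_hz)
    return t_uniform, np.interp(t_uniform, timestamps_s, values)

if __name__ == "__main__":
    # Fake, irregularly sampled speed readings (m/s), for demonstration only.
    t = np.array([0.00, 0.04, 0.11, 0.19, 0.25, 0.33])
    speed = np.array([10.0, 10.2, 10.5, 10.4, 10.8, 11.0])
    t_u, speed_u = resample_uniform(t, speed, rate_hz=100.0)
    print(len(t_u), "samples at 100 Hz")
```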
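
The South Korean traffic-light entry enumerates five possible signal states (green, yellow, red, green + left turn, red + left turn). Below is a small, purely illustrative sketch of how those states might be represented in a recognition pipeline; the enum and the bulb-combination logic are assumptions, not part of any official tooling.

```python
# Sketch: enumerating the traffic-light states described above.
from enum import Enum, auto

class TLState(Enum):
    GREEN = auto()
    YELLOW = auto()
    RED = auto()
    GREEN_LEFT = auto()   # green + left-turn arrow
    RED_LEFT = auto()     # red + left-turn arrow

def state_from_bulbs(red: bool, yellow: bool, green: bool, left_arrow: bool) -> TLState:
    """Map lit bulbs to one of the five states listed in the dataset description."""
    if yellow and not (red or green or left_arrow):
        return TLState.YELLOW
    if green:
        return TLState.GREEN_LEFT if left_arrow else TLState.GREEN
    if red:
        return TLState.RED_LEFT if left_arrow else TLState.RED
    raise ValueError("Bulb combination not covered by the five states above")

if __name__ == "__main__":
    print(state_from_bulbs(red=True, yellow=False, green=False, left_arrow=True))  # TLState.RED_LEFT
```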
## Conference

## Books

## Videos

- Deep Learning, Self-Taught Learning and Unsupervised Feature Learning by Andrew Ng
- The Unreasonable Effectiveness of Deep Learning by Yann LeCun
- Deep Learning of Representations by Yoshua Bengio
- ComputerVisionFoundation Video
- EENG 512 / CSCI 512 - Computer Vision - William Hoff (Colorado School of Mines)
- UCF CRCV
- Visual Object and Activity Recognition - Alexei A. Efros and Trevor Darrell (UC Berkeley)
- Computer Vision - Rob Fergus (NYU)
- Computer Vision: Foundations and Applications - Kalanit Grill-Spector and Fei-Fei Li
- Computer Vision - Steve Seitz (University of Washington)
- Multiple View Geometry - Daniel Cremers (TU Munich)
- CUHK
- Oxford
- Season-1
- Season-2
- Season-1
- NYU
## Software