awesome-rgbd-datasets
This repository contains information for the paper "A Survey on RGB-D Datasets" and is a collaborative initiative to keep the dataset list updated faster.
https://github.com/alelopes/awesome-rgbd-datasets
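Each entry below keeps the survey's table layout flattened onto one line: the dataset name, then pipe-separated fields (sensor, dataset type such as SOR/SOE, scene, modalities, extra labels, size, year; several rows are truncated and miss leading fields). A minimal sketch of splitting one such line back into its fields — the function name and return shape are illustrative assumptions, not part of the repository:

```python
def parse_row(line: str) -> tuple[str, list[str]]:
    """Split a flattened '- Name - field |field |... |' row into
    the dataset name and its non-empty pipe-separated fields."""
    entry = line.lstrip("- ").rstrip()      # drop the list bullet
    name, _, rest = entry.partition(" - ")  # name ends at the first " - "
    fields = [f.strip() for f in rest.split("|") if f.strip()]
    return name.strip(), fields

name, fields = parse_row(
    "- Hypersim - |Synthetic |SOR, and SOE |Indoor |Color, Depth "
    "|Normal Maps, Instance Segmentation, Diffuse Reflectance "
    "|461 scenes (77400 images) |2021 |"
)
print(name)       # Hypersim
print(fields[3])  # Color, Depth
```

Empty fields are dropped rather than kept positionally, since the truncated rows make column positions unreliable.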
- No Name Defined - | Synthetic | SOR, and SOE | Aerial | Color, Depth | Normal Maps, Edges, Semantic Labels | 15 scenes (144000 images) | 2020 |
- DDAD - H2 LIDAR |SOR, and SOE |Driving |Color, Depth |Instance Segmentation |150 scenes (12650 frames) |2020/2021 |
- Woodscape - 64E |SOR, and SOE |Driving |Color, IMU, GPS, Depth |instance segmentation, 2d object detection |50 sequences (100k frames) |2019/2020/2021 |
- NuScenes - |SOR, and SOE |Driving |Color, Depth, Radar, IMU |3D object detection, semantic segmentation |1000 scenes (20 seconds each). 1.4M Images and 390k Lidar Sweeps |2019/2020 |
- EventScape - |Synthetic |SOR, and SOE |Driving |Color, Depth |Semantic Segmentation, Navigation Data (Position, orientation, angular velocity, etc) |758 sequences |2021 |
- KITTI-360 - object detection, 3d-object detection, tracking, instance segmentation, optical flow (these are not necessarily in the same dataset) |11 sequences (over 320k images and 100k laser scans) |2021 |
- Lyft level 5 - beam lidars), 5 radars, MVS |SOR, and SOE |Driving |Color, Depth, Radar |3d object detection |170000 scenes (25 seconds each) |2020 |
- Virtual Kitti - |Synthetic |SOR, and SOE |Driving |Color, Depth |semantic segmentation, instance segmentation, optical flow |50 videos (21260 frames) |2016 |
- KITTI
- Hypersim - |Synthetic |SOR, and SOE |Indoor |Color, Depth |Normal Maps, Instance Segmentation, Diffuse Reflectance |461 scenes (77400 images) |2021 |
- RoboTHOR - |Synthetic |SOR, and SOE |Indoor |Color, Depth |Instance Segmentation |75 scenes |2020 |
- Structured3D Dataset - |Synthetic |SOR, and SOE |Indoor |Color, Depth |Object Detection, Semantic Segmentation |3500 scenes with 21835 rooms (196515 frames) |2020 |
- Replica
- Gibson
- InteriorNet - |Synthetic |SOR, and SOE |Indoor |Color, Depth, IMU |Normal Maps, Semantic Segmentation |20 Million Images |2018 |
- Taskonomy
- AVD
- MatterPort3D - voxel segmentation |90 scenes, 10800 panoramic views (194400 images) |2017 |
- ScanNet - similar to Microsoft Kinect v1 |SOR, and SOE |Indoor |Color, Depth |3D semantic-voxel segmentation |1513 sequences (over 2.5 million frames) |2017 |
- SceneNet RGB-D - |Synthetic |SOR, and SOE |Indoor |Color, Depth |instance segmentation, optical flow |15K trajectories (scenes) (5M images) |2017 |
- SunCG - |Synthetic |SOR, and SOE |Indoor |Color, Depth |semantic segmentation |45622 scenes |2017 |
- GMU Kitchen Dataset
- Stanford2D3D - scale indoor areas (70496 images) |2016 |
- SUN3D
- Starter Dataset - In-the-wild |Color, Depth, IMU, Grayscale Camera (depending on subdataset)|Normal Maps, Semantic Segmentation, Scene Classification, etc. (depending on subdataset) |Over 14.6M Images (multiple scenes) |2021 |
- RGBD Object dataset
- TartanAir - |Synthetic LiDAR |SOR, and SOE |Indoor, Outdoor |Color, Depth |semantic segmentation, optical flow |1037 scenes (Over 1M frames). Each scene contains 500-4000 frames. |2020 |
- RGB-D Semantic Segmentation Dataset
- GTA-SfM Dataset - |Synthetic |SOR, and SOE |Outdoor |Color, Depth |Optical Flow |76000 images |2020 |
- GL3D
- ApolloScape - 64E S3 |SOR |Driving |Color, Depth, GPS, Radar | |155 min with 93k frames |2020 |
- KAIST - 16, SICK LMS-511, MVS |SOR |Driving |Color, Depth, GPS, IMU, Altimeter | |19 sequences (191 km) |2019 |
- RobotCar - 151 2D LIDAR, 1 x SICK LD-MRS 3D LIDAR |SOR |Driving |Color, Depth, GPS, INS (Inertial navigation system) | |133 scenes (almost 20M images from multiple sensors) |2016 |
- Malaga Urban Dataset
- Omnidirectional Dataset - 64E, MVS |SOR |Driving |Color, Depth |- |152 scenes (12607 frames) |2014 |
- Ford Campus Vision and Lidar - 64E, MVS |SOR |Driving |Color, Depth, IMU, GPS | |2 sequences |2011 |
- Karlsruhe - |20 sequences (16657 frames) |2011 |
- Multi-FoV (Urban Canyon dataset) - |Synthetic |SOR |Driving, Indoor |Color, Depth |- |2 sequences |2016 |
- BlendedMVS - |Synthetic |SOR |In-the-wild |Color, Depth | |113 scenes (17000 images) |2020 |
- Youtube3D - |Two Points Automatically Annotated |SOR |In-the-wild |Color, Relative Depth | |795066 images |2019 |
- 4D Light Field Benchmark - field (Synthetic MVS) |SOR |In-the-wild |Color, Depth | |24 scenes |2016 |
- Habitat Matterport (HM3D) - |1000 scenes |2021 |
- ODS Dataset
- 360D
- PanoSUNCG - |Synthetic |SOR |Indoor |Color, Depth | |103 scenes (25000 images) |2018 |
- CoRBS
- EuRoC MAV Dataset - |11 scenes |2016 |
- Augmented ICL-NUIM Dataset - |Synthetic |SOR |Indoor |Color, Depth | |4 scenes (2 living room, 2 offices) |2015 |
- Ikea Dataset - |7 scenes |2015 |
- ViDRILO
- ICL-NUIM dataset - |Synthetic |SOR |Indoor |Color, Depth | |8 scenes (4 living room, 4 office) |2014 |
- MobileRGBD
- RGBD Object dataset v2 - |14 sequences |2014 |
- RGB-D Dataset 7-Scenes - |7 scenes (500-1000 frames/scene) |2013 |
- Reading Room Dataset - |1 scene|2013|
- TUM-RGBD
- IROS 2011 Paper Kinect - |27 sequences |2011 |
- M&M
- Mannequin Challenge datasets
- MVSEC Dataset
- ETH3D - |25 high-res, 10 low-res|2017|
- DiLiGenT-MV Dataset
- A Large Dataset of Object Scans
- BigBIRD
- Fountain Dataset
- MVS
- Live Color+3D Database - 400) |SOR |Outdoor |Color, Depth | |12 scenes |2011/2013/2017 |
- The Newer College Dataset - 1 (Gen 1) 64|SOR|Outdoor|Color, Depth, IMU|-|6 scenes|2020|
- Megadepth
- CVC-13: Multimodal Stereo Dataset - |4 scenes |2013 |
- Make3D - built 3-D scanner |SOR |Outdoor |Color, Depth | |534 images |2009 |
- Fountain-P11 and Herz-Jesu-P8
- DeMon
- Scenes11 - |Synthetic |SOR | |Color, Depth | |19959 sequences |2017 |
- VALID - |Synthetic |SOE |Aerial |Color, Depth |Object Detection, Panoptic Segmentation, Instance Segmentation, Semantic Segmentation |6 scenes (6690 images) |2020 |
- US3D
- Potsdam - |-|SOR|Aerial|Color, Depth|-|38 Patches|2011|
- Vaihingen - ORION M|Only Depth|Aerial|Color, Depth|-|33 Patches|2011|
- Leddar Pixset Dataset
- Virtual Kitti 2 - |Synthetic |SOE and Tracking (Other) |Driving |Color, Depth |Semantic Segmentation, Instance Segmentation, Optical Flow |5 scenes (multiple conditions for each scene) |2020 |
- Waymo Perception
- Argoverse Dataset
- CityScapes - object detection and pose |50 cities (25000 images) |2016 |
- SYNTHIA - sequences) at 5 fps. 200k images from videos |2016 |
- Daimler Urban Segmentation Dataset
- Ground Truth Stixel Dataset
- Daimler Stereo Pedestrian Dataset
- UnrealDataset - |Synthetic |SOE |Driving, Outdoor |Color, Depth |Semantic Segmentation |21 sequences (100k images) |2018 |
- OASIS v2 - |From Human Annotation |SOE |In-the-wild |Color, Depth |Normal Maps, Instance Segmentation |102000 images |2021 |
- OASIS - |From Human Annotation |SOE |In-the-wild |Color, Depth |Normal Maps, Instance Segmentation |140000 images |2020 |
- Scene Flow Datasets - |Synthetic |SOE |In-the-wild |Color |Optical Flow, object segmentation |2256 scenes (39049 frames) |2016 |
- RGBD Salient Object Detection - In-the-wild |Color, Depth |Saliency Maps |1000 images |2014 |
- Saliency Detection on Light Field - In-the-wild |Color, Depth |Saliency Maps |100 images |2014 |
- MPI Sintel - |Synthetic |SOE |In-the-wild |Color, Depth |Optical Flow |35 scenes (50 frames/scene) |2012 |
- NYUv2-OC++
- Near-Collision Set
- SUN_RGB-D
- TUW - D |SOE |Indoor |Color, Depth |object instance recognition |15 sequences (163 frames) |2014 |
- Willow and Challenge Dataset
- NYU Depth V2 - D images |2012 |
- Berkeley B3DO
- NYU Depth V1 - D frames |2011 |
- ClearGrasp - Synthetic |over 50000 synthetic images of 9 objects. 286 real images of 10 objects |2019 |
- T-LESS
- DROT
- MPII Multi-Kinect
- Mid-Air Dataset - |Synthetic |SOE |Outdoor |Color, Depth, Accelerometer, Gyroscope, GPS |Normal Maps, Semantic Segmentation |54 sequences (420,000 frames) |2019 |
- SCARED Dataset - |9 sequences |2021 |
- Colonoscopy CG dataset - |Synthetic |Medical |Endoscopy |Color, Depth | |16016 images |2019 |
- Endoscopic Video Datasets
- Name Not Defined - |Synthetic|Medical|Medical|Color, Depth|-|100 irises (72000 images)|2020|
- 50 Salads
- RGB2Hands - hand distance, intra-hand distance |Real: 4 sequences (1724 frames). Synthetic: NA |2020 |
- ObMan Dataset - |Synthetic |Gestures |Partial Body w/o Scene |Color, Depth |3D Hand Keypoints, Object Segmentation, Hand Segmentation |150000 images |2019 |
- BigHand2.2M
- Pandora Dataset
- RHD - |Synthetic |Gestures |Partial Body w/o Scene |Color, Depth |Segmentation, Keypoints |43986 images |2017 |
- THU-READ - |1920 sequences |2017 |
- STB
- EYEDIAP
- Eurecom Kinect Face Dataset
- MANIAC Dataset - |103 sequences |2014 |
- NYU Hand Pose Dataset
- 3DMAD
- Dexter 1
- MSR Gesture3D - |336 sequences |2012 |
- Florence 3D Faces - |Synthetic |Gestures |Partial Body w/o Scene |Color | |53 people (NA Frames/Seqs) |2011 |
- Espada Dataset - |Synthetic |Only Depth |Aerial |Color, Depth | |49 environments (80k images) |2021 |
- DSEC Dataset - 16, MVS |Only Depth |Driving |Color, Depth, GPS | |53 sequences |2021 |
- Mapillary
- rabbitAI Benchmark - camera light-field (MVS) |Only Depth |Driving |Color | |200 scenes (100 for training, 100 for testing) |2020 |
- DrivingStereo - 64E, MVS |Only Depth |Driving |Color, Depth, IMU, GPS | |42 sequences (182188 frames) |2019 |
- Urban Virtual Dataset (UVD) - |Synthetic |Only Depth |Driving |Color, Depth | |58500 images |2017 |
- DiverseDepth Dataset - In-the-wild |Color | |320000 images |2020 |
- HRWSI - In-the-wild |Color, Depth | |20778 images |2020 |
- Holopix50k - In-the-wild |Color |- |49368 images |2020 |
- DualPixels Dataset - In-the-wild |Color, Depth |- |3190 images |2019 |
- TAU Agent Dataset - |Synthetic |Only Depth |In-the-wild |Color, Depth | |5 scenes |2019 |
- WSVD - In-the-wild |Color, Depth | |553 videos (1500000 frames) |2019 |
- ReDWeb - In-the-wild |Color, Depth | |3600 images |2018 |
- IRS Dataset - |Synthetic |Only Depth |Indoor |Color, Depth |Normal Maps |100025 images |2019/2021 |
- IBims-1
- AirSim Building_99 - |Synthetic |Only Depth |Indoor |Color, Depth |- |20000 images |2021 |
- Pano3D Dataset
- Multiscopic Vision
- Middlebury 2014 Dataset
- Middlebury 2006 Dataset - build Structured Light |Only Depth |Indoor |Color, Depth | |21 images |2006 |
- Middlebury 2005 Dataset - build Structured Light |Only Depth |Indoor |Color, Depth | |9 images |2005 |
- Middlebury 2003 Dataset - build Structured Light |Only Depth |Indoor |Color, Depth | |2 images |2003 |
- Middlebury 2001 Dataset
- DIML/CVL
- DIODE
- Forest Virtual Dataset (FVD) - |Synthetic |Only Depth |Outdoor |Color, Depth | |49500 images |2017 |
- Zurich Forest Dataset - |3 sequences (9846 images) |2017 |
- SQUID - |57 images |2020 |
- UOW Online Action3D
- TVPR - |23 sequences (100 people, 2004 secs) |2017 |
- TST Fall detection Dataset v2
- UOW LargeScale Combined Action3D
- TST Intake Monitoring dataset v1 - |48 sequences |2015 |
- TST Intake Monitoring dataset v2 - |60 sequences |2015 |
- TST TUG dataBase
- UTD-MHAD
- Human3.6M - D frames (almost 3.6M RGB frames) |2014 |
- Northwestern-UCLA Multiview Action 3D Dataset - |1473 sequences |2014 |
- TST Fall detection Dataset v1
- Chalearn Multimodal Gesture Recognition
- MHAD - |660 sequences |2013 |
- ChaLearn gesture challenge - |50000 sequences |2012 |
- DGait - |583 sequences (53 subjects) |2012 |
- MSR DailyActivity3D
- RGBD-ID
- SBU Kinect Interaction
- MSR Action3D - |557 sequences (23797 frames) |2010 |
- Hollywood 3D - In-the-wild |Color, Depth |- |around 650 video clips |2013 |
- Depth 2 Height
- HHOI - 7 seconds presented at 10-15 fps|2016 |
- CMU Panoptic Dataset
- UR Fall Detection Dataset - |70 sequences |2014 |
- RGB-D People
- DIW - |Two points (manually annotated) |Points (Other) |In-the-wild |Color, Depth Points (2 points) | |495000 images |2016 |
- LightField Dataset
- Princeton Tracking Benchmark
- FRIDA dataset - |Synthetic |Fog (Other) |Driving |Color, Depth | |18 scenes (90 images) |2010 |
- FRIDA2 dataset - |Synthetic |Fog (Other) |Driving |Color, Depth | |66 scenes (330 images) |2012 |
- Dynamic Scene
- 3D Ken Burns - |Synthetic |3D Ken Burns (Other) |In-the-wild |Color, Depth |Normal Maps |46 sequences |2019 |
- Mirror3D Dataset - similar to Microsoft Kinect v1 |Mirror (Other) |Indoor |Color, Depth |Mirror Mask |7011 scenes with mirror |2021 |
- SBM-RGBD dataset
- LFSD - |Lytro light field (MVS) |SOE | In-The-Wild |Color, Depth |Saliency Mask |100 images |2015 |
- An In Depth View of Saliency
- DUTLF-Depth - |Lytro light field (MVS) |SOE | In-The-Wild |Color, Depth |Saliency Mask |1200 images |2019 |
- ReDWeb-S - |MVS |SOE | In-The-Wild |Color, Depth |Saliency Mask |3179 images |2020 |
- COTS - |Intel Realsense D435 (MVS) |SOE | Isolated Objects / Focussed on Objects |Color, Depth |Saliency Mask |120 images |2021 |
- NTU RGB+D
- NTU RGB+D 120
- Mivia Action - |28 sequences |2013 |
- Chalearn LAP IsoGD - |47933 sequences |2016 |
- SYSU 3D HOI
- G3D
- IAS-Lab RGBD-ID
- Online RGBD Action Dataset (ORGBD)
- MAD
- Hand Gesture - |1400 sequences |2014 |
- Creative Senz3D - |1320 sequences |2015 |
- PKU-MMD
- Florence 3D Actions
- UTKinect-Action3D
- KARD
- SOR3D-AFF - |1201 sequences |2020 |
- CMDFALL - |20 sequences |2018 |
- EgoGesture - |2081 sequences |2018 |
- LIRIS - |180 sequences |2014 |
- Bimanual Actions - |540 sequences |2020 |
- ISR-UoL 3D Social Activity
- UESTC - |25600 sequences |2018 |
License: CC0