Suraj Pattar

Robotics AI Research Engineer / Ph.D. in AI and Robotics / Data Scientist / 3D Printing Enthusiast

Synthetic Data Set for Autonomous Driving

Keywords: Autonomous Driving, Synthetic Data Generation, Domain Randomization, Sim2Real, Unreal Engine

Project Description

Get data on Kaggle

This is a synthetic dataset of rear and side views of vehicles rendered against random background images (domain randomization). It can be used for car detection, 6D pose estimation, and other autonomous driving tasks.

The dataset contains 5000 RGB images, each with a corresponding depth image, instance segmentation image, and class segmentation image, along with a JSON file that describes the position of the cars in the image with respect to the camera. You can visualize the 6D bounding boxes around the cars using NVDU. In total there are 20002 files. Two of these files describe the camera's position in the world and its parameters. The background is randomized for every image, a technique called domain randomization. It encourages a model trained on this data to rely on the cars rather than their surroundings, so it generalizes better to backgrounds not seen during training. It also helps bridge the gap between simulation and reality.
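As a rough illustration of how the per-image JSON annotations might be consumed, the sketch below parses one annotation and computes each car's distance from the camera. The field names ("objects", "class", "location", "quaternion_xyzw") and the sample values are assumptions modeled on the style of NDDS exports, not taken from this dataset, so check them against an actual file before relying on them.

```python
import json
import math

# Hypothetical annotation text. The keys below are assumed, not confirmed
# against this dataset; the numeric values are made up for illustration.
SAMPLE_ANNOTATION = """
{
  "objects": [
    {"class": "car",
     "location": [120.0, -35.0, 900.0],
     "quaternion_xyzw": [0.0, 0.0, 0.0, 1.0]},
    {"class": "car",
     "location": [-300.0, -40.0, 400.0],
     "quaternion_xyzw": [0.0, 0.7071, 0.0, 0.7071]}
  ]
}
"""


def load_car_poses(annotation_text):
    """Return one dict per annotated object with its pose and camera distance."""
    data = json.loads(annotation_text)
    poses = []
    for obj in data.get("objects", []):
        x, y, z = obj["location"]
        poses.append({
            "class": obj["class"],
            "location": (x, y, z),
            "quaternion_xyzw": tuple(obj["quaternion_xyzw"]),
            # Euclidean distance from the camera origin, in the same
            # units the annotation uses for "location".
            "distance": math.sqrt(x * x + y * y + z * z),
        })
    return poses


poses = load_car_poses(SAMPLE_ANNOTATION)
for pose in poses:
    print(pose["class"], pose["location"], round(pose["distance"], 1))
```

To use this with the dataset, read the text of one annotation file and pass it to load_car_poses; the location and quaternion together give the 6D pose of each car relative to the camera.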

This dataset was created using the NDDS plugin from NVIDIA.

This dataset is a sample of how synthetic data can be generated quickly for autonomous driving research. It was my master's thesis project at GV Lab, Tokyo University of Agriculture and Technology, Tokyo.


Ubuntu 20 (Linux), Windows 10


Unreal Engine 4


NDDS Toolkit

Dataset Sample