DataGAN: Leveraging Synthetic Data for Self-Driving Vehicles

The boxing match: generator vs discriminator

Simply put, a GAN is a type of unsupervised learning algorithm in which a model learns the patterns in its dataset well enough to generate new outputs that look as real as the original data.

  1. The discriminator: its main goal is to determine whether an image is real (from the dataset) or fake (created by the generator). This is where the term “adversarial” in GANs comes from: the generator and discriminator compete, with the generator trying to create images that fool the discriminator.
An example of how GANs work. Source
You want both generator and discriminator to be even in strength!
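To make the competition concrete, here is a minimal sketch of the two losses in plain Python. It assumes the standard binary cross-entropy formulation with the non-saturating generator loss; the function names and the sample discriminator scores are made up for illustration and are not taken from DataGAN's code.

```python
import math

def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for a single probability."""
    p = min(max(prediction, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))

def discriminator_loss(d_real, d_fake):
    """Real images should score 1, generated images should score 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term

def generator_loss(d_fake):
    """The generator wants its fakes to be scored as real (1)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)

# Hypothetical discriminator outputs (probability that an image is real).
d_real = [0.9, 0.8, 0.95]
d_fake = [0.1, 0.3, 0.2]

print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

When the two networks are evenly matched, the discriminator can do no better than guessing 0.5 for every image, and its loss settles at 2·ln 2 (about 1.386). A discriminator loss near zero usually means the generator is being overpowered.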

Deep Convolutional Generative Adversarial Networks (DCGANs)

Rather than using plain Dense layers to generate and discriminate images, DCGANs use Convolutional Neural Networks (CNNs) for the task. The TL;DR of CNNs is that they break an image down while capturing its spatial dependencies through learned filters.
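As a rough illustration of what "capturing spatial dependencies via filters" means, the sketch below slides a 3x3 filter over a tiny image in plain Python. The filter here is hand-written (a vertical edge detector) purely for demonstration; in a real CNN the filter values are learned, and you would use a deep learning framework rather than nested loops.

```python
def conv2d(image, kernel):
    """Valid (no padding), stride-1 2D convolution as used in CNNs
    (technically cross-correlation: the kernel is not flipped)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the patch currently under the kernel.
            acc = 0
            for di in range(kh):
                for dj in range(kw):
                    acc += image[i + di][j + dj] * kernel[di][dj]
            row.append(acc)
        output.append(row)
    return output

# Toy 4x6 image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-written filter that responds to dark-to-bright vertical edges.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The feature map is only non-zero where the filter overlaps the edge, which is the kind of local spatial structure a Dense layer has no built-in way to pick up.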

Original proposal for DCGANs implementation. Source
  • ReLU is used as the activation function in the generator, while LeakyReLU is used in the discriminator
  • Batch normalization is used consistently to help with gradient flow and avoid vanishing/exploding gradient problems
  • Tanh is the generator’s output activation and should be used instead of something like sigmoid (outputs are normalized to [-1, 1] instead of [0, 1])
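The activation choices above have a practical consequence: because the generator ends in Tanh, real training images need to be rescaled to [-1, 1] so both kinds of input to the discriminator share the same range. Here is a small plain-Python sketch of the three activations and that rescaling. The helper names are hypothetical, and the LeakyReLU slope of 0.2 is the value from the DCGAN paper, not something confirmed for DataGAN.

```python
import math

def relu(x):
    """Generator hidden layers: zero out negative values."""
    return max(0.0, x)

def leaky_relu(x, slope=0.2):
    """Discriminator layers: let a small gradient through for x < 0.
    slope=0.2 follows the DCGAN paper (an assumption here)."""
    return x if x > 0 else slope * x

def to_tanh_range(pixel):
    """Map an 8-bit pixel value (0..255) to [-1, 1]."""
    return pixel / 127.5 - 1.0

def from_tanh_range(value):
    """Map a generator output in [-1, 1] back to an 8-bit pixel."""
    return int(round((value + 1.0) * 127.5))

print(relu(-2.0), leaky_relu(-2.0))           # negative input handling
print(to_tanh_range(0), to_tanh_range(255))   # ends of the pixel range
print(from_tanh_range(math.tanh(0.0)))        # a Tanh output as a pixel
```

Forgetting to rescale the real images is an easy mistake to make: the discriminator can then tell real from fake by value range alone, and the generator never gets a useful training signal.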


DataGAN combines the Fully Convolutional Network (FCN) architecture with the Deep Convolutional Generative Adversarial Network to create synthetic data that models can be trained on.


Current Progress

I’ve been training for 3,875 epochs on this dataset (around 3,075 training images). I haven’t faced mode collapse so far (which is a good sign!). Here are some sample images of what the output looks like.

Why I built DataGAN

GANs haven’t really found their place in self-driving yet. With DataGAN, I propose a segue for introducing GANs into self-driving and show that they can play a vital role in generating synthetic data.

More updates soon on GitHub.

Next steps

Right now, due to hardware constraints, I’m focusing on generating scene images with DataGAN. The next steps/potential pathways can be summarized as follows:

  • Create the first self-driving GAN dataset that can be used to train robust lane detection + computer vision models
  • Make an AC-GAN model where you could input an edge-case scenario (e.g. overtaking a vehicle) and use a GAN-ConvLSTM to help generate the given scene


