How simulating light and sensors helps build better perception systems

Developing computer vision systems is not an easy task. We are talking about systems that need to understand what they see in the real world and react accordingly. But how do they see the world? How do you teach a machine what the real world is and how to interpret it?

Simply put, vision is the perception of light. The human eye, coupled with the brain, forms the most advanced perception system that exists to date. Computer vision systems, on the other hand, use optical cameras to perceive light (mimicking the eye) and deep neural networks to understand what they see (mimicking the brain). However, that “understanding” is limited today to specific problems like object detection, object segmentation, or depth estimation. We are still far from neural networks that can provide a full understanding of an image captured by a camera. Because of this limitation, some systems complement the cameras with other kinds of sensors, like lidar and radar, that work in parts of the electromagnetic spectrum beyond visible light (infrared and radio).

When it comes to self-driving cars ...

In the case of autonomous vehicles, there is still a heated debate about whether optical cameras alone are enough for self-driving cars or whether other types of sensors are necessary.

Everybody wants to solve the same problem: engineer vehicles that understand the world around them and can react accordingly, in any situation, for safe autonomous driving. Simplifying a lot, at the end of the day, solving the problem boils down to an iterative loop: collect data, train the perception system, test it, and go back for more data.

Easier said than done. Getting good data (and avoiding poor data) for a perception system is not easy. You have to take thousands of pictures and curate them, which requires infrastructure and organization; it can be a separate project in itself. And images alone are not enough for neural networks to learn. During training, you need to tell the neural network what it is seeing, which means tagging and annotating every single image with the ground truth the specific problem requires. That is very time-consuming and often not very accurate. And just when you think you are done, it turns out your system is not performing well enough, so you need more training and, yes, more data.
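To make the annotation burden concrete, here is a minimal sketch of the kind of per-image ground truth a detection model needs. The record layout and field names are illustrative assumptions, not any particular tool's format.

```python
# Illustrative only: a hypothetical per-image ground-truth record, similar in
# spirit to common detection/segmentation annotation formats.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ObjectAnnotation:
    label: str              # e.g. "car", "pedestrian"
    bbox_xywh: List[float]  # 2D bounding box in pixels
    # Depending on the task, you may also need segmentation masks,
    # distances, occlusion flags, truncation flags, and more.

@dataclass
class ImageGroundTruth:
    image_path: str
    annotations: List[ObjectAnnotation] = field(default_factory=list)

# A human annotator has to produce something like this for every single image:
sample = ImageGroundTruth(
    image_path="frames/000123.png",
    annotations=[
        ObjectAnnotation("car", [412.0, 230.5, 96.0, 54.0]),
        ObjectAnnotation("pedestrian", [88.0, 241.0, 22.0, 61.0]),
    ],
)
print(sample)
```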

Synthetic data as a “real” alternative

An alternative to this endless loop is to use synthetic data. You create a synthetic 3D scenario and render thousands of images from it, adding automatic variability and generating all the ground truth data you need at the same time. Fair enough, problem solved? Not quite. When you train deep neural networks with synthetic data, you have to make sure they will perform when facing the real world and understand it as well as they understand the synthetic data. How well your network generalizes from synthetic images to real-world images is key to your system’s success.
For that, you need to faithfully simulate the behavior of real cameras when generating synthetic images. The closer the training images are to the images the system will see when making decisions, the more accurate those decisions will be and the smaller the domain-shift effects. This brings us back to the beginning of the article: vision is the perception of light. To faithfully simulate the behavior of cameras, you first need to simulate light and then follow its physical behavior throughout the scene as it is reflected, refracted, diffracted, and scattered by the objects and particles it finds on its way to the camera.
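As a toy illustration of what following the light means in practice, the sketch below uses made-up spectra and a deliberately simplified single-bounce model (nothing close to a full renderer) to combine an illuminant spectrum, a material reflectance curve, and a distance falloff into the spectral signal arriving at the camera.

```python
import numpy as np

# Wavelength grid covering the visible range, in nanometres.
wavelengths = np.linspace(400.0, 700.0, 61)

# Hypothetical illuminant: relative spectral power of a daylight-like source.
illuminant = 1.0 + 0.5 * np.sin((wavelengths - 400.0) / 300.0 * np.pi)

# Hypothetical material: a reddish surface reflecting more at long wavelengths.
reflectance = np.clip((wavelengths - 450.0) / 250.0, 0.05, 0.9)

def spectral_signal_at_camera(illuminant, reflectance,
                              cos_incident=0.8, distance_m=10.0):
    """Single-bounce toy model: source spectrum -> diffuse surface
    reflection -> inverse-square falloff towards the camera."""
    reflected = illuminant * reflectance * cos_incident / np.pi  # diffuse bounce
    return reflected / distance_m**2                             # distance falloff

signal = spectral_signal_at_camera(illuminant, reflectance)
print(signal.shape)  # one value per wavelength sample
```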

If you correctly characterize the light sources, including the sun and the sky, and every material in the 3D scene, you know exactly how much energy per wavelength reaches the camera sensor. With this spectral information, you can then simulate the physics of the sensor itself: how it transforms that energy into electrons and then into a voltage that, after some digital processing, finally gives you an image as if it had been taken with the real camera.
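As a hedged sketch of that sensor stage, the following toy pixel model turns spectral irradiance into a digital number via quantum efficiency, noise, saturation, and an ADC. The parameter values and the simple shot-noise-plus-read-noise model are assumptions for illustration, not Anyverse's actual pipeline.

```python
import numpy as np

h = 6.626e-34   # Planck constant, J*s
c = 2.998e8     # speed of light, m/s

def pixel_digital_number(spectral_irradiance_w_m2_nm, wavelengths_nm,
                         pixel_area_m2=(3e-6)**2, exposure_s=1e-2,
                         quantum_efficiency=0.6, full_well_e=10000,
                         read_noise_e=2.0, adc_bits=12,
                         rng=np.random.default_rng(0)):
    """Toy pixel model: spectral irradiance -> photons -> electrons -> DN."""
    wl_m = wavelengths_nm * 1e-9
    photon_energy = h * c / wl_m                                   # J per photon
    # Photons collected per wavelength bin during the exposure.
    d_lambda = np.gradient(wavelengths_nm)
    photons = (spectral_irradiance_w_m2_nm * d_lambda * pixel_area_m2
               * exposure_s / photon_energy)
    # Photoelectrons: quantum efficiency plus photon shot noise and read noise.
    mean_electrons = quantum_efficiency * photons.sum()
    electrons = rng.poisson(mean_electrons) + rng.normal(0.0, read_noise_e)
    electrons = np.clip(electrons, 0, full_well_e)                 # saturation
    # Analog-to-digital conversion: map the full well to the ADC range.
    dn = np.round(electrons / full_well_e * (2**adc_bits - 1))
    return int(dn)

# Example with a flat (made-up) spectrum over the visible range:
wl = np.linspace(400.0, 700.0, 61)
irradiance = np.full_like(wl, 1e-4)   # W / m^2 / nm, purely illustrative
print(pixel_digital_number(irradiance, wl))
```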

Simulating light and sensors - Variability

Add a procedural engine to generate thousands of variations of the 3D scene, changing camera position, lighting, and weather conditions. Leverage the processing power of the cloud to run everything in parallel, and you have the Anyverse™ synthetic data platform, which features a proprietary physics-based synthetic image render engine.

The engine uses an accurate light transport model and a physical description of lights, cameras, and materials, allowing for a very detailed simulation of the amount of light reaching the camera sensor and an equally detailed simulation of the sensor itself to produce the final color image.
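To illustrate the procedural-variation idea in code, each rendered sample can be described by a randomly drawn scene configuration. The parameter names and ranges below are hypothetical, not the Anyverse API.

```python
import random
from dataclasses import dataclass

@dataclass
class SceneVariation:
    """One randomly drawn configuration of an otherwise fixed 3D scene."""
    sun_elevation_deg: float
    sun_azimuth_deg: float
    cloud_cover: float          # 0 = clear sky, 1 = overcast
    rain_intensity: float       # 0 = dry, 1 = heavy rain
    camera_height_m: float
    camera_pitch_deg: float

def sample_variation(rng: random.Random) -> SceneVariation:
    return SceneVariation(
        sun_elevation_deg=rng.uniform(5.0, 85.0),
        sun_azimuth_deg=rng.uniform(0.0, 360.0),
        cloud_cover=rng.random(),
        rain_intensity=rng.random() * rng.random(),  # bias towards dry scenes
        camera_height_m=rng.uniform(1.2, 1.8),
        camera_pitch_deg=rng.uniform(-5.0, 5.0),
    )

rng = random.Random(42)
variations = [sample_variation(rng) for _ in range(1000)]
# Each variation would be rendered by an independent worker in the cloud;
# because the samples are independent, the renders can run fully in parallel.
print(variations[0])
```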

No light, no perception. It's that simple...

Why is this important? No light, no perception; it is that simple. For us, no light, no simulation. And without proper simulation, your synthetic data may not be that useful for training and testing deep learning-based perception systems: it may be harder for the neural networks to generalize to real-world images. Because, at the end of the day, that is every perception system’s goal: to understand and interpret the real world.

Several academic papers demonstrate that a machine learning model based on deep neural networks and trained on a synthetic dataset that accounts for camera sensor effects generally performs better than one trained without those effects. You can check the papers on the subject for the details.

Sensor simulation goes beyond data

Bear in mind that a faithful sensor simulation goes beyond data. If you are developing your own sensors, you can make design decisions without the complexity and cost of prototyping on silicon, and develop the best “eye-brain” combination for your perception problem without leaving the lab. It also allows efficient agile practices from classic software development to be applied to software 2.0 development, a term coined by Andrej Karpathy in 2017 to describe the paradigm change involved in developing deep learning-based systems.
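For example, a designer could compare two candidate sensors under low light with a simple noise model before committing anything to silicon. The sketch below is a minimal illustration with made-up parameters, not a substitute for a full sensor simulation.

```python
import numpy as np

def snr_db(photons, quantum_efficiency, read_noise_e):
    """Shot-noise plus read-noise SNR of a single pixel, in decibels."""
    signal_e = quantum_efficiency * photons
    noise_e = np.sqrt(signal_e + read_noise_e**2)
    return 20.0 * np.log10(signal_e / noise_e)

# Two hypothetical sensor candidates (numbers are illustrative only).
candidates = {
    "sensor_A": {"quantum_efficiency": 0.55, "read_noise_e": 1.5},
    "sensor_B": {"quantum_efficiency": 0.75, "read_noise_e": 3.0},
}

# Compare them in a low-light condition (few photons per pixel).
photons_low_light = 200.0
for name, params in candidates.items():
    print(name, round(snr_db(photons_low_light, **params), 2), "dB")
```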

Save time & costs - Simulate sensors!

Physically-based sensor simulation to train, test, and validate your computer perception deep learning model

About Anyverse™

Anyverse™ helps you continuously improve your deep learning perception models and reduce your system’s time to market by applying new software 2.0 processes. Our synthetic data production platform provides high-fidelity, accurate, and balanced datasets. Combined with a data-driven iterative process, it helps you reach the required model performance.

With Anyverse™ you can accurately simulate any camera sensor and decide which one will perform best with your perception system. No more complex and expensive experiments with real devices, thanks to our state-of-the-art photometric pipeline.

Need to know more?

Visit our website anyverse.ai anytime, or follow us on LinkedIn, Facebook, and Twitter.

Let's talk about synthetic data!