Looking back, it’s remarkable how many autonomous vehicles and ADAS systems have been developed with real-world data, and real-world data will probably continue to be used in combination with more accurate synthetic data. But its limitations are becoming more and more evident, to the point of asking: will future perception systems need real-world data for training at all?
What are the limits of real-world data?
Most probably, autonomous driving and other advanced perception systems will still need real-world data, but it won’t be enough. We are already starting to see this: even Tesla, with the immense amount of real data captured by its fleet of sold cars, still has a growing synthetic data practice to complement it. But what are the reasons? What are the limits of real-world data?
Ground truth data is key to developing and validating an autonomous vehicle. It may seem obvious, but real-world data has to be annotated manually (or semi-automatically), and that makes accurate ground truth hard to reach.
Add the need to annotate ever more complex data, and manual annotation becomes close to an impossible mission. Put another way, the lack of ground truth data puts at risk the ability of autonomous driving deep learning algorithms to understand and interpret the real-life situations they will face.
Let’s take an example. Waymo recently released an interesting research article about pedestrian intent detection, which stated that the manual labeling of pedestrian poses (used to predict pedestrian behavior) is inaccurate. They found a way around this issue by using previously trained algorithms, which reveals a possible lack of data accuracy.
Annotating partially visible objects is another obstacle to accurate ground truth in real-world data.
Let me give you another everyday example: what would happen if an autonomous vehicle encountered a pedestrian behind another vehicle, with only the upper torso visible and the legs hidden? How would it react? Would its AI be able to tell whether the pedestrian is standing still or about to walk? If the training data is only partially annotated, probably not.
A synthetic dataset, however, can annotate the complete silhouette of that pedestrian. Their legs would be “visible” as ground truth during training, so the AI could learn the features of a walking or standing pedestrian and infer whether someone intends to walk or just stand.
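To make the occluded-pedestrian case concrete, here is a minimal Python sketch of an annotation that stores both the visible pixels and the complete silhouette. The names `PedestrianAnnotation`, `visible_mask`, `full_mask` and `occlusion_ratio` are hypothetical and chosen for illustration only; they are not taken from any particular dataset format or tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PedestrianAnnotation:
    """Hypothetical annotation record; masks are sets of (row, col) pixels."""
    visible_mask: frozenset  # pixels the camera actually sees
    full_mask: frozenset     # complete silhouette, including occluded pixels


def occlusion_ratio(ann):
    """Fraction of the complete silhouette that is hidden from the camera."""
    if not ann.full_mask:
        return 0.0
    hidden = ann.full_mask - ann.visible_mask
    return len(hidden) / len(ann.full_mask)


# Toy pedestrian: 6 rows by 2 columns. Only the top 3 rows (upper torso)
# are visible; the bottom 3 rows (legs) are behind another vehicle.
full = frozenset((r, c) for r in range(6) for c in range(2))
visible = frozenset((r, c) for r in range(3) for c in range(2))
example = PedestrianAnnotation(visible_mask=visible, full_mask=full)

print(occlusion_ratio(example))  # 0.5
```

A human annotator working from camera images can only draw `visible_mask` with confidence. In this sketch, a synthetic data generator is assumed to know the full scene geometry and can therefore export `full_mask` as well.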
Another related issue is annotating video footage. This is a complicated task: labeling what type of objects appear in the scene is no longer enough, and each object also has to be tracked throughout the entire footage.
That is, if a vehicle is given a unique identifier at the beginning of the scene, the annotator has to assign that same identifier to the vehicle throughout the sequence. If the vehicle disappears and reappears during the scene, the annotation becomes challenging.
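As a rough illustration of the bookkeeping this implies, the following Python sketch scans per-frame annotations and reports, for each identifier, where the object drops out of the sequence and reappears, and whether its class label stays consistent. The data layout (a list of frames, each a list of `(track_id, label)` pairs) and the function name `summarize_tracks` are assumptions made for this example.

```python
from collections import defaultdict


def summarize_tracks(frames):
    """Report gaps and label conflicts for every track identifier.

    frames: list of frames; each frame is a list of (track_id, label) pairs.
    """
    seen = defaultdict(list)   # track_id -> frame indices where it appears
    labels = defaultdict(set)  # track_id -> all labels assigned to it
    for idx, objects in enumerate(frames):
        for track_id, label in objects:
            seen[track_id].append(idx)
            labels[track_id].add(label)

    report = {}
    for track_id, idxs in seen.items():
        # A gap is (last frame seen, frame where it reappears).
        gaps = [(a, b) for a, b in zip(idxs, idxs[1:]) if b - a > 1]
        report[track_id] = {
            "gaps": gaps,
            "label_conflict": len(labels[track_id]) > 1,
        }
    return report


# Toy sequence: the vehicle is missing from frame 2 and reappears in frame 3.
frames = [
    [("veh_1", "car")],
    [("veh_1", "car"), ("ped_1", "pedestrian")],
    [("ped_1", "pedestrian")],
    [("veh_1", "car"), ("ped_1", "pedestrian")],
]
report = summarize_tracks(frames)
print(report["veh_1"]["gaps"])  # [(1, 3)]
```

A check like this only flags where an identifier goes missing. Deciding whether the object that reappears is really the same vehicle is still the annotator’s call on real footage, whereas a simulator that generated the scene already knows the answer.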
A look into the future of data for training and validating autonomous vehicles
Sensors and deep learning models are evolving faster and faster, exposing the shortcomings of current data simulators and highlighting the need for more accurate data that matches their new features and capabilities.
The new generation of sensors will expand the ability to “sense” the environment, with greater precision.
Deep learning models, in turn, will be able to process more complex data: more semantic levels, behaviors and intentions, and so on.
Sensors and deep learning models meet in the data you use to train those models. The more accurate that data is, and the more faithful to the features of the sensor you will use in your autonomous vehicle’s perception system, the better your system will perform in the real world.
As we saw above, getting accurate ground truth for the increasingly complex reality of autonomous vehicles is not easy. Generating the data you need, with accurate ground truth, may be the answer, but what about the sensor?
How can you make sure your synthetically generated data reproduces the features and characteristics of new sensors? Only a physically based hyperspectral sensor simulation can give that level of accuracy. It’s not only the ground truth; it’s the sensor as well.
The path to achieving trustworthy autonomous vehicles… goes through data accuracy
What happens when perception system developers upgrade their cameras? Will they use their “legacy” real-world data to retrain the system? Will they capture new data? Or will they generate data simulating the new sensors?
It’s a complex question and there is no simple answer, but accurate synthetic data is gaining more and more weight, which favors the last of these options. It is something to explore and keep an eye on.
We may not know for certain what the limits of real-world data are, but we can be sure that there are no limits to pixel-accurate synthetic data.
With Anyverse™, you can accurately simulate any camera sensor, which helps you decide which one will perform better with your perception system. No more complex and expensive experiments with real devices, thanks to our state-of-the-art photometric pipeline.