Can synthetic data (alone) train a robust object detection algorithm?
This study showed that synthetic data alone can train a robust object detection algorithm, as benchmarked against real-world data.
The authors focused on the value of synthetic data in helping computer vision algorithms automatically detect aircraft and their attributes in satellite imagery.
After conducting extensive experiments to evaluate the real and synthetic datasets and compare performance, the study found that synthetic data is effective both alone and in combination with a few real data samples. For us, the most interesting takeaway is that when a small subset of real data was added to fine-tune a model trained on the synthetic dataset, the authors observed a significant gain in mAP (mean average precision), leading to performance on par with the model trained on the real dataset only.
Fine-tuning the synthetic-only model with just 10% of the real (observed) dataset achieved roughly the same results as training on 100% of that dataset. This approach would bypass 90% of the manual collection and labeling effort. That is a considerable cost cut, since getting real images of planes from a satellite perspective is neither easy nor cheap. It also saves a great deal of time.
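To make the 10% figure concrete, here is a minimal sketch of how you might draw a reproducible fine-tuning subset from a pool of real images. It is not the study's code: the function name `sample_real_subset`, the image IDs, and the pool size of 1,000 are all hypothetical, and a plain random draw is only one possible selection strategy.

```python
import random


def sample_real_subset(image_ids, fraction=0.10, seed=0):
    """Return a reproducible random subset of real image IDs for fine-tuning.

    Hypothetical helper: the study does not describe how its subset was chosen.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    rng = random.Random(seed)  # fixed seed so the same subset is drawn each run
    k = max(1, round(len(image_ids) * fraction))
    return sorted(rng.sample(list(image_ids), k))


# Hypothetical pool of 1,000 real satellite image IDs.
real_ids = [f"real_{i:04d}" for i in range(1000)]

finetune_ids = sample_real_subset(real_ids, fraction=0.10)

# Share of the real pool that never needs to be labeled.
labeling_saved = 1 - len(finetune_ids) / len(real_ids)
print(len(finetune_ids), f"{labeling_saved:.0%}")
```

Only the images in `finetune_ids` would need manual annotation; the rest of the training signal would come from the synthetic dataset.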
Can synthetic data perform as well as real data for object detection?
This study showed that synthetic data can adequately reduce reliance on real data, which is slow, expensive, and often difficult to procure. This opens opportunities for far more rapid and prolific adoption of computer vision technologies across industries.
Something else to consider is that the variability of real-world data is often limited to what you can directly gather. That can bias your model, since you cannot provide enough diversity of conditions and scenarios at training time. With synthetic data, you control the data you generate, so you can make sure you cover all the needs of your use case and reduce bias in your model.
We strongly believe that the results of these experiments, which combine synthetic data with small amounts of real data, can apply to other use cases. What do you think?
Rest assured that as we find more evidence backing this assumption we will let you know.
Anyverse™ helps you continuously improve your deep learning perception models and reduce your system’s time to market by applying new software 2.0 processes. Our synthetic data production platform allows us to provide high-fidelity, accurate, and balanced datasets. Along with a data-driven iterative process, we can help you reach the required model performance.
With Anyverse™ you can accurately simulate any camera sensor, which helps you decide which one will perform best with your perception system. No more complex and expensive experiments with real devices, thanks to our state-of-the-art photometric pipeline.