Panagiotis Meletis
Holistic scene understanding is a vital component of the self-driving vehicles of the future. It is crucial that these vehicles are able to understand and interpret their environment in order to drive safely. This requires precise detection of surrounding objects (vehicles, humans, traffic objects, nature), discrimination between drivable and non-drivable surfaces (road, sidewalk, buildings), and segmentation of static and dynamic objects into high-level semantic classes. In the past, computer vision tackled these problems separately due to their complexity and high computational needs. Nowadays, deep learning-based systems are trained on manually annotated datasets to solve these problems; however, they face multiple challenges: 1) the number of annotated semantic classes is limited by the available datasets to a few dozen, restricting the variety of recognizable objects, 2) annotation density tends to be inversely proportional to dataset size, rendering very large datasets unsuitable for precise segmentation, and 3) detection and segmentation are solved separately, which leads to higher memory and computational demands. Our research addresses the aforementioned challenges by proposing new methods to: 1) train a single network on multiple datasets with different semantic classes and different types of annotations, and 2) solve the problems of detection and semantic segmentation simultaneously with a single network. We have deployed these networks in our autonomous driving car with real-time performance. We demonstrate state-of-the-art results, together with a fivefold increase in the number of recognizable classes, and we efficiently integrate detection and segmentation into a joint panoptic segmentation system, taking important steps towards achieving holistic scene understanding.
Published Date: 2020-12-23
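The two proposed ideas can be illustrated compactly. Below is a minimal PyTorch-style sketch, not the paper's actual architecture: all module names, layer sizes, and the `valid_classes` masking scheme are assumptions made for illustration. It shows a single backbone shared by a semantic-segmentation head and a detection head, plus a per-dataset class mask so that datasets with different label sets can train the same segmentation head.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class JointSegDetNet(nn.Module):
    """Shared backbone feeding a segmentation head and a detection head
    (hypothetical architecture for illustration only)."""

    def __init__(self, num_classes: int, det_channels: int):
        super().__init__()
        # Shared feature extractor: computed once and reused by both heads,
        # which is what keeps memory and compute below two separate networks.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 64, 3, padding=1), nn.ReLU(),
        )
        # Per-pixel logits over the merged label space of all datasets.
        self.seg_head = nn.Conv2d(64, num_classes, 1)
        # Dense detection outputs (e.g. per-location objectness + box).
        self.det_head = nn.Conv2d(64, det_channels, 1)

    def forward(self, x):
        feats = self.backbone(x)
        return self.seg_head(feats), self.det_head(feats)


def masked_seg_loss(logits, target, valid_classes):
    """Cross-entropy restricted to the classes a dataset actually labels.

    valid_classes: bool tensor of shape (num_classes,) marking which
    classes exist in the batch's originating dataset. Logits of absent
    classes are suppressed, so one head can be trained on datasets with
    different (even partially overlapping) label sets. Targets must lie
    within the valid classes (or equal the ignore index).
    """
    masked = logits.masked_fill(~valid_classes.view(1, -1, 1, 1), -1e9)
    return F.cross_entropy(masked, target, ignore_index=255)


# Usage: one training step on a batch from a hypothetical dataset that
# annotates only the first 30 of 100 merged classes.
model = JointSegDetNet(num_classes=100, det_channels=5)
images = torch.randn(2, 3, 64, 64)
labels = torch.randint(0, 30, (2, 64, 64))
valid = torch.zeros(100, dtype=torch.bool)
valid[:30] = True
seg_logits, det_out = model(images)
loss = masked_seg_loss(seg_logits, labels, valid)
loss.backward()
```

In a full panoptic system along the lines the abstract describes, the outputs of the two heads would additionally be fused into a single per-pixel map of instances and semantic classes; that fusion step is omitted here.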