Bicycles, people, cars or sky, road, grass: Which pixels of an image depict unique foreground objects or people in front of a self-driving car, and which pixels depict background classes?
This task, known as panoptic segmentation, is a fundamental problem with applications in fields such as robotics, self-driving cars, biomedical image analysis, and augmented reality. Dr Abhinav Valada, Assistant Professor for Robot Learning at the Department of Computer Science of the University of Freiburg and a member of BrainLinks-BrainTools, focuses on this research question.
Valada and his colleagues have developed the state-of-the-art artificial intelligence (AI) model “EfficientPS,” which recognizes visual scenes coherently, and does so more quickly and effectively.
According to Valada, this task is mostly addressed with a machine learning method called deep learning, in which artificial neural networks inspired by the human brain learn from huge amounts of data. Public benchmarks like Cityscapes play a vital role in measuring progress in such methods.
For many years, research teams, for example from Google or Uber, have competed for the top place in these benchmarks.
Rohit Mohan, Member of Valada’s Team, University of Freiburg
The technique developed by the computer scientists from Freiburg, which was designed to understand urban scenes, has been ranked first in Cityscapes, the most influential leaderboard for research on scene understanding in the field of autonomous driving.
Moreover, EfficientPS consistently sets new standards on other established benchmark datasets such as IDD, Mapillary Vistas, and KITTI.
Valada demonstrated examples of how the researchers trained several AI models on various datasets. The results are overlaid on the corresponding input image, with colors indicating the object class that the model assigns to each pixel. For instance, people are marked in red, cars in blue, buildings in gray, and trees in green.
Furthermore, the AI model draws a border around every object that it considers a separate entity. The Freiburg scientists succeeded in training the model to transfer what it had learned about urban scenes from Stuttgart to New York City. Although the AI model had not been shown what a city in the United States looks like, it was able to accurately recognize the New York City scenes.
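The color overlay and object borders described above can be illustrated with a short sketch. This is not the EfficientPS code: it assumes the model's output is already available as two per-pixel arrays, one with a class id and one with an object id, and the class ids, the `CLASS_COLORS` table, and the function name `overlay_panoptic` are all hypothetical choices made for illustration.

```python
import numpy as np

# Hypothetical class ids; colors follow the article's description.
CLASS_COLORS = {
    0: (128, 128, 128),  # building -> gray
    1: (0, 128, 0),      # tree -> green
    2: (255, 0, 0),      # person -> red
    3: (0, 0, 255),      # car -> blue
}


def overlay_panoptic(image, semantic, instance, alpha=0.5):
    """Blend class colors over an RGB image and draw white object borders.

    image:    (H, W, 3) uint8 RGB image
    semantic: (H, W) integer array of class ids (keys of CLASS_COLORS)
    instance: (H, W) integer array; 0 for background pixels and a
              distinct positive id for each individual object
    """
    # Paint every pixel with the color of its predicted class.
    color_map = np.zeros_like(image)
    for class_id, color in CLASS_COLORS.items():
        color_map[semantic == class_id] = color

    # Mix the class colors with the original image.
    blended = (alpha * color_map + (1 - alpha) * image).astype(np.uint8)

    # A pixel lies on a border if its object id differs from that of
    # its right-hand or lower neighbor.
    border = np.zeros(instance.shape, dtype=bool)
    border[:, :-1] |= instance[:, :-1] != instance[:, 1:]
    border[:-1, :] |= instance[:-1, :] != instance[1:, :]
    blended[border] = (255, 255, 255)
    return blended
```

Two cars standing side by side would share the same blue tint, since they belong to the same class, but a white line would separate them because their object ids differ.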
Earlier techniques for tackling this problem have large model sizes and are computationally expensive for use in real-world applications such as robotics, which are highly resource-constrained.
Our EfficientPS not only achieves state-of-the-art performance, it is also the most computationally efficient and fastest method. This further extends the applications in which EfficientPS can be used.
Dr Abhinav Valada, Assistant Professor for Robot Learning, Member of BrainLinks-BrainTools, University of Freiburg