New AI Method Improves 3D Mapping with Less Power

Researchers from North Carolina State University have developed a method that uses two-dimensional images captured by multiple cameras to improve how artificial intelligence (AI) programs map three-dimensional spaces. The technique shows promise for autonomous vehicle navigation because it performs well while requiring relatively little computational power.

Most autonomous vehicles use powerful AI programs called vision transformers to take 2D images from multiple cameras and create a representation of the 3D space around the vehicle.

Tianfu Wu, Study Corresponding Author and Associate Professor, Department of Electrical and Computer Engineering, North Carolina State University

Wu continued, “However, while each of these AI programs takes a different approach, there is still substantial room for improvement. Our technique, called Multi-View Attentive Contextualization (MvACon), is a plug-and-play supplement that can be used in conjunction with these existing vision transformer AIs to improve their ability to map 3D spaces. The vision transformers aren’t getting any additional data from their cameras, they’re just able to make better use of the data.”
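To illustrate the plug-and-play idea, the minimal PyTorch sketch below shows a drop-in module that takes the multi-camera image features an existing detector (such as PETR or BEVFormer) already computes and returns contextualized features of the same shape, so the rest of the network is left untouched. The class name MvAConBlock, the layer sizes, and the internal design are illustrative assumptions for this sketch, not the authors' published implementation.

```python
# Hedged sketch of a "plug-and-play" contextualization module.
# Assumption: the real MvACon design differs internally; only the
# same-shape-in / same-shape-out interface is the point here.
import torch
import torch.nn as nn

class MvAConBlock(nn.Module):
    """Takes per-camera feature tokens and returns tensors of the same shape,
    so it can sit between an existing backbone and the detection head."""
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.mix = nn.MultiheadAttention(dim, num_heads=num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, num_cameras, tokens_per_view, dim)
        B, V, N, C = feats.shape
        tokens = feats.reshape(B, V * N, C)
        # Let tokens attend across all camera views, then add the result
        # back as a residual so the original features are preserved.
        ctx, _ = self.mix(tokens, tokens, tokens)
        out = self.norm(tokens + ctx)
        return out.reshape(B, V, N, C)

# Example: six cameras, 900 tokens each, 256-dim features.
# enhanced = MvAConBlock()(torch.randn(1, 6, 900, 256))
```

The key point is the interface rather than the internals: same inputs, same outputs, no new sensor data, only a richer use of the features the cameras already provide.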

MvACon works by adapting the Patch-to-Cluster attention (PaCa) method that Wu and his collaborators published last year. PaCa allows transformer AIs to identify objects in an image more quickly and accurately.
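For readers curious about the mechanism, the sketch below captures the patch-to-cluster idea in the spirit of PaCa: patch tokens are softly pooled into a small set of cluster tokens, and attention runs from patches to clusters rather than between every pair of patches, which is what keeps the cost low. The cluster count, the clustering head, and the layer choices here are illustrative assumptions, not the published implementation.

```python
# Hedged sketch of patch-to-cluster attention in the spirit of PaCa.
import torch
import torch.nn as nn

class PatchToClusterAttention(nn.Module):
    """Patches attend to a small set of pooled cluster tokens instead of to
    every other patch, reducing attention cost from O(N^2) to O(N*M)."""
    def __init__(self, dim: int = 256, num_clusters: int = 49, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        # Lightweight head scoring how strongly each patch belongs to each cluster.
        self.to_cluster_logits = nn.Linear(dim, num_clusters)
        self.q = nn.Linear(dim, dim)
        self.kv = nn.Linear(dim, dim * 2)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, N patch tokens, dim)
        B, N, C = x.shape
        H = self.num_heads

        # 1) Soft-assign patches to M clusters and pool them into M cluster
        #    tokens (each cluster token is a weighted average of patches).
        assign = self.to_cluster_logits(x).softmax(dim=1)     # (B, N, M)
        clusters = torch.einsum("bnm,bnc->bmc", assign, x)    # (B, M, C)

        # 2) Patch queries attend to cluster keys/values (N x M attention).
        q = self.q(x).reshape(B, N, H, C // H).transpose(1, 2)
        k, v = self.kv(clusters).chunk(2, dim=-1)
        k = k.reshape(B, -1, H, C // H).transpose(1, 2)
        v = v.reshape(B, -1, H, C // H).transpose(1, 2)

        attn = (q @ k.transpose(-2, -1) * self.scale).softmax(dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

# Example: out = PatchToClusterAttention()(torch.randn(2, 1024, 256))
```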

The key advance here is applying what we demonstrated with PaCa to the challenge of mapping 3D space using multiple cameras.

Tianfu Wu, Study Corresponding Author and Associate Professor, Department of Electrical and Computer Engineering, North Carolina State University

The researchers tested MvACon's performance with three popular vision transformers: PETR, BEVFormer, and the BEVFormer DFA3D variant. In each case, the vision transformers received 2D images from six different cameras, and MvACon significantly improved each transformer's performance.

Wu said, “Performance was particularly improved when it came to locating objects, as well as the speed and orientation of those objects. And the increase in computational demand of adding MvACon to the vision transformers was almost negligible.”

Our next steps include testing MvACon against additional benchmark datasets, as well as testing it against actual video input from autonomous vehicles. If MvACon continues to outperform the existing vision transformers, we’re optimistic that it will be adopted for widespread use.

Tianfu Wu, Study Corresponding Author and Associate Professor, Department of Electrical and Computer Engineering, North Carolina State University

Xianpeng Liu, a recent Ph.D. graduate of NC State, is the paper's first author. Zhebin Zhang and Chen Li of the OPPO U.S. Research Center, Ming Qian and Nan Xue of the Ant Group, and Ce Zheng and Chen Chen of the University of Central Florida co-authored the paper.

This work was supported by the National Science Foundation, the US Army Research Office, and a research gift fund from Innopeak Technology, Inc.
